F.LLI PISTOLESI Snc Company

2018-02-23 Thread .F.LLI PISTOLESI
Hello , 
 


I am looking for a reliable supplier /manufacturer of products for sell in 
Europe.

I came across your listing and wanted to get some information regarding minimum 
Order Quantities, FOB pricing and also the possibility of packaging including 
payments terms.

So could you please get back to be with the above informations as soon as 
possible .

My email is :tm6428...@gmail.com

Many thanks and i looking forward to hearing from you and hopefully placing an 
order with you company.

Best Regards
Lorenzo Delleani.

F.LLI PISTOLESI Snc Company P.O. box 205
2740 AE Waddinxveen
The Netherlands


Re: [PATCH] commit-graph: fix some "plain integer as NULL pointer" warnings

2018-02-23 Thread René Scharfe
Am 24.02.2018 um 03:24 schrieb Ramsay Jones:
> diff --git a/commit-graph.c b/commit-graph.c
> index fc5ee7e99..c2f443436 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -45,7 +45,7 @@ char *get_graph_latest_filename(const char *obj_dir)
>   {
>   struct strbuf fname = STRBUF_INIT;
>   strbuf_addf(, "%s/info/graph-latest", obj_dir);
> - return strbuf_detach(, 0);
> + return strbuf_detach(, NULL);
>   }

You could also replace that function body with:

return xstrfmt("%s/info/graph-latest", obj_dir);

René


Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command

2018-02-23 Thread Jeff King
On Fri, Feb 23, 2018 at 04:19:54PM -0800, Brandon Williams wrote:

> > We always have the ability to extend the patterns accepted via a feature
> > (or capability) to ls-refs, so maybe the best thing to do now would only
> > support a few patterns with specific semantics.  Something like if you
> > say "master" only match against refs/heads/ and refs/tags/ and if you
> > want something else you would need to specify "refs/pull/master"?
> > 
> > That way we could only support globs at the end "master*" where * can
> > match anything (including slashes)
> 
> After some in-office discussion it seems like the best thing to do for
> this (right now since if we change our mind we can just introduce a
> capability which extends the patterns supported) would be to left-anchor
> the ref-patterns and only allow for a single wildcard character '*'
> which matches zero or more characters (and doesn't care about slashes
> '/').  This wildcard character should only be supported at the end of
> the ref pattern.  This means that if a client wants 'master' then they
> would need to specify 'refs/heads/master' (and the other
> ref_rev_parse_rules expansions) as a ref pattern. But they could say
> "refs/heads/*" for all refs under refs/heads.

Heh, I just responded without having read this and came up with the same
suggestion.

So I agree that is the right path. Or the simplification I mentioned
that "refs/heads/master" would return that ref or possibly
"refs/heads/master/foo" if it exists. Remember that it's fine to be
overly broad here. This is purely an optimization in the advertisement,
as we'd still pick out the refs we care about in a separate step.

-Peff


Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command

2018-02-23 Thread Jeff King
On Thu, Feb 22, 2018 at 04:45:14PM -0800, Brandon Williams wrote:

> > This kind of tail matching can't quite implement all of the current
> > behavior. Because we actually do the normal dwim_ref() matching, which
> > includes stuff like "refs/remotes/%s/HEAD".
> > 
> > The other problem with tail-matching is that it's inefficient on the
> > server. Ideally we could get a request for "master" and only look up
> > refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
> > in refs/pull, we wouldn't have to process those at all. Of course this
> > is no worse than the current code, which not only looks at each ref but
> > actually _sends_ it. But it would be nice if we could fix this.
> > 
> > There's some more discussion in this old thread:
> > 
> >   
> > https://public-inbox.org/git/20161024132932.i42rqn2vlpocq...@sigill.intra.peff.net/
> 
> Thanks for the pointer.  I was told to be wary a while about about
> performance implications on the server but no discussion ensued till now
> about it :)
> 
> We always have the ability to extend the patterns accepted via a feature
> (or capability) to ls-refs, so maybe the best thing to do now would only
> support a few patterns with specific semantics.  Something like if you
> say "master" only match against refs/heads/ and refs/tags/ and if you
> want something else you would need to specify "refs/pull/master"?

The big question is whether you want to break compatibility with the
existing program behavior. If not, then I think you have to ask for
every variant in ref_rev_parse_rules (of which there are 6 variants).

Which sounds pretty gross, but it actually may not be _too_ bad. Most
fetches tend to ask for either a single name, or they use left-anchored
wildcards. So it would work to just have the client expand all of the
possibilities itself into fully-qualified refs, and keep the server as
dumb as possible.

And then the server for now can just cull based on the pattern list,
like you have here. But later, we could optimize it to look up the
individual patterns, which should be cheaper, since we'd generally have
many fewer patterns than total refs.

> > Does the client have to be aware that we're using wildmatch? I think
> > they'd need "refs/heads/**" to actually implement what we usually
> > specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
> > make this work with just "*"?
> > 
> > Do we anticipate that the client would left-anchor the refspec like
> > "/refs/heads/*" so that in theory the server could avoid looking outside
> > of /refs/heads/?
> 
> Yeah we may want to anchor it by providing the leading '/' instead of
> just "refs/".

I actually wonder if we should just specify that the patterns must
_always_ be fully-qualified, but may end with a single "/*" to iterate
over wildcards. Or even simpler, that "refs/heads/foo" would find that
ref itself, and anything under it.

That drops any question about how wildcards work (e.g., does "refs/foo*"
work to find "refs/foobar"?).

> I need to read over the discussion you linked to more but what sort of
> ref patterns do you believe we should support as part of the initial
> release of v2?  It seems like you wanted this at some point in the past
> so I assume you have an idea of what sort of filtering would be
> beneficial.

My goals were just optimizing:

  1. Don't send all the refs across the wire if we can avoid it.

  2. Don't even iterate over all the refs internally if we can avoid it.

Especially with the new binary-searching packed-refs code, we should be
able to serve a request like "ls-refs refs/heads/*" without looking into
"refs/pull" or "refs/changes" at all.

-Peff


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Duy Nguyen
On Sat, Feb 24, 2018 at 5:29 AM, brian m. carlson
 wrote:
>> @@ -40,5 +41,8 @@ int main(int argc, const char **argv)
>>
>>   restore_sigpipe_to_default();
>>
>> + if (getenv("GIT_HASH_FIXUP"))
>> + repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
>
> I'm lukewarm on adding this environment variable, but considering our
> history here, we had probably better.  We can always remove it after a
> few releases.

Yes that's the intention. But after writing cover letter for v2 and
sending it out, it looks to me that this thing must stay until all our
code is converted to using the_hash_algo (I don't know if there are
more to convert or it's finished already). So an alternative is we do
the opposite: default to GIT_HASH_SHA1, but when an env variable is
set, reset it back to NULL. This env variable will _always_ be set by
the test suite to help us catch problems.
-- 
Duy


[PATCH v2 3/5] index-pack: check (and optionally set) hash algo based on input file

2018-02-23 Thread Nguyễn Thái Ngọc Duy
After 454253f059 (builtin/index-pack: improve hash function abstraction
- 2018-02-01), index-pack uses the_hash_algo for hashing. If "git
index-pack" is executed without a repository, we do not know what hash
algorithm to be used and the_hash_algo in theory could be undefined.

Since there should be some information about the hash algorithm in the
input pack file, we can initialize the correct hash algorithm with that
if the_hash_algo is not yet initialized. This assumes that pack files
with new hash algorithm MUST step up pack version.

While at there, make sure the hash algorithm requested by the pack file
and configured by the repository (if we're running with a repo) are
consistent.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/index-pack.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..1144458140 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -326,10 +326,31 @@ static const char *open_pack_file(const char *pack_name)
output_fd = -1;
nothread_data.pack_fd = input_fd;
}
-   the_hash_algo->init_fn(_ctx);
return pack_name;
 }
 
+static void prepare_hash_algo(uint32_t pack_version)
+{
+   const struct git_hash_algo *pack_algo;
+
+   switch (pack_version) {
+   case 2:
+   case 3:
+   pack_algo = _algos[GIT_HASH_SHA1];
+   break;
+   default:
+   die("BUG: how to determine hash algo for new version?");
+   }
+
+   if (!the_hash_algo) /* running without repo */
+   the_hash_algo = pack_algo;
+
+   if (the_hash_algo != pack_algo)
+   die(_("incompatible hash algorithm, "
+ "configured for %s but the pack file needs %s"),
+   the_hash_algo->name, pack_algo->name);
+}
+
 static void parse_pack_header(void)
 {
struct pack_header *hdr = fill(sizeof(struct pack_header));
@@ -341,6 +362,9 @@ static void parse_pack_header(void)
die(_("pack version %"PRIu32" unsupported"),
ntohl(hdr->hdr_version));
 
+   prepare_hash_algo(ntohl(hdr->hdr_version));
+   the_hash_algo->init_fn(_ctx);
+
nr_objects = ntohl(hdr->hdr_entries);
use(sizeof(struct pack_header));
 }
-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 2/5] sha1_file.c: keep a good name for "unknown" hash_algos[UNKNOWN]

2018-02-23 Thread Nguyễn Thái Ngọc Duy
This is mostly for displaying the hash algorithm name when we report
errors. Printing "unknown" with '%s' is much better than '(null)' in
glibc printf version (and probably could crash if other implementations
do not check for NULL pointer)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 sha1_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 826d7a0ae3..4b266b1c68 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -71,7 +71,7 @@ static void git_hash_unknown_final(unsigned char *hash, 
git_hash_ctx *ctx)
 
 const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
{
-   NULL,
+   "unknown",
0x,
0,
0,
-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 0/5] Fix initializing the_hash_algo

2018-02-23 Thread Nguyễn Thái Ngọc Duy
Brian questioned the unnecessary alarms in v1 when "git diff" or "git
index-pack" run in no-repository mode. I had the same feeling after
sending v1. But instead of suppresing alarms, we could do better.

v2 breaks those "fall back to SHA-1" code into separate patches and
handles it properly (I hope) instead of blind fall back like v1:

- for index-pack, we can determine the needed algorithm from the pack
  file. I'm making an assumption here that pack files with new hash
  algo must step up file format version. But I think it's a reasonable
  assumption.

- for diff --no-index, I still fall back to SHA-1 to generate the
  hashes like before. We could probably introduce a new command line
  option to use a different hash. But that work could be done later
  when an actual new hash has come

Note, I didn't test but this series could potentially break 'pu' (in a
good way). I initially was puzzled why the test suite didn't fail when
the_repository->hash_algo was NULL (i.e. before Brian's fix). Then I
found out that more work to abstract away SHA-1 hashing functions have
been done since then. But because the_hash_algo is pre-initialized
with SHA-1, these special cases did not show up until now. If there
are more the_hash_algo conversion on 'pu', more "no-repo" cases
could be spotted by the test suite.

I think this makes pre-initializing the_hash_algo to NULL a very good
point: during the abstraction work, we at least must identify the use
cases like this (code running without repo). We don't have to fix it
right away, but we can start thinking about how to deal with it. If we
ignore it and switch to a new hash, these code will keep using SHA-1
and cause more headache in future.

Nguyễn Thái Ngọc Duy (5):
  setup.c: initialize the_repository correctly in all cases
  sha1_file.c: keep a good name for "unknown" hash_algos[UNKNOWN]
  index-pack: check (and optionally set) hash algo based on input file
  diff.c: initialize hash algo when running in --no-index mode
  Revert "repository: pre-initialize hash algo pointer"

 builtin/index-pack.c | 26 +-
 builtin/init-db.c|  3 ++-
 cache.h  |  3 ++-
 common-main.c| 10 ++
 diff.c   | 12 
 path.c   |  2 +-
 repository.c |  2 +-
 setup.c  |  5 -
 sha1_file.c  |  2 +-
 t/helper/test-dump-split-index.c |  2 ++
 10 files changed, 60 insertions(+), 7 deletions(-)

-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 4/5] diff.c: initialize hash algo when running in --no-index mode

2018-02-23 Thread Nguyễn Thái Ngọc Duy
Our "git diff" command supports running as a standalone tool. In this
code path, we try to hash the file content but after
18e2588e11 (sha1_file: switch uses of SHA-1 to the_hash_algo -
2018-02-01), there is a chance that the_hash_algo (required by
index_path) may still be uninitialized if no repository is found.

Executing index_path() when the_hash_algo is NULL (or points to unknown
algo) either crashes or dies. Let's make it a bit safer by explicitly
falling back to SHA-1 (so that the diff output remains the same as
before, compared to the alternative that we simply do not hash).

dòng được

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 diff.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/diff.c b/diff.c
index 21c3838b25..f5755425b6 100644
--- a/diff.c
+++ b/diff.c
@@ -3995,6 +3995,18 @@ static void run_diff(struct diff_filepair *p, struct 
diff_options *o)
return;
}
 
+   /*
+* NEEDSWORK: When running in no-index mode (and no repo is
+* found, thus no hash algo conifugred), fall back to SHA-1
+* hashing (which is used by diff_fill_oid_info below) to
+* avoid regression in diff output.
+*
+* In future, perhaps we can allow the user to specify their
+* hash algorithm from command line in this mode.
+*/
+   if (o->flags.no_index && !the_hash_algo)
+   the_hash_algo = _algos[GIT_HASH_SHA1];
+
diff_fill_oid_info(one);
diff_fill_oid_info(two);
 
-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 1/5] setup.c: initialize the_repository correctly in all cases

2018-02-23 Thread Nguyễn Thái Ngọc Duy
There are many ways for any command to access a git repository:

- most of them will try to discover the .git dir via
  setup_git_directory() and friends

- the server side programs already know where the repo is and prepare
  with enter_repo()

- special commands that deal with repo creation (init/clone) use
  init_db() once the new repo is ready for access.

- somebody accesses $GIT_DIR before any of above functions are called
  and accidentally sets up a git repository by set_git_dir() alone

"the_repository" is partially set up via set_git_dir() at some point
in all four cases. The hash algorithm though is configured later after
.git/config is read.

So far proper repo initialization is done only for the first case [1].
The second case is not covered (but that's fine [3]). The third case
was found and worked around in [2]. The fourth case is a buggy one,
which should be fixed already by jk/no-looking-at-dotgit-outside-repo
and never happens again.

This patch makes sure all cases initialize the hash algorithm in
the_repository correctly. Both second and third cases must run
check_repo_format() before "entering" it. Eventually we probably just
rename this function to init_repo() or something.

[1] 78a6766802 (Integrate hash algorithm support with repo setup -
2017-11-12)

[2] e26f7f19b6 (repository: pre-initialize hash algo pointer -
2018-01-19)

[3] the reason server side is still running ok with no hash algo before
[2] is because the programs that use enter_repo() do very
little (and unlikely to hash anything) then spawn a new
program (like pack-objects or upload-archive) to do the heavy
lifting. These programs already use setup_git_directory() or the
gently version

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/init-db.c | 3 ++-
 cache.h   | 3 ++-
 path.c| 2 +-
 setup.c   | 5 -
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index c9b7946bad..d119d9906b 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -9,6 +9,7 @@
 #include "builtin.h"
 #include "exec_cmd.h"
 #include "parse-options.h"
+#include "repository.h"
 
 #ifndef DEFAULT_GIT_TEMPLATE_DIR
 #define DEFAULT_GIT_TEMPLATE_DIR "/usr/share/git-core/templates"
@@ -369,7 +370,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 * config file, so this will not fail.  What we are catching
 * is an attempt to reinitialize new repository with an old tool.
 */
-   check_repository_format();
+   check_repository_format(the_repository);
 
reinit = create_default_files(template_dir, original_git_dir);
 
diff --git a/cache.h b/cache.h
index 21fbcc2414..6b97138264 100644
--- a/cache.h
+++ b/cache.h
@@ -894,6 +894,7 @@ extern int repository_format_precious_objects;
 extern char *repository_format_partial_clone;
 extern const char *core_partial_clone_filter_default;
 
+struct repository;
 struct repository_format {
int version;
int precious_objects;
@@ -926,7 +927,7 @@ int verify_repository_format(const struct repository_format 
*format,
  * set_git_dir() before calling this, and use it only for "are we in a valid
  * repo?".
  */
-extern void check_repository_format(void);
+extern void check_repository_format(struct repository *);
 
 #define MTIME_CHANGED  0x0001
 #define CTIME_CHANGED  0x0002
diff --git a/path.c b/path.c
index da8b655730..a544252198 100644
--- a/path.c
+++ b/path.c
@@ -827,7 +827,7 @@ const char *enter_repo(const char *path, int strict)
 
if (is_git_directory(".")) {
set_git_dir(".");
-   check_repository_format();
+   check_repository_format(the_repository);
return path;
}
 
diff --git a/setup.c b/setup.c
index c5d55dcee4..a82103832e 100644
--- a/setup.c
+++ b/setup.c
@@ -1180,11 +1180,14 @@ int git_config_perm(const char *var, const char *value)
return -(i & 0666);
 }
 
-void check_repository_format(void)
+/* optionally configure "repo" to the correct format */
+void check_repository_format(struct repository *repo)
 {
struct repository_format repo_fmt;
check_repository_format_gently(get_git_dir(), _fmt, NULL);
startup_info->have_repository = 1;
+   if (repo)
+   repo_set_hash_algo(repo, repo_fmt.hash_algo);
 }
 
 /*
-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 5/5] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Nguyễn Thái Ngọc Duy
This reverts commit e26f7f19b6c7485f04234946a59ab8f4fd21d6d1. The root
problem, git clone not setting up the_hash_algo, has been fixed in the
previous patch.

Since this is a dangerous move and could potentially break stuff after
release (and leads to workaround like the reverted commit), the
workaround technically remains, but is hidden behind a new environment
variable GIT_HASH_FIXUP. This should let the users continue to use git
while we fix the problem. This variable can be deleted after one or two
releases.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 common-main.c| 10 ++
 repository.c |  2 +-
 t/helper/test-dump-split-index.c |  2 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/common-main.c b/common-main.c
index 6a689007e7..fbfa98c3b8 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "exec_cmd.h"
 #include "attr.h"
+#include "repository.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -40,5 +41,14 @@ int main(int argc, const char **argv)
 
restore_sigpipe_to_default();
 
+   /*
+* Temporary recovery measure while hashing code is being
+* refactored. This variable should be gone after everybody
+* has used the_hash_algo in one or two releases and nobody
+* complains anything.
+*/
+   if (getenv("GIT_HASH_FIXUP"))
+   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
return cmd_main(argc, argv);
 }
diff --git a/repository.c b/repository.c
index 4ffbe9bc94..0d715f4fdb 100644
--- a/repository.c
+++ b/repository.c
@@ -5,7 +5,7 @@
 
 /* The main repository */
 static struct repository the_repo = {
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 
_algos[GIT_HASH_SHA1], 0, 0
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, NULL, 
0, 0
 };
 struct repository *the_repository = _repo;
 
diff --git a/t/helper/test-dump-split-index.c b/t/helper/test-dump-split-index.c
index e44430b699..b759fe3fc1 100644
--- a/t/helper/test-dump-split-index.c
+++ b/t/helper/test-dump-split-index.c
@@ -12,6 +12,8 @@ int cmd_main(int ac, const char **av)
struct split_index *si;
int i;
 
+   setup_git_directory();
+
do_read_index(_index, av[1], 1);
printf("own %s\n", sha1_to_hex(the_index.sha1));
si = the_index.split_index;
-- 
2.16.1.435.g8f24da2e1a



Re: [PATCH 1/2] setup.c: initialize the_repository correctly in all cases

2018-02-23 Thread Duy Nguyen
On Sat, Feb 24, 2018 at 5:17 AM, brian m. carlson
 wrote:
> On Fri, Feb 23, 2018 at 04:56:39PM +0700, Nguyễn Thái Ngọc Duy wrote:
>> [3] the reason server side is still running ok with no hash algo
>> before [2] is because the programs that use enter_repo() do very
>> little then spawn a new program (like pack-objects or
>> upload-archive) to do the heavy lifting. These programs already
>> use setup_git_dir..()
>
> You have "..()" here.  Did you want to say "()." instead?

It's because we have two functions, setup_git_directory() and
setup_git_directory_gently() and I tend to just put '*' instead of
listing both. But since it causes confusion, I'm listing both in v2.
-- 
Duy


[PATCH] commit-graph: fix some "plain integer as NULL pointer" warnings

2018-02-23 Thread Ramsay Jones

Signed-off-by: Ramsay Jones 
---

Hi Derrick,

If you need to re-roll your 'ds/commit-graph' branch, could you
please squash this into the relevant patches (corresponding to
commits 0fd2d95ee6 ["commit-graph: implement --set-latest"],
a33b9b79ff ["commit-graph: implement --delete-expired"] and finally
commit b534f9e961 ["commit-graph: implement git commit-graph read"]).

Also, I note that two symbols are only used in commit-graph.c but
are declared extern. Namely 'commit_graph' (declared line #42) and
'prepare_commit_graph' (declared line #212). Do these symbols need
to be extern (in future patches, perhaps)?

Thanks!

ATB,
Ramsay Jones

 builtin/commit-graph.c | 2 +-
 commit-graph.c | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index be1883f1b..efdb003ca 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -33,7 +33,7 @@ static struct opts_commit_graph {
 
 static int graph_read(int argc, const char **argv)
 {
-   struct commit_graph *graph = 0;
+   struct commit_graph *graph = NULL;
struct strbuf full_path = STRBUF_INIT;
 
static struct option builtin_commit_graph_read_options[] = {
diff --git a/commit-graph.c b/commit-graph.c
index fc5ee7e99..c2f443436 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -45,7 +45,7 @@ char *get_graph_latest_filename(const char *obj_dir)
 {
struct strbuf fname = STRBUF_INIT;
strbuf_addf(, "%s/info/graph-latest", obj_dir);
-   return strbuf_detach(, 0);
+   return strbuf_detach(, NULL);
 }
 
 char *get_graph_latest_contents(const char *obj_dir)
@@ -60,7 +60,7 @@ char *get_graph_latest_contents(const char *obj_dir)
FREE_AND_NULL(fname);
 
if (!f)
-   return 0;
+   return NULL;
 
while (!feof(f)) {
if (fgets(buf, sizeof(buf), f))
@@ -95,10 +95,10 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
unsigned char graph_version, hash_version;
 
if (fd < 0)
-   return 0;
+   return NULL;
if (fstat(fd, )) {
close(fd);
-   return 0;
+   return NULL;
}
graph_size = xsize_t(st.st_size);
 
-- 
2.16.0


[PATCH v2 2/2] t5536: simplify checking of messages output to stderr

2018-02-23 Thread Ramsay Jones
From: SZEDER Gábor 

Commit 2071e05ed2 ("t5536: new test of refspec conflicts when
fetching", 2013-10-30), introduced the verify_stderr() function
which was used to verify that certain fatal/warning messages were
issued by a given git command. In addition, verify_stderr() would
filter a specific "fatal: The remote end hung up unexpectedly"
message, which may, or may not, be present (depending on the
relative timing of the git-fetch and git-upload-pack processes).

The verify_stderr() function has seen several modifications, which
has introduced a couple of minor problems. For example, commit
1edbaac3bb ("tests: use test_i18n* functions to suppress false
positives", 2016-06-17) introduced an inappropriate test_i18ngrep
call and commit f096e6e826 ("fetch: improve the error messages
emitted for conflicting refspecs", 2013-10-30) included an
ineffective invocation of sort at the end of a grep pipeline.

Instead of fixing these minor problems in verify_stderr(), we take
the simpler approach of directly searching the error file, using
test_i18ngrep, for the specific message(s) we expect. (The only
minor downside is that we would not notice any new messages).

Signed-off-by: Ramsay Jones 
---
 t/t5536-fetch-conflicts.sh | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/t/t5536-fetch-conflicts.sh b/t/t5536-fetch-conflicts.sh
index 644736b8a..91f28c2f7 100755
--- a/t/t5536-fetch-conflicts.sh
+++ b/t/t5536-fetch-conflicts.sh
@@ -18,14 +18,6 @@ setup_repository () {
)
 }
 
-verify_stderr () {
-   cat >expected &&
-   # We're not interested in the error
-   # "fatal: The remote end hung up unexpectedly":
-   test_i18ngrep -E '^(fatal|warning):' error | grep -v 'hung up' >actual 
| sort &&
-   test_i18ncmp expected actual
-}
-
 test_expect_success 'setup' '
git commit --allow-empty -m "Initial" &&
git branch branch1 &&
@@ -48,9 +40,7 @@ test_expect_success 'fetch conflict: config vs. config' '
"+refs/heads/branch2:refs/remotes/origin/branch1" && (
cd ccc &&
test_must_fail git fetch origin 2>error &&
-   verify_stderr <<-\EOF
-   fatal: Cannot fetch both refs/heads/branch1 and 
refs/heads/branch2 to refs/remotes/origin/branch1
-   EOF
+   test_i18ngrep "fatal: Cannot fetch both refs/heads/branch1 and 
refs/heads/branch2 to refs/remotes/origin/branch1" error
)
 '
 
@@ -77,9 +67,7 @@ test_expect_success 'fetch conflict: arg vs. arg' '
test_must_fail git fetch origin \
refs/heads/*:refs/remotes/origin/* \
refs/heads/branch2:refs/remotes/origin/branch1 2>error 
&&
-   verify_stderr <<-\EOF
-   fatal: Cannot fetch both refs/heads/branch1 and 
refs/heads/branch2 to refs/remotes/origin/branch1
-   EOF
+   test_i18ngrep "fatal: Cannot fetch both refs/heads/branch1 and 
refs/heads/branch2 to refs/remotes/origin/branch1" error
)
 '
 
@@ -90,10 +78,8 @@ test_expect_success 'fetch conflict: criss-cross args' '
git fetch origin \
refs/heads/branch1:refs/remotes/origin/branch2 \
refs/heads/branch2:refs/remotes/origin/branch1 2>error 
&&
-   verify_stderr <<-\EOF
-   warning: refs/remotes/origin/branch1 usually tracks 
refs/heads/branch1, not refs/heads/branch2
-   warning: refs/remotes/origin/branch2 usually tracks 
refs/heads/branch2, not refs/heads/branch1
-   EOF
+   test_i18ngrep "warning: refs/remotes/origin/branch1 usually 
tracks refs/heads/branch1, not refs/heads/branch2" error &&
+   test_i18ngrep "warning: refs/remotes/origin/branch2 usually 
tracks refs/heads/branch2, not refs/heads/branch1" error
)
 '
 
-- 
2.16.0


[PATCH v2 1/2] t4151: consolidate multiple calls to test_i18ngrep

2018-02-23 Thread Ramsay Jones

Attempting to grep the output of test_i18ngrep will not work under a
poison build, since the output is (almost) guaranteed not to have the
string you are looking for. In this case, we have a test_i18ngrep call
which attempts to filter the contents of a file, which was itself the
result of a call to test_i18ngrep. In this case, we can achieve the
same effect with a single call to test_i18ngrep (without creating the
intermediate file), since the second regular expression can be used
without change to filter the original input.

Also, replace a call to test_i18ncmp with test_cmp, since the content
being compared is not subject to i18n anyway.

Signed-off-by: Ramsay Jones 
---
 t/t4151-am-abort.sh | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/t/t4151-am-abort.sh b/t/t4151-am-abort.sh
index 9473c2779..16432781d 100755
--- a/t/t4151-am-abort.sh
+++ b/t/t4151-am-abort.sh
@@ -46,9 +46,8 @@ do
 
test_expect_success "am$with3 --skip continue after failed am$with3" '
test_must_fail git am$with3 --skip >output &&
-   test_i18ngrep "^Applying" output >output.applying &&
-   test_i18ngrep "^Applying: 6$" output.applying &&
-   test_i18ncmp file-2-expect file-2 &&
+   test_i18ngrep "^Applying: 6$" output &&
+   test_cmp file-2-expect file-2 &&
test ! -f .git/MERGE_RR
'
 
-- 
2.16.0


[PATCH v2 0/2] test_i18ngrep

2018-02-23 Thread Ramsay Jones

Since 'sg/test-i18ngrep' has now graduated to master, unlike v1, these
patches are built on top of current master (@e3a80781f). In addition,
the second patch has been replaced with the patch from SZEDER Gábor[1].

[Hopefully, I have included that patch correctly! If not, please amend
as required. Thanks!]

[1] 
https://public-inbox.org/git/%3c20180213100437.15685-1-szeder@gmail.com%3E/

Ramsay Jones (2):
  t4151: consolidate multiple calls to test_i18ngrep
  t5536: simplify checking of messages output to stderr

 t/t4151-am-abort.sh|  5 ++---
 t/t5536-fetch-conflicts.sh | 22 --
 2 files changed, 6 insertions(+), 21 deletions(-)

-- 
2.16.0


Re: [PATCH v3 21/35] fetch-pack: perform a fetch using v2

2018-02-23 Thread Jonathan Tan
On Tue,  6 Feb 2018 17:12:58 -0800
Brandon Williams  wrote:

> + while ((oid = get_rev())) {
> + packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> + if (++haves_added >= INITIAL_FLUSH)
> + break;
> + };

Unnecessary semicolon after closing brace.

> + /* Filter 'ref' by 'sought' and those that aren't local 
> */
> + if (everything_local(args, , sought, nr_sought))
> + state = FETCH_DONE;
> + else
> + state = FETCH_SEND_REQUEST;
> + break;

I haven't looked at this patch in detail, but I found a bug that can be
reproduced if you patch the following onto this patch:

--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -124,6 +124,7 @@ test_expect_success 'clone with file:// using protocol 
v2' '
 
 test_expect_success 'fetch with file:// using protocol v2' '
test_commit -C file_parent two &&
+   git -C file_parent tag -d one &&
 
GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
fetch origin 2>log &&
@@ -133,7 +134,8 @@ test_expect_success 'fetch with file:// using protocol 
v2' '
test_cmp expect actual &&
 
# Server responded using protocol v2
-   grep "fetch< version 2" log
+   grep "fetch< version 2" log &&
+   grep "have " log
 '

Merely including the second hunk (the one with 'grep "have "') does not
make the test fail, but including both the first and second hunks does.
That is, fetch v2 emits "have" only for remote refs that point to
objects we already have, not for local refs.

Everything still appears to work, except that packfiles are usually much
larger than they need to be.

I did some digging in the code and found out that the equivalent of
find_common() (which calls `for_each_ref(rev_list_insert_ref_oid,
NULL)`) was not called in v2. In v1, find_common() is called immediately
after everything_local(), but there is no equivalent in v2. (I quoted
the invocation of everything_local() in v2 above.)


[PATCHv4 11/27] sha1_file: add repository argument to read_info_alternates

2018-02-23 Thread Stefan Beller
See previous patch for explanation.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index e5fe0aa04ef..f93d3b95b54 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -387,7 +387,9 @@ static int alt_odb_usable(struct raw_object_store *o,
  * SHA1, an extra slash for the first level indirection, and the
  * terminating NUL.
  */
-static void read_info_alternates(const char * relative_base, int depth);
+#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
+static void read_info_alternates_the_repository(const char *relative_base,
+   int depth);
 #define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
 static int link_alt_odb_entry_the_repository(const char *entry,
const char *relative_base, int depth, const char *normalized_objdir)
@@ -428,7 +430,7 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
ent->next = NULL;
 
/* recursively add alternates */
-   read_info_alternates(pathbuf.buf, depth + 1);
+   read_info_alternates(the_repository, pathbuf.buf, depth + 1);
 
strbuf_release();
return 0;
@@ -494,7 +496,8 @@ static void link_alt_odb_entries(const char *alt, int sep,
strbuf_release();
 }
 
-static void read_info_alternates(const char * relative_base, int depth)
+static void read_info_alternates_the_repository(const char *relative_base,
+   int depth)
 {
char *path;
struct strbuf buf = STRBUF_INIT;
@@ -678,7 +681,7 @@ void prepare_alt_odb(void)
_repository->objects.alt_odb_list;
link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
 
-   read_info_alternates(get_object_directory(), 0);
+   read_info_alternates(the_repository, get_object_directory(), 0);
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 14/27] sha1_file: allow link_alt_odb_entries to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Actually this also allows read_info_alternates and link_alt_odb_entry to
handle arbitrary repositories, but link_alt_odb_entries is the most
interesting function in this set of functions, hence the commit subject.

These functions span a strongly connected component in the function
graph, i.e. the recursive call chain might look like

  -> link_alt_odb_entries
-> link_alt_odb_entry
  -> read_info_alternates
-> link_alt_odb_entries

That is why we need to convert all these functions at the same time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object-store.h |  4 
 sha1_file.c| 36 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/object-store.h b/object-store.h
index 95f328482aa..08cd48ade11 100644
--- a/object-store.h
+++ b/object-store.h
@@ -18,6 +18,10 @@ struct alternate_object_database {
char loose_objects_subdir_seen[256];
struct oid_array loose_objects_cache;
 
+   /*
+* Path to the alternative object store. If this is a relative path,
+* it is relative to the current working directory.
+*/
char path[FLEX_ARRAY];
 };
 #define prepare_alt_odb(r) prepare_alt_odb_##r()
diff --git a/sha1_file.c b/sha1_file.c
index 381eaf8221d..ab42d430b8a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -388,11 +388,10 @@ static int alt_odb_usable(struct raw_object_store *o,
  * SHA1, an extra slash for the first level indirection, and the
  * terminating NUL.
  */
-#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
-static void read_info_alternates_the_repository(const char *relative_base,
-   int depth);
-#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
-static int link_alt_odb_entry_the_repository(const char *entry,
+static void read_info_alternates(struct repository *r,
+const char *relative_base,
+int depth);
+static int link_alt_odb_entry(struct repository *r, const char *entry,
const char *relative_base, int depth, const char *normalized_objdir)
 {
struct alternate_object_database *ent;
@@ -418,7 +417,7 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
strbuf_setlen(, pathbuf.len - 1);
 
-   if (!alt_odb_usable(_repository->objects, , 
normalized_objdir)) {
+   if (!alt_odb_usable(>objects, , normalized_objdir)) {
strbuf_release();
return -1;
}
@@ -426,12 +425,12 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
ent = alloc_alt_odb(pathbuf.buf);
 
/* add the alternate entry */
-   *the_repository->objects.alt_odb_tail = ent;
-   the_repository->objects.alt_odb_tail = &(ent->next);
+   *r->objects.alt_odb_tail = ent;
+   r->objects.alt_odb_tail = &(ent->next);
ent->next = NULL;
 
/* recursively add alternates */
-   read_info_alternates(the_repository, pathbuf.buf, depth + 1);
+   read_info_alternates(r, pathbuf.buf, depth + 1);
 
strbuf_release();
return 0;
@@ -466,12 +465,8 @@ static const char *parse_alt_odb_entry(const char *string,
return end;
 }
 
-#define link_alt_odb_entries(r, a, s, rb, d) \
-   link_alt_odb_entries_##r(a, s, rb, d)
-static void link_alt_odb_entries_the_repository(const char *alt,
-   int sep,
-   const char *relative_base,
-   int depth)
+static void link_alt_odb_entries(struct repository *r, const char *alt,
+int sep, const char *relative_base, int depth)
 {
struct strbuf objdirbuf = STRBUF_INIT;
struct strbuf entry = STRBUF_INIT;
@@ -485,7 +480,7 @@ static void link_alt_odb_entries_the_repository(const char 
*alt,
return;
}
 
-   strbuf_add_absolute_path(, get_object_directory());
+   strbuf_add_absolute_path(, r->objects.objectdir);
if (strbuf_normalize_path() < 0)
die("unable to normalize object directory: %s",
objdirbuf.buf);
@@ -494,15 +489,16 @@ static void link_alt_odb_entries_the_repository(const 
char *alt,
alt = parse_alt_odb_entry(alt, sep, );
if (!entry.len)
continue;
-   link_alt_odb_entry(the_repository, entry.buf,
+   link_alt_odb_entry(r, entry.buf,
   relative_base, depth, objdirbuf.buf);
}
strbuf_release();
strbuf_release();
 }
 
-static void read_info_alternates_the_repository(const char *relative_base,
-   int depth)
+static void 

[PATCHv4 12/27] sha1_file: add repository argument to link_alt_odb_entries

2018-02-23 Thread Stefan Beller
See previous patch for explanation.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index f93d3b95b54..8fb56ca0362 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -465,8 +465,12 @@ static const char *parse_alt_odb_entry(const char *string,
return end;
 }
 
-static void link_alt_odb_entries(const char *alt, int sep,
-const char *relative_base, int depth)
+#define link_alt_odb_entries(r, a, s, rb, d) \
+   link_alt_odb_entries_##r(a, s, rb, d)
+static void link_alt_odb_entries_the_repository(const char *alt,
+   int sep,
+   const char *relative_base,
+   int depth)
 {
struct strbuf objdirbuf = STRBUF_INIT;
struct strbuf entry = STRBUF_INIT;
@@ -509,7 +513,7 @@ static void read_info_alternates_the_repository(const char 
*relative_base,
return;
}
 
-   link_alt_odb_entries(buf.buf, '\n', relative_base, depth);
+   link_alt_odb_entries(the_repository, buf.buf, '\n', relative_base, 
depth);
strbuf_release();
free(path);
 }
@@ -563,7 +567,8 @@ void add_to_alternates_file(const char *reference)
if (commit_lock_file())
die_errno("unable to move new alternates file into 
place");
if (the_repository->objects.alt_odb_tail)
-   link_alt_odb_entries(reference, '\n', NULL, 0);
+   link_alt_odb_entries(the_repository, reference,
+'\n', NULL, 0);
}
free(alts);
 }
@@ -576,7 +581,8 @@ void add_to_alternates_memory(const char *reference)
 */
prepare_alt_odb();
 
-   link_alt_odb_entries(reference, '\n', NULL, 0);
+   link_alt_odb_entries(the_repository, reference,
+'\n', NULL, 0);
 }
 
 /*
@@ -679,7 +685,8 @@ void prepare_alt_odb(void)
 
the_repository->objects.alt_odb_tail =
_repository->objects.alt_odb_list;
-   link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
+   link_alt_odb_entries(the_repository, alt,
+PATH_SEP, NULL, 0);
 
read_info_alternates(the_repository, get_object_directory(), 0);
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 18/27] sha1_file: add repository argument to open_sha1_file

2018-02-23 Thread Stefan Beller
Add a repository argument to allow the open_sha1_file caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index b7e6e4a9bfc..7e471e0cf4b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -898,7 +898,9 @@ static int stat_sha1_file_the_repository(const unsigned 
char *sha1,
  * Like stat_sha1_file(), but actually open the object and return the
  * descriptor. See the caveats on the "path" parameter above.
  */
-static int open_sha1_file(const unsigned char *sha1, const char **path)
+#define open_sha1_file(r, s, p) open_sha1_file_##r(s, p)
+static int open_sha1_file_the_repository(const unsigned char *sha1,
+const char **path)
 {
int fd;
struct alternate_object_database *alt;
@@ -941,7 +943,7 @@ static void *map_sha1_file_1(const char *path,
if (path)
fd = git_open(path);
else
-   fd = open_sha1_file(sha1, );
+   fd = open_sha1_file(the_repository, sha1, );
map = NULL;
if (fd >= 0) {
struct stat st;
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 17/27] sha1_file: add repository argument to stat_sha1_file

2018-02-23 Thread Stefan Beller
Add a repository argument to allow the stat_sha1_file caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 8ef2f856035..b7e6e4a9bfc 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -869,8 +869,9 @@ int git_open_cloexec(const char *name, int flags)
  * Note that it may point to static storage and is only valid until another
  * call to sha1_file_name(), etc.
  */
-static int stat_sha1_file(const unsigned char *sha1, struct stat *st,
- const char **path)
+#define stat_sha1_file(r, s, st, p) stat_sha1_file_##r(s, st, p)
+static int stat_sha1_file_the_repository(const unsigned char *sha1,
+struct stat *st, const char **path)
 {
struct alternate_object_database *alt;
static struct strbuf buf = STRBUF_INIT;
@@ -1176,7 +1177,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
-   if (stat_sha1_file(sha1, , ) < 0)
+   if (stat_sha1_file(the_repository, sha1, , ) < 0)
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = st.st_size;
@@ -1390,7 +1391,7 @@ void *read_sha1_file_extended(const unsigned char *sha1,
die("replacement %s not found for %s",
sha1_to_hex(repl), sha1_to_hex(sha1));
 
-   if (!stat_sha1_file(repl, , ))
+   if (!stat_sha1_file(the_repository, repl, , ))
die("loose object %s (stored in %s) is corrupt",
sha1_to_hex(repl), path);
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 07/27] pack: move prepare_packed_git_run_once to object store

2018-02-23 Thread Stefan Beller
Each repository's object store can be initialized independently, so
they must not share a run_once variable.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 8 +++-
 packfile.c | 7 +++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 4f768465a14..bd6441e525f 100644
--- a/object-store.h
+++ b/object-store.h
@@ -89,9 +89,15 @@ struct raw_object_store {
 
struct alternate_object_database *alt_odb_list;
struct alternate_object_database **alt_odb_tail;
+
+   /*
+* Whether packed_git has already been populated with this repository's
+* packs.
+*/
+   unsigned packed_git_initialized : 1;
 };
 
-#define RAW_OBJECT_STORE_INIT(o) { NULL, NULL, 
LIST_HEAD_INIT(o.packed_git_mru), NULL, NULL }
+#define RAW_OBJECT_STORE_INIT(o) { NULL, NULL, 
LIST_HEAD_INIT(o.packed_git_mru), NULL, NULL, 0 }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index 65d9a4f6c61..3ee349ab1bd 100644
--- a/packfile.c
+++ b/packfile.c
@@ -883,12 +883,11 @@ static void prepare_packed_git_mru(void)
list_add_tail(>mru, _repository->objects.packed_git_mru);
 }
 
-static int prepare_packed_git_run_once = 0;
 void prepare_packed_git(void)
 {
struct alternate_object_database *alt;
 
-   if (prepare_packed_git_run_once)
+   if (the_repository->objects.packed_git_initialized)
return;
prepare_packed_git_one(get_object_directory(), 1);
prepare_alt_odb();
@@ -896,13 +895,13 @@ void prepare_packed_git(void)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
prepare_packed_git_mru();
-   prepare_packed_git_run_once = 1;
+   the_repository->objects.packed_git_initialized = 1;
 }
 
 void reprepare_packed_git(void)
 {
approximate_object_count_valid = 0;
-   prepare_packed_git_run_once = 0;
+   the_repository->objects.packed_git_initialized = 0;
prepare_packed_git();
 }
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 09/27] sha1_file: add raw_object_store argument to alt_odb_usable

2018-02-23 Thread Stefan Beller
Add a raw_object_store to alt_odb_usable to be more specific about which
repository to act on. The choice of the repository is delegated to its
only caller link_alt_odb_entry.

Signed-off-by: Stefan Beller 
---
 sha1_file.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index ac25146076e..8e6f213f9d9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -344,7 +344,9 @@ static const char *alt_sha1_path(struct 
alternate_object_database *alt,
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
-static int alt_odb_usable(struct strbuf *path, const char *normalized_objdir)
+static int alt_odb_usable(struct raw_object_store *o,
+ struct strbuf *path,
+ const char *normalized_objdir)
 {
struct alternate_object_database *alt;
 
@@ -360,7 +362,7 @@ static int alt_odb_usable(struct strbuf *path, const char 
*normalized_objdir)
 * Prevent the common mistake of listing the same
 * thing twice, or object directory itself.
 */
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   for (alt = o->alt_odb_list; alt; alt = alt->next) {
if (!fspathcmp(path->buf, alt->path))
return 0;
}
@@ -412,7 +414,7 @@ static int link_alt_odb_entry(const char *entry, const char 
*relative_base,
while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
strbuf_setlen(, pathbuf.len - 1);
 
-   if (!alt_odb_usable(, normalized_objdir)) {
+   if (!alt_odb_usable(_repository->objects, , 
normalized_objdir)) {
strbuf_release();
return -1;
}
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 24/27] sha1_file: allow open_sha1_file to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 952deb02195..73bbe3a3e88 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -897,9 +897,8 @@ static int stat_sha1_file(struct repository *r, const 
unsigned char *sha1,
  * Like stat_sha1_file(), but actually open the object and return the
  * descriptor. See the caveats on the "path" parameter above.
  */
-#define open_sha1_file(r, s, p) open_sha1_file_##r(s, p)
-static int open_sha1_file_the_repository(const unsigned char *sha1,
-const char **path)
+static int open_sha1_file(struct repository *r,
+ const unsigned char *sha1, const char **path)
 {
int fd;
struct alternate_object_database *alt;
@@ -907,7 +906,7 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(the_repository, , sha1);
+   sha1_file_name(r, , sha1);
*path = buf.buf;
 
fd = git_open(*path);
@@ -915,8 +914,8 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
return fd;
most_interesting_errno = errno;
 
-   prepare_alt_odb(the_repository);
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   prepare_alt_odb(r);
+   for (alt = r->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
fd = git_open(*path);
if (fd >= 0)
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 13/27] sha1_file: add repository argument to prepare_alt_odb

2018-02-23 Thread Stefan Beller
See previous patch for explanation.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/fsck.c |  2 +-
 object-store.h |  3 ++-
 packfile.c |  2 +-
 sha1_file.c| 13 +++--
 sha1_name.c|  2 +-
 5 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index c3a339193c8..cdcf2fa926c 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -722,7 +722,7 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
} else {
fsck_object_dir(get_object_directory());
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list;
 alt; alt = alt->next)
fsck_object_dir(alt->path);
diff --git a/object-store.h b/object-store.h
index 31ca4d3f340..95f328482aa 100644
--- a/object-store.h
+++ b/object-store.h
@@ -20,7 +20,8 @@ struct alternate_object_database {
 
char path[FLEX_ARRAY];
 };
-void prepare_alt_odb(void);
+#define prepare_alt_odb(r) prepare_alt_odb_##r()
+void prepare_alt_odb_the_repository(void);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 int foreach_alt_odb(alt_odb_fn, void*);
diff --git a/packfile.c b/packfile.c
index 8d7b1dd5f9c..1e02fe3b0bb 100644
--- a/packfile.c
+++ b/packfile.c
@@ -889,7 +889,7 @@ void prepare_packed_git(void)
if (the_repository->objects.packed_git_initialized)
return;
prepare_packed_git_one(get_object_directory(), 1);
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
diff --git a/sha1_file.c b/sha1_file.c
index 8fb56ca0362..381eaf8221d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -23,6 +23,7 @@
 #include "sha1-lookup.h"
 #include "bulk-checkin.h"
 #include "repository.h"
+#include "object-store.h"
 #include "streaming.h"
 #include "dir.h"
 #include "list.h"
@@ -579,7 +580,7 @@ void add_to_alternates_memory(const char *reference)
 * Make sure alternates are initialized, or else our entry may be
 * overwritten when they are.
 */
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
 
link_alt_odb_entries(the_repository, reference,
 '\n', NULL, 0);
@@ -665,7 +666,7 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
struct alternate_object_database *ent;
int r = 0;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (ent = the_repository->objects.alt_odb_list; ent; ent = ent->next) {
r = fn(ent, cb);
if (r)
@@ -674,7 +675,7 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
return r;
 }
 
-void prepare_alt_odb(void)
+void prepare_alt_odb_the_repository(void)
 {
const char *alt;
 
@@ -728,7 +729,7 @@ static int check_and_freshen_local(const unsigned char 
*sha1, int freshen)
 static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
 {
struct alternate_object_database *alt;
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
const char *path = alt_sha1_path(alt, sha1);
if (check_and_freshen_file(path, freshen))
@@ -887,7 +888,7 @@ static int stat_sha1_file(const unsigned char *sha1, struct 
stat *st,
if (!lstat(*path, st))
return 0;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
errno = ENOENT;
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
@@ -918,7 +919,7 @@ static int open_sha1_file(const unsigned char *sha1, const 
char **path)
return fd;
most_interesting_errno = errno;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
fd = git_open(*path);
diff --git a/sha1_name.c b/sha1_name.c
index 016c883b5c7..627e93f1447 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -353,7 +353,7 @@ static int init_object_disambiguation(const char *name, int 
len,
 
ds->len = len;
ds->hex_pfx[len] = '\0';
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
return 0;
 }
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 22/27] sha1_file: allow sha1_file_name to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 3 +--
 sha1_file.c| 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index dbd589e0083..8b38330d256 100644
--- a/object-store.h
+++ b/object-store.h
@@ -117,8 +117,7 @@ void raw_object_store_clear(struct raw_object_store *o);
  * Put in `buf` the name of the file in the local object database that
  * would be used to store a loose object with the specified sha1.
  */
-#define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
-void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
+void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1);
 
 #define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
 void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
diff --git a/sha1_file.c b/sha1_file.c
index 831e2da3572..59993a0878d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -321,9 +321,9 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1)
+void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1)
 {
-   strbuf_addstr(buf, get_object_directory());
+   strbuf_addstr(buf, r->objects.objectdir);
strbuf_addch(buf, '/');
fill_sha1_path(buf, sha1);
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 19/27] sha1_file: add repository argument to map_sha1_file_1

2018-02-23 Thread Stefan Beller
Add a repository argument to allow the map_sha1_file_1 caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 7e471e0cf4b..d4af8f56814 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -933,9 +933,10 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
  */
-static void *map_sha1_file_1(const char *path,
-const unsigned char *sha1,
-unsigned long *size)
+#define map_sha1_file_1(r, p, s, si) map_sha1_file_1_##r(p, s, si)
+static void *map_sha1_file_1_the_repository(const char *path,
+   const unsigned char *sha1,
+   unsigned long *size)
 {
void *map;
int fd;
@@ -964,7 +965,7 @@ static void *map_sha1_file_1(const char *path,
 
 void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 {
-   return map_sha1_file_1(NULL, sha1, size);
+   return map_sha1_file_1(the_repository, NULL, sha1, size);
 }
 
 static int unpack_sha1_short_header(git_zstream *stream,
@@ -2200,7 +2201,7 @@ int read_loose_object(const char *path,
 
*contents = NULL;
 
-   map = map_sha1_file_1(path, NULL, );
+   map = map_sha1_file_1(the_repository, path, NULL, );
if (!map) {
error_errno("unable to mmap %s", path);
goto out;
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 06/27] object-store: close all packs upon clearing the object store

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 builtin/am.c   | 2 +-
 builtin/clone.c| 2 +-
 builtin/fetch.c| 2 +-
 builtin/merge.c| 2 +-
 builtin/receive-pack.c | 2 +-
 object.c   | 7 +++
 packfile.c | 4 ++--
 packfile.h | 2 +-
 8 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 6661edc162b..fc1d9b43c51 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1859,7 +1859,7 @@ static void am_run(struct am_state *state, int resume)
 */
if (!state->rebasing) {
am_destroy(state);
-   close_all_packs();
+   close_all_packs(_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
}
 }
diff --git a/builtin/clone.c b/builtin/clone.c
index 101c27a593f..13cfaa6488a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1217,7 +1217,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_disconnect(transport);
 
if (option_dissociate) {
-   close_all_packs();
+   close_all_packs(_repository->objects);
dissociate_from_references();
}
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8ee998ea2ee..4d72efca781 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1478,7 +1478,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 
string_list_clear(, 0);
 
-   close_all_packs();
+   close_all_packs(_repository->objects);
 
argv_array_pushl(_gc_auto, "gc", "--auto", NULL);
if (verbosity < 0)
diff --git a/builtin/merge.c b/builtin/merge.c
index 92ba99a1a5e..0af0c53a632 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -411,7 +411,7 @@ static void finish(struct commit *head_commit,
 * We ignore errors in 'gc --auto', since the
 * user should see them.
 */
-   close_all_packs();
+   close_all_packs(_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
}
}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 75e7f18acef..8e25aae54c8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2027,7 +2027,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
proc.git_cmd = 1;
proc.argv = argv_gc_auto;
 
-   close_all_packs();
+   close_all_packs(_repository->objects);
if (!start_command()) {
if (use_sideband)
copy_to_sideband(proc.err, -1, NULL);
diff --git a/object.c b/object.c
index 367441efa94..a7c238339bd 100644
--- a/object.c
+++ b/object.c
@@ -4,6 +4,7 @@
 #include "tree.h"
 #include "commit.h"
 #include "tag.h"
+#include "packfile.h"
 
 static struct object **obj_hash;
 static int nr_objs, obj_hash_size;
@@ -468,8 +469,6 @@ void raw_object_store_clear(struct raw_object_store *o)
o->alt_odb_tail = NULL;
 
INIT_LIST_HEAD(>packed_git_mru);
-   /*
-* TODO: call close_all_packs once migrated to
-* take an object store argument
-*/
+   close_all_packs(o);
+   o->packed_git = NULL;
 }
diff --git a/packfile.c b/packfile.c
index 59210deaaf7..65d9a4f6c61 100644
--- a/packfile.c
+++ b/packfile.c
@@ -310,11 +310,11 @@ static void close_pack(struct packed_git *p)
close_pack_index(p);
 }
 
-void close_all_packs(void)
+void close_all_packs(struct raw_object_store *o)
 {
struct packed_git *p;
 
-   for (p = the_repository->objects.packed_git; p; p = p->next)
+   for (p = o->packed_git; p; p = p->next)
if (p->do_not_close)
die("BUG: want to close pack marked 'do-not-close'");
else
diff --git a/packfile.h b/packfile.h
index a7fca598d67..6a2c57045ca 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,7 +63,7 @@ extern void close_pack_index(struct packed_git *);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
off_t, unsigned long *);
 extern void close_pack_windows(struct packed_git *);
-extern void close_all_packs(void);
+extern void close_all_packs(struct raw_object_store *o);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
int local);
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 23/27] sha1_file: allow stat_sha1_file to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 59993a0878d..952deb02195 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -869,23 +869,22 @@ int git_open_cloexec(const char *name, int flags)
  * Note that it may point to static storage and is only valid until another
  * call to sha1_file_name(), etc.
  */
-#define stat_sha1_file(r, s, st, p) stat_sha1_file_##r(s, st, p)
-static int stat_sha1_file_the_repository(const unsigned char *sha1,
-struct stat *st, const char **path)
+static int stat_sha1_file(struct repository *r, const unsigned char *sha1,
+ struct stat *st, const char **path)
 {
struct alternate_object_database *alt;
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(the_repository, , sha1);
+   sha1_file_name(r, , sha1);
*path = buf.buf;
 
if (!lstat(*path, st))
return 0;
 
-   prepare_alt_odb(the_repository);
+   prepare_alt_odb(r);
errno = ENOENT;
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   for (alt = r->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
if (!lstat(*path, st))
return 0;
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 27/27] sha1_file: allow sha1_loose_object_info to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
From: Jonathan Nieder 

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 902cba42106..dfc8deec38e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1151,10 +1151,9 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
return parse_sha1_header_extended(hdr, , 0);
 }
 
-#define sha1_loose_object_info(r, s, o, f) sha1_loose_object_info_##r(s, o, f)
-static int sha1_loose_object_info_the_repository(const unsigned char *sha1,
-struct object_info *oi,
-int flags)
+static int sha1_loose_object_info(struct repository *r,
+ const unsigned char *sha1,
+ struct object_info *oi, int flags)
 {
int status = 0;
unsigned long mapsize;
@@ -1178,14 +1177,14 @@ static int sha1_loose_object_info_the_repository(const 
unsigned char *sha1,
if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
-   if (stat_sha1_file(the_repository, sha1, , ) < 0)
+   if (stat_sha1_file(r, sha1, , ) < 0)
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = st.st_size;
return 0;
}
 
-   map = map_sha1_file(the_repository, sha1, );
+   map = map_sha1_file(r, sha1, );
if (!map)
return -1;
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 20/27] sha1_file: add repository argument to map_sha1_file

2018-02-23 Thread Stefan Beller
Add a repository argument to allow map_sha1_file callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

While at it, move the declaration to object-store.h, where it should
be easier to find.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 cache.h| 1 -
 object-store.h | 3 +++
 sha1_file.c| 4 ++--
 streaming.c| 5 -
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 94d85a5c692..57173991839 100644
--- a/cache.h
+++ b/cache.h
@@ -1226,7 +1226,6 @@ extern int force_object_loose(const struct object_id 
*oid, time_t mtime);
 
 extern int git_open_cloexec(const char *name, int flags);
 #define git_open(name) git_open_cloexec(name, O_RDONLY)
-extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
diff --git a/object-store.h b/object-store.h
index c0d1292cfc2..dbd589e0083 100644
--- a/object-store.h
+++ b/object-store.h
@@ -120,4 +120,7 @@ void raw_object_store_clear(struct raw_object_store *o);
 #define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
 void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
 
+#define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
+void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
+
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index d4af8f56814..6ec43e745d7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -963,7 +963,7 @@ static void *map_sha1_file_1_the_repository(const char 
*path,
return map;
 }
 
-void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
+void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size)
 {
return map_sha1_file_1(the_repository, NULL, sha1, size);
 }
@@ -1187,7 +1187,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
return 0;
}
 
-   map = map_sha1_file(sha1, );
+   map = map_sha1_file(the_repository, sha1, );
if (!map)
return -1;
 
diff --git a/streaming.c b/streaming.c
index 5892b50bd89..22d27df55eb 100644
--- a/streaming.c
+++ b/streaming.c
@@ -3,6 +3,8 @@
  */
 #include "cache.h"
 #include "streaming.h"
+#include "repository.h"
+#include "object-store.h"
 #include "packfile.h"
 
 enum input_source {
@@ -335,7 +337,8 @@ static struct stream_vtbl loose_vtbl = {
 
 static open_method_decl(loose)
 {
-   st->u.loose.mapped = map_sha1_file(sha1, >u.loose.mapsize);
+   st->u.loose.mapped = map_sha1_file(the_repository,
+  sha1, >u.loose.mapsize);
if (!st->u.loose.mapped)
return -1;
if ((unpack_sha1_header(>z,
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 08/27] pack: move approximate object count to object store

2018-02-23 Thread Stefan Beller
The approximate_object_count() function maintains a rough count of
objects in a repository to estimate how long object name abbreviates
should be.  Object names are scoped to a repository and the
appropriate length may differ by repository, so the object count
should not be global.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 10 +-
 packfile.c | 11 +--
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/object-store.h b/object-store.h
index bd6441e525f..31ca4d3f340 100644
--- a/object-store.h
+++ b/object-store.h
@@ -90,6 +90,14 @@ struct raw_object_store {
struct alternate_object_database *alt_odb_list;
struct alternate_object_database **alt_odb_tail;
 
+   /*
+* A fast, rough count of the number of objects in the repository.
+* These two fields are not meant for direct access. Use
+* approximate_object_count() instead.
+*/
+   unsigned long approximate_object_count;
+   unsigned approximate_object_count_valid : 1;
+
/*
 * Whether packed_git has already been populated with this repository's
 * packs.
@@ -97,7 +105,7 @@ struct raw_object_store {
unsigned packed_git_initialized : 1;
 };
 
-#define RAW_OBJECT_STORE_INIT(o) { NULL, NULL, 
LIST_HEAD_INIT(o.packed_git_mru), NULL, NULL, 0 }
+#define RAW_OBJECT_STORE_INIT(o) { NULL, NULL, 
LIST_HEAD_INIT(o.packed_git_mru), NULL, NULL, 0, 0, 0 }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index 3ee349ab1bd..8d7b1dd5f9c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -802,8 +802,6 @@ static void prepare_packed_git_one(char *objdir, int local)
strbuf_release();
 }
 
-static int approximate_object_count_valid;
-
 /*
  * Give a fast, rough count of the number of objects in the repository. This
  * ignores loose objects completely. If you have a lot of them, then either
@@ -813,8 +811,8 @@ static int approximate_object_count_valid;
  */
 unsigned long approximate_object_count(void)
 {
-   static unsigned long count;
-   if (!approximate_object_count_valid) {
+   if (!the_repository->objects.approximate_object_count_valid) {
+   unsigned long count;
struct packed_git *p;
 
prepare_packed_git();
@@ -824,8 +822,9 @@ unsigned long approximate_object_count(void)
continue;
count += p->num_objects;
}
+   the_repository->objects.approximate_object_count = count;
}
-   return count;
+   return the_repository->objects.approximate_object_count;
 }
 
 static void *get_next_packed_git(const void *p)
@@ -900,7 +899,7 @@ void prepare_packed_git(void)
 
 void reprepare_packed_git(void)
 {
-   approximate_object_count_valid = 0;
+   the_repository->objects.approximate_object_count_valid = 0;
the_repository->objects.packed_git_initialized = 0;
prepare_packed_git();
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 15/27] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 object-store.h |  3 +--
 sha1_file.c| 12 +---
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/object-store.h b/object-store.h
index 08cd48ade11..24b9a750aef 100644
--- a/object-store.h
+++ b/object-store.h
@@ -24,8 +24,7 @@ struct alternate_object_database {
 */
char path[FLEX_ARRAY];
 };
-#define prepare_alt_odb(r) prepare_alt_odb_##r()
-void prepare_alt_odb_the_repository(void);
+void prepare_alt_odb(struct repository *r);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 int foreach_alt_odb(alt_odb_fn, void*);
diff --git a/sha1_file.c b/sha1_file.c
index ab42d430b8a..03b9bbe8bb7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -671,21 +671,19 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
return r;
 }
 
-void prepare_alt_odb_the_repository(void)
+void prepare_alt_odb(struct repository *r)
 {
const char *alt;
 
-   if (the_repository->objects.alt_odb_tail)
+   if (r->objects.alt_odb_tail)
return;
 
alt = getenv(ALTERNATE_DB_ENVIRONMENT);
 
-   the_repository->objects.alt_odb_tail =
-   _repository->objects.alt_odb_list;
-   link_alt_odb_entries(the_repository, alt,
-PATH_SEP, NULL, 0);
+   r->objects.alt_odb_tail = >objects.alt_odb_list;
+   link_alt_odb_entries(r, alt, PATH_SEP, NULL, 0);
 
-   read_info_alternates(the_repository, get_object_directory(), 0);
+   read_info_alternates(r, r->objects.objectdir, 0);
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 10/27] sha1_file: add repository argument to link_alt_odb_entry

2018-02-23 Thread Stefan Beller
Add a repository argument to allow the link_alt_odb_entry caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

Since the implementation does not yet work with other repositories,
use a wrapper macro to enforce that the caller passes in
the_repository as the first argument. It would be more appealing to
use BUILD_ASSERT_OR_ZERO to enforce this, but that doesn't work
because it requires a compile-time constant and common compilers like
gcc 4.8.4 do not consider "r == the_repository" a compile-time
constant.

This and the following three patches add repository arguments to
link_alt_odb_entry, read_info_alternates, link_alt_odb_entries
and prepare_alt_odb. Three out of the four functions are found
in a recursive call chain, calling each other, and one of them
accesses the repositories `objectdir` (which was migrated; it
was an obvious choice) and `ignore_env` (which we need to keep in
the repository struct for clarify); hence we will pass through the
repository unlike just the object store object + the ignore_env flag.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 8e6f213f9d9..e5fe0aa04ef 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -388,8 +388,9 @@ static int alt_odb_usable(struct raw_object_store *o,
  * terminating NUL.
  */
 static void read_info_alternates(const char * relative_base, int depth);
-static int link_alt_odb_entry(const char *entry, const char *relative_base,
-   int depth, const char *normalized_objdir)
+#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
+static int link_alt_odb_entry_the_repository(const char *entry,
+   const char *relative_base, int depth, const char *normalized_objdir)
 {
struct alternate_object_database *ent;
struct strbuf pathbuf = STRBUF_INIT;
@@ -486,7 +487,8 @@ static void link_alt_odb_entries(const char *alt, int sep,
alt = parse_alt_odb_entry(alt, sep, );
if (!entry.len)
continue;
-   link_alt_odb_entry(entry.buf, relative_base, depth, 
objdirbuf.buf);
+   link_alt_odb_entry(the_repository, entry.buf,
+  relative_base, depth, objdirbuf.buf);
}
strbuf_release();
strbuf_release();
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 26/27] sha1_file: allow map_sha1_file to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 3 +--
 sha1_file.c| 5 +++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index 8b38330d256..afe2f93459b 100644
--- a/object-store.h
+++ b/object-store.h
@@ -119,7 +119,6 @@ void raw_object_store_clear(struct raw_object_store *o);
  */
 void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1);
 
-#define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
-void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
+void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned 
long *size);
 
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index e32f1da6b69..902cba42106 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -959,9 +959,10 @@ static void *map_sha1_file_1(struct repository *r, const 
char *path,
return map;
 }
 
-void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size)
+void *map_sha1_file(struct repository *r,
+   const unsigned char *sha1, unsigned long *size)
 {
-   return map_sha1_file_1(the_repository, NULL, sha1, size);
+   return map_sha1_file_1(r, NULL, sha1, size);
 }
 
 static int unpack_sha1_short_header(git_zstream *stream,
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-23 Thread Stefan Beller
Add a repository argument to allow sha1_file_name callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

While at it, move the declaration to object-store.h, where it should
be easier to find.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 cache.h|  6 --
 http-walker.c  |  3 ++-
 http.c |  5 ++---
 object-store.h |  7 +++
 sha1_file.c| 10 +-
 5 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/cache.h b/cache.h
index 63889acb591..94d85a5c692 100644
--- a/cache.h
+++ b/cache.h
@@ -936,12 +936,6 @@ extern void check_repository_format(void);
 #define DATA_CHANGED0x0020
 #define TYPE_CHANGED0x0040
 
-/*
- * Put in `buf` the name of the file in the local object database that
- * would be used to store a loose object with the specified sha1.
- */
-extern void sha1_file_name(struct strbuf *buf, const unsigned char *sha1);
-
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
diff --git a/http-walker.c b/http-walker.c
index 07c2b1af826..2c33969123a 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "repository.h"
 #include "commit.h"
 #include "walker.h"
 #include "http.h"
@@ -545,7 +546,7 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
-   sha1_file_name(, req->sha1);
+   sha1_file_name(the_repository, , req->sha1);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release();
}
diff --git a/http.c b/http.c
index 31755023a4f..df9dbea59c5 100644
--- a/http.c
+++ b/http.c
@@ -2246,7 +2246,7 @@ struct http_object_request *new_http_object_request(const 
char *base_url,
hashcpy(freq->sha1, sha1);
freq->localfile = -1;
 
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
snprintf(freq->tmpfile, sizeof(freq->tmpfile),
 "%s.temp", filename.buf);
 
@@ -2395,8 +2395,7 @@ int finish_http_object_request(struct http_object_request 
*freq)
unlink_or_warn(freq->tmpfile);
return -1;
}
-
-   sha1_file_name(, freq->sha1);
+   sha1_file_name(the_repository, , freq->sha1);
freq->rename = finalize_object_file(freq->tmpfile, filename.buf);
strbuf_release();
 
diff --git a/object-store.h b/object-store.h
index 24b9a750aef..c0d1292cfc2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -113,4 +113,11 @@ struct raw_object_store {
 
 void raw_object_store_clear(struct raw_object_store *o);
 
+/*
+ * Put in `buf` the name of the file in the local object database that
+ * would be used to store a loose object with the specified sha1.
+ */
+#define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
+void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
+
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index 03b9bbe8bb7..8ef2f856035 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -321,7 +321,7 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-void sha1_file_name(struct strbuf *buf, const unsigned char *sha1)
+void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1)
 {
strbuf_addstr(buf, get_object_directory());
strbuf_addch(buf, '/');
@@ -715,7 +715,7 @@ static int check_and_freshen_local(const unsigned char 
*sha1, int freshen)
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
 
return check_and_freshen_file(buf.buf, freshen);
 }
@@ -876,7 +876,7 @@ static int stat_sha1_file(const unsigned char *sha1, struct 
stat *st,
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
*path = buf.buf;
 
if (!lstat(*path, st))
@@ -905,7 +905,7 @@ static int open_sha1_file(const unsigned char *sha1, const 
char **path)
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
*path = buf.buf;
 
fd = git_open(*path);
@@ -1591,7 +1591,7 @@ static int write_loose_object(const struct object_id 
*oid, char *hdr,
static struct strbuf filename = STRBUF_INIT;
 
strbuf_reset();
-   

[PATCHv4 03/27] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-23 Thread Stefan Beller
In a process with multiple repositories open, alternates should be
associated to a single repository and not shared globally. Move
alt_odb_list and alt_odb_tail into the_repository and adjust callers
to reflect this.

Now that the alternative object data base is per repository, we're
leaking its memory upon freeing a repository. The next patch plugs
this hole.

No functional change intended.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/fsck.c |  4 +++-
 object-store.h |  9 ++---
 packfile.c |  3 ++-
 sha1_file.c| 25 -
 sha1_name.c|  3 ++-
 5 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 9981db22637..78edcb75c45 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "repository.h"
 #include "config.h"
 #include "commit.h"
 #include "tree.h"
@@ -722,7 +723,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
fsck_object_dir(get_object_directory());
 
prepare_alt_odb();
-   for (alt = alt_odb_list; alt; alt = alt->next)
+   for (alt = the_repository->objects.alt_odb_list;
+alt; alt = alt->next)
fsck_object_dir(alt->path);
 
if (check_full) {
diff --git a/object-store.h b/object-store.h
index 5678aa1136c..e78eea1dde3 100644
--- a/object-store.h
+++ b/object-store.h
@@ -1,7 +1,7 @@
 #ifndef OBJECT_STORE_H
 #define OBJECT_STORE_H
 
-extern struct alternate_object_database {
+struct alternate_object_database {
struct alternate_object_database *next;
 
/* see alt_scratch_buf() */
@@ -19,7 +19,7 @@ extern struct alternate_object_database {
struct oid_array loose_objects_cache;
 
char path[FLEX_ARRAY];
-} *alt_odb_list;
+};
 void prepare_alt_odb(void);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
@@ -58,8 +58,11 @@ struct raw_object_store {
 * Cannot be NULL after initialization.
 */
char *objectdir;
+
+   struct alternate_object_database *alt_odb_list;
+   struct alternate_object_database **alt_odb_tail;
 };
-#define RAW_OBJECT_STORE_INIT { NULL }
+#define RAW_OBJECT_STORE_INIT { NULL, NULL, NULL }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index 7dbe8739d17..216ea836ee3 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "list.h"
 #include "pack.h"
+#include "repository.h"
 #include "dir.h"
 #include "mergesort.h"
 #include "packfile.h"
@@ -891,7 +892,7 @@ void prepare_packed_git(void)
return;
prepare_packed_git_one(get_object_directory(), 1);
prepare_alt_odb();
-   for (alt = alt_odb_list; alt; alt = alt->next)
+   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
prepare_packed_git_mru();
diff --git a/sha1_file.c b/sha1_file.c
index 826d7a0ae37..ac25146076e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -22,6 +22,7 @@
 #include "pack-revindex.h"
 #include "sha1-lookup.h"
 #include "bulk-checkin.h"
+#include "repository.h"
 #include "streaming.h"
 #include "dir.h"
 #include "list.h"
@@ -340,9 +341,6 @@ static const char *alt_sha1_path(struct 
alternate_object_database *alt,
return buf->buf;
 }
 
-struct alternate_object_database *alt_odb_list;
-static struct alternate_object_database **alt_odb_tail;
-
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
@@ -362,7 +360,7 @@ static int alt_odb_usable(struct strbuf *path, const char 
*normalized_objdir)
 * Prevent the common mistake of listing the same
 * thing twice, or object directory itself.
 */
-   for (alt = alt_odb_list; alt; alt = alt->next) {
+   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
if (!fspathcmp(path->buf, alt->path))
return 0;
}
@@ -422,8 +420,8 @@ static int link_alt_odb_entry(const char *entry, const char 
*relative_base,
ent = alloc_alt_odb(pathbuf.buf);
 
/* add the alternate entry */
-   *alt_odb_tail = ent;
-   alt_odb_tail = &(ent->next);
+   *the_repository->objects.alt_odb_tail = ent;
+   the_repository->objects.alt_odb_tail = &(ent->next);
ent->next = NULL;
 
/* recursively add alternates */
@@ -557,7 +555,7 @@ void add_to_alternates_file(const char *reference)
fprintf_or_die(out, "%s\n", reference);
if (commit_lock_file())
die_errno("unable to move new alternates file into 
place");
-   if (alt_odb_tail)
+   if 

[PATCHv4 21/27] sha1_file: add repository argument to sha1_loose_object_info

2018-02-23 Thread Stefan Beller
Add a repository argument to allow the sha1_loose_object_info caller
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 6ec43e745d7..831e2da3572 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1154,9 +1154,10 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
return parse_sha1_header_extended(hdr, , 0);
 }
 
-static int sha1_loose_object_info(const unsigned char *sha1,
- struct object_info *oi,
- int flags)
+#define sha1_loose_object_info(r, s, o, f) sha1_loose_object_info_##r(s, o, f)
+static int sha1_loose_object_info_the_repository(const unsigned char *sha1,
+struct object_info *oi,
+int flags)
 {
int status = 0;
unsigned long mapsize;
@@ -1275,7 +1276,7 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
break;
 
/* Most likely it's a loose object. */
-   if (!sha1_loose_object_info(real, oi, flags))
+   if (!sha1_loose_object_info(the_repository, real, oi, flags))
return 0;
 
/* Not a loose object; someone else may have just packed it. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 25/27] sha1_file: allow map_sha1_file_1 to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
From: Jonathan Nieder 

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 73bbe3a3e88..e32f1da6b69 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -931,10 +931,8 @@ static int open_sha1_file(struct repository *r,
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
  */
-#define map_sha1_file_1(r, p, s, si) map_sha1_file_1_##r(p, s, si)
-static void *map_sha1_file_1_the_repository(const char *path,
-   const unsigned char *sha1,
-   unsigned long *size)
+static void *map_sha1_file_1(struct repository *r, const char *path,
+const unsigned char *sha1, unsigned long *size)
 {
void *map;
int fd;
@@ -942,7 +940,7 @@ static void *map_sha1_file_1_the_repository(const char 
*path,
if (path)
fd = git_open(path);
else
-   fd = open_sha1_file(the_repository, sha1, );
+   fd = open_sha1_file(r, sha1, );
map = NULL;
if (fd >= 0) {
struct stat st;
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 04/27] object-store: free alt_odb_list

2018-02-23 Thread Stefan Beller
Free the memory and reset alt_odb_{list, tail} to NULL.

Signed-off-by: Stefan Beller 
---
 object.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/object.c b/object.c
index a2acdee1405..920dc4463fa 100644
--- a/object.c
+++ b/object.c
@@ -446,7 +446,24 @@ void clear_commit_marks_all(unsigned int flags)
}
 }
 
+static void free_alt_odb(struct alternate_object_database *alt)
+{
+   strbuf_release(>scratch);
+   oid_array_clear(>loose_objects_cache);
+}
+
+static void free_alt_odbs(struct raw_object_store *o)
+{
+   while (o->alt_odb_list) {
+   free_alt_odb(o->alt_odb_list);
+   o->alt_odb_list = o->alt_odb_list->next;
+   }
+}
+
 void raw_object_store_clear(struct raw_object_store *o)
 {
FREE_AND_NULL(o->objectdir);
+
+   free_alt_odbs(o);
+   o->alt_odb_tail = NULL;
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-23 Thread Stefan Beller
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.

Patch generated by

 1. Moving the struct packed_git declaration to object-store.h
and packed_git, packed_git_mru globals to struct object_store.

 2. Apply the following semantic patch to adjust callers:
@@ @@
- packed_git
+ the_repository->objects.packed_git

@@ @@
- packed_git_mru
+ the_repository->objects.packed_git_mru

 3. Applying line wrapping fixes from "make style" to break the
resulting long lines.

 4. Adding missing #includes of repository.h where needed.

 5. As the packfiles are now owned by an objectstore, which is ephemeral
unlike globals, we introduce memory leaks. So address them in
raw_object_store_clear(). Defer freeing packed_git to the next
patch due to the size of this patch.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/count-objects.c  |  5 +++--
 builtin/fsck.c   |  6 --
 builtin/gc.c |  3 ++-
 builtin/pack-objects.c   | 20 +++-
 builtin/pack-redundant.c |  5 +++--
 cache.h  | 29 -
 fast-import.c|  7 +--
 http-backend.c   |  5 +++--
 object-store.h   | 31 ++-
 object.c |  6 ++
 pack-bitmap.c|  3 ++-
 packfile.c   | 39 ---
 repository.c |  4 +++-
 server-info.c|  5 +++--
 sha1_name.c  |  5 +++--
 15 files changed, 98 insertions(+), 75 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 33343818c83..6aab8d08fcc 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -7,6 +7,7 @@
 #include "cache.h"
 #include "config.h"
 #include "dir.h"
+#include "repository.h"
 #include "builtin.h"
 #include "parse-options.h"
 #include "quote.h"
@@ -120,9 +121,9 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
struct strbuf loose_buf = STRBUF_INIT;
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
-   if (!packed_git)
+   if (!the_repository->objects.packed_git)
prepare_packed_git();
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 78edcb75c45..c3a339193c8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -735,7 +735,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
prepare_packed_git();
 
if (show_progress) {
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p;
+p = p->next) {
if (open_pack_index(p))
continue;
total += p->num_objects;
@@ -743,7 +744,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
 
progress = start_progress(_("Checking 
objects"), total);
}
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p;
+p = p->next) {
/* verify gives error messages itself */
if (verify_pack(p, fsck_obj_buffer,
progress, count))
diff --git a/builtin/gc.c b/builtin/gc.c
index 77fa720bd0b..59bf5c2ea0b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -11,6 +11,7 @@
  */
 
 #include "builtin.h"
+#include "repository.h"
 #include "config.h"
 #include "tempfile.h"
 #include "lockfile.h"
@@ -173,7 +174,7 @@ static int too_many_packs(void)
return 0;
 
prepare_packed_git();
-   for (cnt = 0, p = packed_git; p; p = p->next) {
+   for (cnt = 0, p = the_repository->objects.packed_git; p; p = p->next) {
if (!p->pack_local)
continue;
if (p->pack_keep)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843c..a64005760be 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "repository.h"
 #include "config.h"
 #include "attr.h"
 #include "object.h"
@@ -1025,8 +1026,7 @@ 

[PATCHv4 00/27] Moving global state into the repository object (part 1)

2018-02-23 Thread Stefan Beller
v4:
* addressed feedback from the single patches (mostly nits)
* rebased on latest master

v3:
* reverted back to use the repository for most of the functions
  (including unduplicating 'ignore_env')
* rebased on master again (I lost that state when doing v2, as
  I did both rebase as well as conversion to object store in one go)
* one additional patch, that moves the alternates related things to
  object-store.h, such that inclusion of cache.h (in object-store.h)
  is not required any more.

v2:
* duplicated the 'ignore_env' bit into the object store as well
* the #define trick no longer works as we do not have a "the_objectstore" 
global,
  which means there is just one patch per function that is converted.
  As this follows the same structure of the previous series, I am still 
confident
  there is no hidden dependencies to globals outside the object store in these
  converted functions.
* rebased on top of current master, resolving the merge conflicts.
  I think I used the list.h APIs right, but please double check.

Thanks,
Stefan

v1:
This is a real take on the first part of the recent RFC[1].

Jonathan Tan suggested[2] that "sha1_loose_object_info to handle arbitrary 
repositories"
might be a good breaking point for a first part at that RFC at patch 38.
This series is smaller and contains only 26 patches as the patches in the big
RFC were slightly out of order.

I developed this series partly by writing patches, but mostly by cherrypicking
from that RFC on top of current master. I noticed no external conflicts apart
from one addition to the repositories _INIT macro, which was easy to resolve.

Comments in the early range of that RFC were on 003 where Junio pointed out
that the coccinelle patch ought to be not in contrib/coccinelle, so I put it
in a sub directory there, as 'make coccicheck' doesn't traverse subdirs.

brian had a questoin on patch 25 in the RFC, but that seemed to resolve itself
without any suggestion to include into this series[3].

Duy suggested that we shall not use the repository blindly, but should carefully
examine whether to pass on an object store or the refstore or such[4], which
I agree with if it makes sense. This series unfortunately has an issue with that
as I would not want to pass down the `ignore_env` flag separately from the 
object
store, so I made all functions that only take the object store to have the raw
object store as the first parameter, and others using the full repository.

Eric Sunshine brought up memory leaks with the RFC, and I would think to
have plugged all holes.

[1] https://public-inbox.org/git/20180205235508.216277-1-sbel...@google.com/
[2] 
https://public-inbox.org/git/20180207143300.ce1c39ca07f6a0d64fe0e...@google.com/
[3] 
https://public-inbox.org/git/20180206011940.gd7...@genre.crustytoothpaste.net/
[4] 
https://public-inbox.org/git/cacsjy8cggekpx4czkyytsprj87uqvkzsol7fyt__p2dh_1l...@mail.gmail.com/

Thanks,
Stefan

Jonathan Nieder (2):
  sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
  sha1_file: allow sha1_loose_object_info to handle arbitrary
repositories

Stefan Beller (25):
  repository: introduce raw object store field
  object-store: migrate alternates struct and functions from cache.h
  object-store: move alt_odb_list and alt_odb_tail to object store
  object-store: free alt_odb_list
  object-store: move packed_git and packed_git_mru to object store
  object-store: close all packs upon clearing the object store
  pack: move prepare_packed_git_run_once to object store
  pack: move approximate object count to object store
  sha1_file: add raw_object_store argument to alt_odb_usable
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to prepare_alt_odb
  sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
  sha1_file: allow prepare_alt_odb to handle arbitrary repositories
  sha1_file: add repository argument to sha1_file_name
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to map_sha1_file
  sha1_file: add repository argument to sha1_loose_object_info
  sha1_file: allow sha1_file_name to handle arbitrary repositories
  sha1_file: allow stat_sha1_file to handle arbitrary repositories
  sha1_file: allow open_sha1_file to handle arbitrary repositories
  sha1_file: allow map_sha1_file to handle arbitrary repositories

 builtin/am.c |   2 +-
 builtin/clone.c  |   2 +-
 builtin/count-objects.c  |   5 +-
 builtin/fetch.c  |   2 +-
 builtin/fsck.c   |  12 ++--
 builtin/gc.c |   3 +-
 builtin/grep.c   |   2 +-
 builtin/merge.c  |   2 +-
 builtin/pack-objects.c   |  20 ---
 builtin/pack-redundant.c |   5 +-
 

[PATCHv4 02/27] object-store: migrate alternates struct and functions from cache.h

2018-02-23 Thread Stefan Beller
Migrate the struct alternate_object_database and all its related
functions to the object store as these functions are easier found in
that header. The migration is just a verbatim copy, no need to
include the object store header at any C file, because cache.h includes
repository.h which in turn includes the object-store.h

Signed-off-by: Stefan Beller 
---
 cache.h| 51 --
 object-store.h | 51 ++
 2 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/cache.h b/cache.h
index 21fbcc24149..99eb2bdb893 100644
--- a/cache.h
+++ b/cache.h
@@ -1560,57 +1560,6 @@ extern int has_dirs_only_path(const char *name, int len, 
int prefix_len);
 extern void schedule_dir_for_removal(const char *name, int len);
 extern void remove_scheduled_dirs(void);
 
-extern struct alternate_object_database {
-   struct alternate_object_database *next;
-
-   /* see alt_scratch_buf() */
-   struct strbuf scratch;
-   size_t base_len;
-
-   /*
-* Used to store the results of readdir(3) calls when searching
-* for unique abbreviated hashes.  This cache is never
-* invalidated, thus it's racy and not necessarily accurate.
-* That's fine for its purpose; don't use it for tasks requiring
-* greater accuracy!
-*/
-   char loose_objects_subdir_seen[256];
-   struct oid_array loose_objects_cache;
-
-   char path[FLEX_ARRAY];
-} *alt_odb_list;
-extern void prepare_alt_odb(void);
-extern char *compute_alternate_path(const char *path, struct strbuf *err);
-typedef int alt_odb_fn(struct alternate_object_database *, void *);
-extern int foreach_alt_odb(alt_odb_fn, void*);
-
-/*
- * Allocate a "struct alternate_object_database" but do _not_ actually
- * add it to the list of alternates.
- */
-struct alternate_object_database *alloc_alt_odb(const char *dir);
-
-/*
- * Add the directory to the on-disk alternates file; the new entry will also
- * take effect in the current process.
- */
-extern void add_to_alternates_file(const char *dir);
-
-/*
- * Add the directory to the in-memory list of alternates (along with any
- * recursive alternates it points to), but do not modify the on-disk alternates
- * file.
- */
-extern void add_to_alternates_memory(const char *dir);
-
-/*
- * Returns a scratch strbuf pre-filled with the alternate object directory,
- * including a trailing slash, which can be used to access paths in the
- * alternate. Always use this over direct access to alt->scratch, as it
- * cleans up any previous use of the scratch buffer.
- */
-extern struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
-
 struct pack_window {
struct pack_window *next;
unsigned char *base;
diff --git a/object-store.h b/object-store.h
index cf35760ceb6..5678aa1136c 100644
--- a/object-store.h
+++ b/object-store.h
@@ -1,6 +1,57 @@
 #ifndef OBJECT_STORE_H
 #define OBJECT_STORE_H
 
+extern struct alternate_object_database {
+   struct alternate_object_database *next;
+
+   /* see alt_scratch_buf() */
+   struct strbuf scratch;
+   size_t base_len;
+
+   /*
+* Used to store the results of readdir(3) calls when searching
+* for unique abbreviated hashes.  This cache is never
+* invalidated, thus it's racy and not necessarily accurate.
+* That's fine for its purpose; don't use it for tasks requiring
+* greater accuracy!
+*/
+   char loose_objects_subdir_seen[256];
+   struct oid_array loose_objects_cache;
+
+   char path[FLEX_ARRAY];
+} *alt_odb_list;
+void prepare_alt_odb(void);
+char *compute_alternate_path(const char *path, struct strbuf *err);
+typedef int alt_odb_fn(struct alternate_object_database *, void *);
+int foreach_alt_odb(alt_odb_fn, void*);
+
+/*
+ * Allocate a "struct alternate_object_database" but do _not_ actually
+ * add it to the list of alternates.
+ */
+struct alternate_object_database *alloc_alt_odb(const char *dir);
+
+/*
+ * Add the directory to the on-disk alternates file; the new entry will also
+ * take effect in the current process.
+ */
+void add_to_alternates_file(const char *dir);
+
+/*
+ * Add the directory to the in-memory list of alternates (along with any
+ * recursive alternates it points to), but do not modify the on-disk alternates
+ * file.
+ */
+void add_to_alternates_memory(const char *dir);
+
+/*
+ * Returns a scratch strbuf pre-filled with the alternate object directory,
+ * including a trailing slash, which can be used to access paths in the
+ * alternate. Always use this over direct access to alt->scratch, as it
+ * cleans up any previous use of the scratch buffer.
+ */
+struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
+
 struct raw_object_store {
/*
 * Path to the repository's object store.
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv4 01/27] repository: introduce raw object store field

2018-02-23 Thread Stefan Beller
The raw object store field will contain any objects needed for
access to objects in a given repository.

This patch introduces the raw object store and populates it with the
`objectdir`, which used to be part of the repository struct.

As the struct gains members, we'll also populate the function to clear
the memory for these members.

In a later we'll introduce a struct object_parser, that will complement
the object parsing in a repository struct: The raw object parser is the
layer that will provide access to raw object content, while the higher
level object parser code will parse raw objects and keeps track of
parenthood and other object relationships using 'struct object'.
For now only add the lower level to the repository struct.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/grep.c |  2 +-
 environment.c  |  5 +++--
 object-store.h | 15 +++
 object.c   |  5 +
 path.c |  2 +-
 repository.c   | 19 ++-
 repository.h   |  7 ---
 7 files changed, 43 insertions(+), 12 deletions(-)
 create mode 100644 object-store.h

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ca4ac80d8c..0f0c195705f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -432,7 +432,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
 * object.
 */
grep_read_lock();
-   add_to_alternates_memory(submodule.objectdir);
+   add_to_alternates_memory(submodule.objects.objectdir);
grep_read_unlock();
 
if (oid) {
diff --git a/environment.c b/environment.c
index de8431e01e7..ec10b062e68 100644
--- a/environment.c
+++ b/environment.c
@@ -13,6 +13,7 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 #include "commit.h"
+#include "object-store.h"
 
 int trust_executable_bit = 1;
 int trust_ctime = 1;
@@ -244,9 +245,9 @@ const char *get_git_work_tree(void)
 
 char *get_object_directory(void)
 {
-   if (!the_repository->objectdir)
+   if (!the_repository->objects.objectdir)
BUG("git environment hasn't been setup");
-   return the_repository->objectdir;
+   return the_repository->objects.objectdir;
 }
 
 int odb_mkstemp(struct strbuf *template, const char *pattern)
diff --git a/object-store.h b/object-store.h
new file mode 100644
index 000..cf35760ceb6
--- /dev/null
+++ b/object-store.h
@@ -0,0 +1,15 @@
+#ifndef OBJECT_STORE_H
+#define OBJECT_STORE_H
+
+struct raw_object_store {
+   /*
+* Path to the repository's object store.
+* Cannot be NULL after initialization.
+*/
+   char *objectdir;
+};
+#define RAW_OBJECT_STORE_INIT { NULL }
+
+void raw_object_store_clear(struct raw_object_store *o);
+
+#endif /* OBJECT_STORE_H */
diff --git a/object.c b/object.c
index 9e6f9ff20b0..a2acdee1405 100644
--- a/object.c
+++ b/object.c
@@ -445,3 +445,8 @@ void clear_commit_marks_all(unsigned int flags)
obj->flags &= ~flags;
}
 }
+
+void raw_object_store_clear(struct raw_object_store *o)
+{
+   FREE_AND_NULL(o->objectdir);
+}
diff --git a/path.c b/path.c
index da8b655730d..81a42d91155 100644
--- a/path.c
+++ b/path.c
@@ -382,7 +382,7 @@ static void adjust_git_path(const struct repository *repo,
strbuf_splice(buf, 0, buf->len,
  repo->index_file, strlen(repo->index_file));
else if (dir_prefix(base, "objects"))
-   replace_dir(buf, git_dir_len + 7, repo->objectdir);
+   replace_dir(buf, git_dir_len + 7, repo->objects.objectdir);
else if (git_hooks_path && dir_prefix(base, "hooks"))
replace_dir(buf, git_dir_len + 5, git_hooks_path);
else if (repo->different_commondir)
diff --git a/repository.c b/repository.c
index 4ffbe9bc94e..fb65e8072d5 100644
--- a/repository.c
+++ b/repository.c
@@ -1,11 +1,18 @@
 #include "cache.h"
 #include "repository.h"
+#include "object-store.h"
 #include "config.h"
 #include "submodule-config.h"
 
 /* The main repository */
 static struct repository the_repo = {
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 
_algos[GIT_HASH_SHA1], 0, 0
+   NULL, NULL,
+   RAW_OBJECT_STORE_INIT,
+   NULL, NULL, NULL,
+   NULL, NULL, NULL,
+   _index,
+   _algos[GIT_HASH_SHA1],
+   0, 0
 };
 struct repository *the_repository = _repo;
 
@@ -42,9 +49,10 @@ static void repo_setup_env(struct repository *repo)
!repo->ignore_env);
free(repo->commondir);
repo->commondir = strbuf_detach(, NULL);
-   free(repo->objectdir);
-   repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
-   "objects", !repo->ignore_env);
+   raw_object_store_clear(>objects);
+   repo->objects.objectdir =
+   git_path_from_env(DB_ENVIRONMENT, repo->commondir,
+  

Git 2.16.2: Build failure linking static libcurl / static SSL

2018-02-23 Thread Paul Smith
I'm compiling Git with my own static libcurl and my own static
LibreSSL.  They live in two different locations.

Building Curl with a pointer to my LibreSSL works fine, and compiling
Git works fine: the correct -I options are added to the compile line
when I configure with --with-openssl=/path/to/libressl/dist

However, linking fails to find the crypto and ssl libraries, because
the OPENSSLDIR lib directory is not added to the link line with -L. 
The link line in question is from Makefile:

  git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
$(CURL_LIBCURL) $(LIBS)
  git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
$(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS)

The OpenSSL libraries are included in CURL_LIBCURL, but the -L flags
are not:

ifdef NEEDS_SSL_WITH_CURL
CURL_LIBCURL += -lssl
ifdef NEEDS_CRYPTO_WITH_SSL
CURL_LIBCURL += -lcrypto
endif
endif

This section needs to add in the OPENSSL_LINK variable, or maybe it has
to go directly in the git-http-fetch/push recipe, I'm not sure which is
appropriate.  There seems to be a lot of different variables that have
similar content, that maybe should be aligned (OPENSSL_LIBSSL,
OPENSSL_LINK, LIB_4_CRYPTO, CURL_LIBCURL, etc.)

But, the -L is definitely missing:

  gcc -g -O2 -I. -DGIT_HOST_CPU="\"x86_64\"" -DHAVE_ALLOCA_H 
-I/work/src/git/Linux-Release-make/dist/include 
-I/work/src/libressl/Linux-Release-make/dist/include -DNO_GETTEXT -DSHA1_DC 
-DSHA1DC_NO_STANDARD_INCLUDES -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 
-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" 
-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\"" -pthread 
-DHAVE_PATHS_H -DHAVE_STRINGS_H -DHAVE_DEV_TTY -DHAVE_CLOCK_GETTIME 
-DHAVE_CLOCK_MONOTONIC -DHAVE_GETDELIM  -DFREAD_READS_DIRECTORIES -DNO_STRLCPY 
-DSHELL_PATH='"/bin/sh"' -DPAGER_ENV='"LESS=FRX LV=-c"' -o git-http-fetch   
http.o http-walker.o http-fetch.o common-main.o \
-L/work/src/git/Linux-Release-make/dist/lib 
-L/work/src/git/Linux-Release-make/dist/lib -lcurl -lssl -lcrypto -lidn 
libgit.a xdiff/lib.a  -lz -pthread -lrt
  ld: error: cannot find -lssl
  ld: error: cannot find -lcrypto

Note how the -I.../libressl/.../include option is present, but the
-L.../libressl/.../lib option is missing.


Re: Issue in switching branches with different submodules

2018-02-23 Thread Stefan Beller
On Thu, Feb 22, 2018 at 4:10 AM,
 wrote:
> Hello,
> I've found an issue in git when working with submodules.
> My Project is set up using hundreds of components by submodules (and
> nested submodules), that are independent created in central development
> groups.
> Now it occurs that the structure of the submodules is changed over time.
>
> E.g.
> Project1(OldBranch)
>   - ComponentX/SubComp1 -> ssh://Server/ComponentX/SubComp1
>   - ComponentX/SubComp2 -> ssh://Server/ComponentX/SubComp2
>   - ComponentX/SubComp2 -> ssh://Server/ComponentX/SubComp2
>
> Project1(Masster)
>   - ComponentX-> ssh://Server/ComponentX
>
> There is both a repository for the old subcomponents, and for the new
> Component on the server.

ok, so you're saying this is all a client side problem?

> When trying to switch between the branches all git commands are failing.
> It seems like most git commands are not creating the SubComponent
> submodules because the .git file from the Component is not deleted
>
> A 'git submodule sync' fails with:
> fatal: Not a git repository:
> D:/Project1/ComponentX/../.git/modules/ComponentX
>
> Looking into the folders I see:
> D:/Project1/.git/modules/ComponentX/SubComp1
> D:/Project1/.git/modules/ComponentX/SubComp2
> D:/Project1/.git/modules/ComponentX/SubComp3
> D:/Project1/ComponentX/.git (file)

As a quick workaround to repair your current corrupted repo,
you can just delete the .git file, which presumably contains

  gitdir: ../.git/modules/ComponentX

I think this reveals (yet another) problem that we have with
with submodule names. Submodules used to have its git directory
inside its own working tree, but now they are encouraged to have
it in the superproject in .git/modules/.

Considering your submodules
"ComponentX/..."
"ComponentX"
you may have a directory conflict, because there is already a
directory named "ComponentX" so we cannot store the later
submodules git directory.

> A 'git submodule update --init also fails with this folders
> Neither a forced checkout or a hard reset is working.

For this problem you could manually repair the superproject by
giving better names to submodules, such that you have unique
names, such that one is not the prefix directory of another.

> Similar errors can occur when switching between branches with a different
> number of components used as submodules vs. project specific content.
> As a result it happens that people are working with an incosistend state
> of the working directory.
>
> My expectation would be that, when switching between branches/commits with
> a git checkout --recurse-submodules the branche including all submodules
> is checked out fully to exactly the state of the desired branch/commit
> I.e the following usecases are an issue
> - Deleted submodule
> - Added submodule as replacement of local directory
> - Changed remote location of submodule (One component is available from
> different providers having individual repositories)
> - Restructured Component (see example)

I agree with this expectation and we'd need to discuss how to make this happen
as the submodule naming scheme doesn't allow for this to work flawlessly.

> Similar issues are existing when creating the new structure to commit it,
> but here the error is more obvious and people are manually deleting as
> much as required to get the new submodules correctly in.
>
> Best regards

Thanks for this bug report; I needed some time to ingest it and come up
with speculation on what might be the cause. (Not sure if it is the correct
root cause though)

Thanks,
Stefan


Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command

2018-02-23 Thread Brandon Williams
On 02/22, Brandon Williams wrote:
> On 02/22, Jeff King wrote:
> > On Tue, Feb 06, 2018 at 05:12:50PM -0800, Brandon Williams wrote:
> > 
> > > +ls-refs takes in the following parameters wrapped in packet-lines:
> > > +
> > > +symrefs
> > > + In addition to the object pointed by it, show the underlying ref
> > > + pointed by it when showing a symbolic ref.
> > > +peel
> > > + Show peeled tags.
> > > +ref-pattern 
> > > + When specified, only references matching the one of the provided
> > > + patterns are displayed.
> > 
> > How do we match those patterns? That's probably an important thing to
> > include in the spec.
> 
> Yeah I thought about it when I first wrote it and was hoping that
> someone who nudge me in the right direction :)
> 
> > 
> > Looking at the code, I see:
> > 
> > > +/*
> > > + * Check if one of the patterns matches the tail part of the ref.
> > > + * If no patterns were provided, all refs match.
> > > + */
> > > +static int ref_match(const struct argv_array *patterns, const char 
> > > *refname)
> > 
> > This kind of tail matching can't quite implement all of the current
> > behavior. Because we actually do the normal dwim_ref() matching, which
> > includes stuff like "refs/remotes/%s/HEAD".
> > 
> > The other problem with tail-matching is that it's inefficient on the
> > server. Ideally we could get a request for "master" and only look up
> > refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
> > in refs/pull, we wouldn't have to process those at all. Of course this
> > is no worse than the current code, which not only looks at each ref but
> > actually _sends_ it. But it would be nice if we could fix this.
> > 
> > There's some more discussion in this old thread:
> > 
> >   
> > https://public-inbox.org/git/20161024132932.i42rqn2vlpocq...@sigill.intra.peff.net/
> 
> Thanks for the pointer.  I was told to be wary a while about about
> performance implications on the server but no discussion ensued till now
> about it :)
> 
> We always have the ability to extend the patterns accepted via a feature
> (or capability) to ls-refs, so maybe the best thing to do now would only
> support a few patterns with specific semantics.  Something like if you
> say "master" only match against refs/heads/ and refs/tags/ and if you
> want something else you would need to specify "refs/pull/master"?
> 
> That way we could only support globs at the end "master*" where * can
> match anything (including slashes)

After some in-office discussion it seems like the best thing to do for
this (right now since if we change our mind we can just introduce a
capability which extends the patterns supported) would be to left-anchor
the ref-patterns and only allow for a single wildcard character '*'
which matches zero or more characters (and doesn't care about slashes
'/').  This wildcard character should only be supported at the end of
the ref pattern.  This means that if a client wants 'master' then they
would need to specify 'refs/heads/master' (and the other
ref_rev_parse_rules expansions) as a ref pattern. But they could say
"refs/heads/*" for all refs under refs/heads.

> 
> > 
> > > +{
> > > + char *pathbuf;
> > > + int i;
> > > +
> > > + if (!patterns->argc)
> > > + return 1; /* no restriction */
> > > +
> > > + pathbuf = xstrfmt("/%s", refname);
> > > + for (i = 0; i < patterns->argc; i++) {
> > > + if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> > > + free(pathbuf);
> > > + return 1;
> > > + }
> > > + }
> > > + free(pathbuf);
> > > + return 0;
> > > +}
> > 
> > Does the client have to be aware that we're using wildmatch? I think
> > they'd need "refs/heads/**" to actually implement what we usually
> > specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
> > make this work with just "*"?
> > 
> > Do we anticipate that the client would left-anchor the refspec like
> > "/refs/heads/*" so that in theory the server could avoid looking outside
> > of /refs/heads/?
> 
> Yeah we may want to anchor it by providing the leading '/' instead of
> just "refs/".
> 
> > 
> > -Peff
> 
> I need to read over the discussion you linked to more but what sort of
> ref patterns do you believe we should support as part of the initial
> release of v2?  It seems like you wanted this at some point in the past
> so I assume you have an idea of what sort of filtering would be
> beneficial.
> 
> -- 
> Brandon Williams

-- 
Brandon Williams


Re: [GSoC][PATCH v2] ref-filter: Make "--contains " less chatty if is invalid

2018-02-23 Thread Junio C Hamano
Junio C Hamano  writes:

> Paul-Sebastian Ungureanu  writes:
>
>> Basically, the option parser only parses strings that represent
>> commits and the ref-filter performs the commit look-up. If an
>> error occurs during the option parsing, then it must be an invalid
>> argument and the user should be informed of usage, but if a error
>> occurs during ref-filtering, then it is a problem with the
>> argument.

After staring the code a bit longer, I started to dislike the
approach taken by this patch quite a lot.  Isn't the problem that we
let parse-options machinery to show the usage help, which is
designed to be shown when the user does not know what options the
command supports, even when we recognised the option perfectly fine?

That is, if we added a mechanism for a callback function given to
OPT_CALLBACK to tell the calling parse-options machinery "I
recognise the option; but the value given to the option is wrong and
that is why I am returning an error", and made the caller in the
parse-options machinery to refrain from showing the usage help,
would it solve the issue with minimum fuss and stop the execution at
the very first error we detect?

Stepping back even further, I wonder if any error detected in a
custom callback handler given to OPT_CALLBACK even wants to show the
usage help.  I offhand do not think of any situation--- the callback
was called only because OPT_CALLBACK item in the options[] list
matched what the user gave us, so at that point we know the user
gave us one of the valid options.  The error message from the
callback may say "Hey I only take commit object name", or it could
(theoretically) even be "Sorry I do not take any values", but in any
case, I do not think there is a reason for a failure detected in the
callback should lead to the usage help.

So perhaps "if we added a machanism...to tell..." part in the
previous paragraph is not even needed, and the only thing we need to
do is to make the caller in parse-options that calls a custom
callback function given to OPT_CALLBACK to stop giving the usage
help.  Wouldn't that automatically fix the "branch --points-at
garbage" issue that is not addressed by this patch, too?


[PATCH 05/11] t5500-fetch-pack: don't check the stderr of a subshell

2018-02-23 Thread SZEDER Gábor
Three "missing reference" tests in 't5500-fetch-pack.sh' fail when the
test script is run with '-x' tracing (and using a shell other than a
Bash version supporting BASH_XTRACEFD).  The reason for those failures
is that the tests check a subshell's stderr, which includes the trace
of executing commands in that subshell as well, throwing off the
comparison with the expected output.

Save the stderr of 'git fetch-pack' only instead of the whole
subshell, so it remains free from tracing output.

After this change t5500 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t5500-fetch-pack.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index ec9ba9bf6e..0680dec808 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -482,24 +482,24 @@ test_expect_success 'set up tests of missing reference' '
 test_expect_success 'test lonely missing ref' '
(
cd client &&
-   test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy
-   ) >/dev/null 2>error-m &&
+   test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy 
2>../error-m
+   ) &&
test_i18ncmp expect-error error-m
 '
 
 test_expect_success 'test missing ref after existing' '
(
cd client &&
-   test_must_fail git fetch-pack --no-progress .. refs/heads/A 
refs/heads/xyzzy
-   ) >/dev/null 2>error-em &&
+   test_must_fail git fetch-pack --no-progress .. refs/heads/A 
refs/heads/xyzzy 2>../error-em
+   ) &&
test_i18ncmp expect-error error-em
 '
 
 test_expect_success 'test missing ref before existing' '
(
cd client &&
-   test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy 
refs/heads/A
-   ) >/dev/null 2>error-me &&
+   test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy 
refs/heads/A 2>../error-me
+   ) &&
test_i18ncmp expect-error error-me
 '
 
-- 
2.16.2.400.g911b7cc0da



[PATCH 07/11] t5570-git-daemon: don't check the stderr of a subshell

2018-02-23 Thread SZEDER Gábor
The test 'no-op fetch without "-v" is quiet' in 't5570-git-daemon.sh'
fails when the test script is run with '-x' tracing (and using a shell
other than a Bash version supporting BASH_XTRACEFD).  The reason for
the failure is that the test checks the emptiness of a subshell's
stderr, which includes the trace of commands executed in that subshell
as well, throwing off the emptiness check.

Save the stderr of 'git fetch' only instead of the whole subshell's, so
it remains free from tracing output.

After this change t5570 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t5570-git-daemon.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
index 755b05a8ae..0d4c52016b 100755
--- a/t/t5570-git-daemon.sh
+++ b/t/t5570-git-daemon.sh
@@ -50,7 +50,7 @@ test_expect_success 'no-op fetch -v stderr is as expected' '
 '
 
 test_expect_success 'no-op fetch without "-v" is quiet' '
-   (cd clone && git fetch) 2>stderr &&
+   (cd clone && git fetch 2>../stderr) &&
! test -s stderr
 '
 
-- 
2.16.2.400.g911b7cc0da



[PATCH 09/11] t1510-repo-setup: mark as untraceable with '-x'

2018-02-23 Thread SZEDER Gábor
't1510-repo-setup.sh' checks the stderr of nested function calls way
too many times, resulting in several failures when using '-x' tracing,
unless it's executed with a Bash version supporting BASH_XTRACEFD.

Maybe someday we will clear up this test script, but until then mark
it as 'test_untraceable'.

After this change

  make GIT_TEST_OPTS='-x --verbose-log' test

finally fully passes without setting TEST_SHELL_PATH to Bash.

Signed-off-by: SZEDER Gábor 
---
 t/t1510-repo-setup.sh | 4 
 1 file changed, 4 insertions(+)

diff --git a/t/t1510-repo-setup.sh b/t/t1510-repo-setup.sh
index 13ae12dfa7..e6854b828e 100755
--- a/t/t1510-repo-setup.sh
+++ b/t/t1510-repo-setup.sh
@@ -39,6 +39,10 @@ A few rules for repo setup:
 11. When user's cwd is outside worktree, cwd remains unchanged,
 prefix is NULL.
 "
+
+# This test heavily relies on the standard error of nested function calls.
+test_untraceable=UnfortunatelyYes
+
 . ./test-lib.sh
 
 here=$(pwd)
-- 
2.16.2.400.g911b7cc0da



[PATCH 04/11] t3030-merge-recursive: don't check the stderr of a subshell

2018-02-23 Thread SZEDER Gábor
The two test checking 'git mmerge-recursive' in an empty worktree in
't3030-merge-recursive.sh' fail when the test script is run with '-x'
tracing (and using a shell other than a Bash version supporting
BASH_XTRACEFD).  The reason for those failures is that the tests check
the emptiness of a subshell's stderr, which includes the trace of
commands executed in that subshell as well, throwing off the emptiness
check.

Note that both subshells execute four git commands each, meaning that
checking the emptiness of the whole subshell implicitly ensures that
not only 'git merge-recursive' but none of the other three commands
outputs anything to their stderr.  Note also that if one of those
commands were to output anything on its stderr, then the current
combined check would not tell us which one of those four commands the
unexpected output came from.

Save the stderr of those four commands only instead of the whole
subshell, so it remains free from tracing output, and save and check
them individually, so they will show us from which command the
unexpected output came from.

After this change t3030 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t3030-merge-recursive.sh | 36 
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
index cdc38fe5d1..cbeea1cf94 100755
--- a/t/t3030-merge-recursive.sh
+++ b/t/t3030-merge-recursive.sh
@@ -525,20 +525,22 @@ test_expect_success 'merge-recursive w/ empty work tree - 
ours has rename' '
GIT_INDEX_FILE="$PWD/ours-has-rename-index" &&
export GIT_INDEX_FILE &&
mkdir "$GIT_WORK_TREE" &&
-   git read-tree -i -m $c7 &&
-   git update-index --ignore-missing --refresh &&
-   git merge-recursive $c0 -- $c7 $c3 &&
-   git ls-files -s >actual-files
-   ) 2>actual-err &&
-   >expected-err &&
+   git read-tree -i -m $c7 2>actual-err &&
+   test_must_be_empty expected-err &&
+   git update-index --ignore-missing --refresh 2>actual-err &&
+   test_must_be_empty expected-err &&
+   git merge-recursive $c0 -- $c7 $c3 2>actual-err &&
+   test_must_be_empty expected-err &&
+   git ls-files -s >actual-files 2>actual-err &&
+   test_must_be_empty expected-err
+   ) &&
cat >expected-files <<-EOF &&
100644 $o3 0b/c
100644 $o0 0c
100644 $o0 0d/e
100644 $o0 0e
EOF
-   test_cmp expected-files actual-files &&
-   test_cmp expected-err actual-err
+   test_cmp expected-files actual-files
 '
 
 test_expect_success 'merge-recursive w/ empty work tree - theirs has rename' '
@@ -548,20 +550,22 @@ test_expect_success 'merge-recursive w/ empty work tree - 
theirs has rename' '
GIT_INDEX_FILE="$PWD/theirs-has-rename-index" &&
export GIT_INDEX_FILE &&
mkdir "$GIT_WORK_TREE" &&
-   git read-tree -i -m $c3 &&
-   git update-index --ignore-missing --refresh &&
-   git merge-recursive $c0 -- $c3 $c7 &&
-   git ls-files -s >actual-files
-   ) 2>actual-err &&
-   >expected-err &&
+   git read-tree -i -m $c3 2>actual-err &&
+   test_must_be_empty expected-err &&
+   git update-index --ignore-missing --refresh 2>>actual-err &&
+   test_must_be_empty expected-err &&
+   git merge-recursive $c0 -- $c3 $c7 2>>actual-err &&
+   test_must_be_empty expected-err &&
+   git ls-files -s >actual-files 2>>actual-err &&
+   test_must_be_empty expected-err
+   ) &&
cat >expected-files <<-EOF &&
100644 $o3 0b/c
100644 $o0 0c
100644 $o0 0d/e
100644 $o0 0e
EOF
-   test_cmp expected-files actual-files &&
-   test_cmp expected-err actual-err
+   test_cmp expected-files actual-files
 '
 
 test_expect_success 'merge removes empty directories' '
-- 
2.16.2.400.g911b7cc0da



[PATCH 08/11] t9903-bash-prompt: don't check the stderr of __git_ps1()

2018-02-23 Thread SZEDER Gábor
A test in 't9903-bash-prompt.sh' fails when the test script is run
with '-x' tracing and a Bash version not yet supporting BASH_XTRACEFD,
notably the default Bash version shipped in OSX.  The reason for the
failure is that the test checks the emptiness of __git_ps1()'s stderr,
which includes the trace of all commands executed within __git_ps1()
as well, throwing off the emptiness check.

Having only a single test checking the empty stderr doesn't bring us
much when none of the other tests do so, so remove this test for now.

After this change t9903 passes with '-x', even when running with a
Bash version not yet supporing BASH_XTRACEFD.

In the future we might want to consider checking the emptiness of
__git_ps1()'s stderr in each and every test, in which case we'd have
to mark this test script as 'test_untraceable', but that's a different
topic.

Signed-off-by: SZEDER Gábor 
---
 t/t9903-bash-prompt.sh | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
index 97c9b32c2e..8f5c811dd7 100755
--- a/t/t9903-bash-prompt.sh
+++ b/t/t9903-bash-prompt.sh
@@ -735,22 +735,12 @@ test_expect_success 'prompt - hide if pwd ignored - env 
var set, config unset, p
test_cmp expected "$actual"
 '
 
-test_expect_success 'prompt - hide if pwd ignored - inside gitdir (stdout)' '
+test_expect_success 'prompt - hide if pwd ignored - inside gitdir' '
printf " (GIT_DIR!)" >expected &&
(
GIT_PS1_HIDE_IF_PWD_IGNORED=y &&
cd .git &&
-   __git_ps1 >"$actual" 2>/dev/null
-   ) &&
-   test_cmp expected "$actual"
-'
-
-test_expect_success 'prompt - hide if pwd ignored - inside gitdir (stderr)' '
-   printf "" >expected &&
-   (
-   GIT_PS1_HIDE_IF_PWD_IGNORED=y &&
-   cd .git &&
-   __git_ps1 >/dev/null 2>"$actual"
+   __git_ps1 >"$actual"
) &&
test_cmp expected "$actual"
 '
-- 
2.16.2.400.g911b7cc0da



[PATCH 06/11] t5526: use $TRASH_DIRECTORY to specify the path of GIT_TRACE log file

2018-02-23 Thread SZEDER Gábor
The test 'fetch --recurse-submodules -j2 has the same output
behaviour' in 't5526-fetch-submodules.sh' fails when the test script
is run with '-x' tracing (and using a shell other than a Bash version
supporting BASH_XTRACEFD).  The reason of that failure is the
following command:

  GIT_TRACE=$(pwd)/../trace.out git fetch <...> 2>../actual.err

because the trace of executing 'pwd' in the command substitution ends
up in 'actual.err' as well, throwing off the subsequent
'test_i18ncmp'.

Use $TRASH_DIRECTORY to specify the path of the GIT_TRACE log file
instead of $(pwd), so the command's stderr remains free from tracing
output.

After this change t5526 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t5526-fetch-submodules.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a552ad4ead..ce44d8aa46 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -85,7 +85,7 @@ test_expect_success "fetch --recurse-submodules -j2 has the 
same output behaviou
add_upstream_commit &&
(
cd downstream &&
-   GIT_TRACE=$(pwd)/../trace.out git fetch --recurse-submodules 
-j2 2>../actual.err
+   GIT_TRACE="$TRASH_DIRECTORY/trace.out" git fetch 
--recurse-submodules -j2 2>../actual.err
) &&
test_must_be_empty actual.out &&
test_i18ncmp expect.err actual.err &&
-- 
2.16.2.400.g911b7cc0da



[PATCH 10/11] t/README: add a note about don't saving stderr of compound commands

2018-02-23 Thread SZEDER Gábor
Explain in 't/README' why it is a bad idea to redirect and verify the
stderr of compound commands, in the hope that future contributions
will follow this advice and the test suite will keep working with '-x'
tracing and /bin/sh.

While at it, since we can now run the test suite with '-x' without
needing a Bash version supporting BASH_XTRACEFD, remove the now
outdated caution note about non-Bash shells from the description of
the '-x' option.

Signed-off-by: SZEDER Gábor 
---
 t/README | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/t/README b/t/README
index c430e9c52c..615682263e 100644
--- a/t/README
+++ b/t/README
@@ -84,9 +84,7 @@ appropriately before running "make".
 
 -x::
Turn on shell tracing (i.e., `set -x`) during the tests
-   themselves. Implies `--verbose`. Note that in non-bash shells,
-   this can cause failures in some tests which redirect and test
-   the output of shell functions. Use with caution.
+   themselves. Implies `--verbose`.
Ignored in test scripts that set the variable 'test_untraceable'
to a non-empty value, unless it's run with a Bash version
supporting BASH_XTRACEFD, i.e. v4.1 or later.
@@ -455,6 +453,22 @@ Don't:
causing the next test to start in an unexpected directory.  Do so
inside a subshell if necessary.
 
+ - save and verify the standard error of compound commands, i.e. group
+   commands, subshells, and shell functions (except test helper
+   functions like 'test_must_fail') like this:
+
+ ( cd dir && git cmd ) 2>error &&
+ test_cmp expect error
+
+   When running the test with '-x' tracing, then the trace of commands
+   executed in the compound command will be included in standard error
+   as well, quite possibly throwing off the subsequent checks examining
+   the output.  Instead, save only the relevant git command's standard
+   error:
+
+ ( cd dir && git cmd 2>../error ) &&
+ test_cmp expect error
+
  - Break the TAP output
 
The raw output from your test may be interpreted by a TAP harness. TAP
-- 
2.16.2.400.g911b7cc0da



[PATCH 03/11] t1507-rev-parse-upstream: don't check the stderr of a shell function

2018-02-23 Thread SZEDER Gábor
Three tests in 't1507-rev-parse-upstream.sh' fail when the test script
is run with '-x' tracing (and using a shell other than a Bash version
supporting BASH_XTRACEFD).  The reason for those failures is that the
tests check the stderr of the function 'error_message', which includes
the trace of commands executed in that function as well, throwing off
the comparison with the expected output.

Save stderr of 'git rev-parse' only instead of the whole function, so
it remains free from tracing output.

After this change t1507 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t1507-rev-parse-upstream.sh | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/t/t1507-rev-parse-upstream.sh b/t/t1507-rev-parse-upstream.sh
index b23c4e3fab..2ce68cc277 100755
--- a/t/t1507-rev-parse-upstream.sh
+++ b/t/t1507-rev-parse-upstream.sh
@@ -42,7 +42,7 @@ commit_subject () {
 
 error_message () {
(cd clone &&
-test_must_fail git rev-parse --verify "$@")
+test_must_fail git rev-parse --verify "$@" 2>../error)
 }
 
 test_expect_success '@{upstream} resolves to correct full name' '
@@ -159,8 +159,8 @@ test_expect_success 'branch@{u} error message when no 
upstream' '
cat >expect <<-EOF &&
fatal: no upstream configured for branch ${sq}non-tracking${sq}
EOF
-   error_message non-tracking@{u} 2>actual &&
-   test_i18ncmp expect actual
+   error_message non-tracking@{u} &&
+   test_i18ncmp expect error
 '
 
 test_expect_success '@{u} error message when no upstream' '
@@ -175,8 +175,8 @@ test_expect_success 'branch@{u} error message with misspelt 
branch' '
cat >expect <<-EOF &&
fatal: no such branch: ${sq}no-such-branch${sq}
EOF
-   error_message no-such-branch@{u} 2>actual &&
-   test_i18ncmp expect actual
+   error_message no-such-branch@{u} &&
+   test_i18ncmp expect error
 '
 
 test_expect_success '@{u} error message when not on a branch' '
@@ -192,8 +192,8 @@ test_expect_success 'branch@{u} error message if upstream 
branch not fetched' '
cat >expect <<-EOF &&
fatal: upstream branch ${sq}refs/heads/side${sq} not stored as a 
remote-tracking branch
EOF
-   error_message bad-upstream@{u} 2>actual &&
-   test_i18ncmp expect actual
+   error_message bad-upstream@{u} &&
+   test_i18ncmp expect error
 '
 
 test_expect_success 'pull works when tracking a local branch' '
-- 
2.16.2.400.g911b7cc0da



[PATCH 11/11] travis-ci: run tests with '-x' tracing

2018-02-23 Thread SZEDER Gábor
Now that the test suite runs successfully with '-x' tracing even with
/bin/sh, enable it on Travis CI in order to

  - get more information about test failures, and

  - catch constructs breaking '-x' with /bin/sh sneaking into our test
suite.

Signed-off-by: SZEDER Gábor 
---
 ci/lib-travisci.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 1efee55ef7..109ef280da 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -97,7 +97,7 @@ fi
 export DEVELOPER=1
 export DEFAULT_TEST_TARGET=prove
 export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
-export GIT_TEST_OPTS="--verbose-log"
+export GIT_TEST_OPTS="--verbose-log -x"
 export GIT_TEST_CLONE_2GB=YesPlease
 
 case "$jobname" in
-- 
2.16.2.400.g911b7cc0da



[PATCH 01/11] t: prevent '-x' tracing from interfering with test helpers' stderr

2018-02-23 Thread SZEDER Gábor
Running a test script with '-x' turns on 'set -x' tracing, the output
of which is normally sent to stderr.  This causes a lot of
test failures, because many tests redirect and verify the stderr
of shell functions, most frequently that of 'test_must_fail'.
These issues were worked around somewhat in d88785e424 (test-lib: set
BASH_XTRACEFD automatically, 2016-05-11), so at least we could
reliably run tests with '-x' tracing under a Bash version supporting
BASH_XTRACEFD, i.e. v4.1 and later.

This patch makes it safe to redirect and verify the stderr of those
test helper functions which are meant to run the tested command given
as argument, even when running tests with '-x' and /bin/sh.  This is
achieved through a couple of file descriptor redirections:

  - Duplicate stderr of the tested command executed in the test helper
function from the function's fd 7 (see next point), to ensure that
the tested command's error messages go to a different fd than the
'-x' trace of the commands executed in the function.

  - Duplicate the test helper function's fd 7 from the function's
original stderr, meaning that, after taking a detour through fd 7,
the error messages of the tested command do end up on the
function's original stderr.

  - Duplicate stderr of the test helper function from fd 4, i.e. the
fd connected to the test script's original stderr and the fd used
for BASH_XTRACEFD.  This ensures that the '-x' trace of the
commands executed in the function

  - doesn't go to the function's original stderr, so it won't mess
with callers who want to save and verify the tested command's
stderr.

  - does go to the same fd independently from the shell running
the test script, be it /bin/sh, an older Bash without
BASH_XTRACEFD, or a more recent Bash already supporting
BASH_XTRACEFD.

  - Specify the latter two redirections above in the test helper
function's definition, so they are performed every time the
function is invoked, without the need to modify the callsites of
the function.

Perform these redirections in those test helper functions which can be
expected to have their stderr redirected, i.e. in the functions
'test_must_fail', 'test_might_fail', 'test_expect_code', 'test_env',
'nongit', 'test_terminal' and 'perl'.  Note that 'test_might_fail',
'test_env', and 'nongit' are not involved in any test failures when
running tests with '-x' and /bin/sh.

The other test helper functions are left unchanged, because they
either don't run commands specified as their arguments, or redirecting
their stderr wouldn't make sense, or both.

With this change the number of failures when running the test suite
with '-x' tracing and /bin/sh goes down from 340 failed tests in 43
test scripts to 22 failed tests in 6 scripts (or 23 in 7, if the
system (OSX) uses an older Bash version without BASH_XTRACEFD to run
't9903-bash-prompt.sh').

Signed-off-by: SZEDER Gábor 
---
 t/lib-terminal.sh   |  4 ++--
 t/test-lib-functions.sh | 24 
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/t/lib-terminal.sh b/t/lib-terminal.sh
index cd220e378e..b3acb4c6f8 100644
--- a/t/lib-terminal.sh
+++ b/t/lib-terminal.sh
@@ -9,8 +9,8 @@ test_terminal () {
echo >&4 "test_terminal: need to declare TTY prerequisite"
return 127
fi
-   perl "$TEST_DIRECTORY"/test-terminal.perl "$@"
-}
+   perl "$TEST_DIRECTORY"/test-terminal.perl "$@" 2>&9
+} 9>&2 2>&4
 
 test_lazy_prereq TTY '
test_have_prereq PERL &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 67b5994afb..37eb34044a 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -621,7 +621,7 @@ test_must_fail () {
_test_ok=
;;
esac
-   "$@"
+   "$@" 2>&7
exit_code=$?
if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
then
@@ -644,7 +644,7 @@ test_must_fail () {
return 1
fi
return 0
-}
+} 7>&2 2>&4
 
 # Similar to test_must_fail, but tolerates success, too.  This is
 # meant to be used in contexts like:
@@ -658,8 +658,8 @@ test_must_fail () {
 # because we want to notice if it fails due to segv.
 
 test_might_fail () {
-   test_must_fail ok=success "$@"
-}
+   test_must_fail ok=success "$@" 2>&7
+} 7>&2 2>&4
 
 # Similar to test_must_fail and test_might_fail, but check that a
 # given command exited with a given exit code. Meant to be used as:
@@ -671,7 +671,7 @@ test_might_fail () {
 test_expect_code () {
want_code=$1
shift
-   "$@"
+   "$@" 2>&7
exit_code=$?
if test $exit_code = $want_code
then
@@ -680,7 +680,7 @@ test_expect_code () {
 
echo >&2 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
return 1
-}
+} 7>&2 2>&4
 
 # test_cmp is a helper function 

[PATCH 02/11] t: add means to disable '-x' tracing for individual test scripts

2018-02-23 Thread SZEDER Gábor
The previous patch resolved most of the test failures caused by
running our test suite with '-x' tracing and /bin/sh, and the
following patches in this series will resolve almost all of the
remaining failures.  Unfortunately, not yet all.

Add means to disable '-x' tracing for individual test scripts by
setting the $test_untraceable variable to a non-empty value in the
test script before sourcing 'test-lib.sh'.  However, since '-x'
tracing is not an issue with recent Bash versions supporting
BASH_XTRACEFD, i.e. v4.1 and later, don't disable tracing when the
test script is run with such a Bash version even when
$test_untraceable is set.

Signed-off-by: SZEDER Gábor 
---
 t/README  |  3 +++
 t/test-lib.sh | 19 ++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/t/README b/t/README
index b3f7b449c3..c430e9c52c 100644
--- a/t/README
+++ b/t/README
@@ -87,6 +87,9 @@ appropriately before running "make".
themselves. Implies `--verbose`. Note that in non-bash shells,
this can cause failures in some tests which redirect and test
the output of shell functions. Use with caution.
+   Ignored in test scripts that set the variable 'test_untraceable'
+   to a non-empty value, unless it's run with a Bash version
+   supporting BASH_XTRACEFD, i.e. v4.1 or later.
 
 -d::
 --debug::
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 33f6ce26f6..732213ef1b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -263,7 +263,24 @@ do
GIT_TEST_CHAIN_LINT=0
shift ;;
-x)
-   trace=t
+   # Some test scripts can't be reliably traced  with '-x',
+   # unless the test is run with a Bash version supporting
+   # BASH_XTRACEFD (introduced in Bash v4.1).  Check whether
+   # this test is marked as such, and ignore '-x' if it
+   # isn't executed with a suitable Bash version.
+   if test -z "$test_untraceable" || {
+test -n "$BASH_VERSION" && {
+  test ${BASH_VERSINFO[0]} -gt 4 || {
+test ${BASH_VERSINFO[0]} -eq 4 &&
+test ${BASH_VERSINFO[1]} -ge 1
+  }
+}
+  }
+   then
+   trace=t
+   else
+   echo >&2 "warning: ignoring -x; '$0' is untraceable 
without BASH_XTRACEFD"
+   fi
shift ;;
--verbose-log)
verbose_log=t
-- 
2.16.2.400.g911b7cc0da



[PATCH 00/11] Make the test suite pass with '-x' and /bin/sh

2018-02-23 Thread SZEDER Gábor
This patch series makes '-x' tracing of tests work reliably even when
running them with /bin/sh, and setting TEST_SHELL_PATH=bash will be
unnecessary.

  make GIT_TEST_OPTS='-x --verbose-log' test

passes on my setup and on all Travis CI build jobs (though neither me
nor Travis CI run all tests, e.g. CVS).


The first patch is the most important: with a couple of well-placed file
descriptor redirections it ensures that the stderr of the test helper
functions running git commands only contain the stderr of the tested
command, thereby resolving over 90% of the failures resulting from
running the test suite with '-x' and /bin/sh.

Most of the following patches resolve the remaining failures, one test
script at a time, in most cases by limiting the scope of stderr
redirections from functions and subshells to the tested git commands.
Except the second and ninth patches, which, arguably, could be
considered as cheating...  I admit, my enthusiasm suddenly run out when
I saw t1510 :)

The last two patches are just finishing touches with a bit of
documentation updates and enabling '-x' tracing in Travis CI build jobs.


There is currently nothing in 'pu' that would require additional fixes
to make this patch series work.


SZEDER Gábor (11):
  t: prevent '-x' tracing from interfering with test helpers' stderr
  t: add means to disable '-x' tracing for individual test scripts
  t1507-rev-parse-upstream: don't check the stderr of a shell function
  t3030-merge-recursive: don't check the stderr of a subshell
  t5500-fetch-pack: don't check the stderr of a subshell
  t5526: use $TRASH_DIRECTORY to specify the path of GIT_TRACE log file
  t5570-git-daemon: don't check the stderr of a subshell
  t9903-bash-prompt: don't check the stderr of __git_ps1()
  t1510-repo-setup: mark as untraceable with '-x'
  t/README: add a note about don't saving stderr of compound commands
  travis-ci: run tests with '-x' tracing

 ci/lib-travisci.sh|  2 +-
 t/README  | 23 +++---
 t/lib-terminal.sh |  4 ++--
 t/t1507-rev-parse-upstream.sh | 14 +++---
 t/t1510-repo-setup.sh |  4 
 t/t3030-merge-recursive.sh| 36 +++
 t/t5500-fetch-pack.sh | 12 ++--
 t/t5526-fetch-submodules.sh   |  2 +-
 t/t5570-git-daemon.sh |  2 +-
 t/t9903-bash-prompt.sh| 14 ++
 t/test-lib-functions.sh   | 24 +++
 t/test-lib.sh | 19 +-
 12 files changed, 94 insertions(+), 62 deletions(-)

-- 
2.16.2.400.g911b7cc0da



Re: [PATCH 2/3] t: teach 'test_must_fail' to save the command's stderr to a file

2018-02-23 Thread SZEDER Gábor
On Fri, Feb 9, 2018 at 3:21 PM, Jeff King  wrote:
> On Fri, Feb 09, 2018 at 03:42:34AM +0100, SZEDER Gábor wrote:
>
>> To check that a git command fails and to inspect its error message we
>> usually execute a command like this throughout our test suite:
>>
>>   test_must_fail git command --option 2>output.err
>>
>> Note that this command doesn't limit the redirection to the git
>> command, but it redirects the standard error of the 'test_must_fail'
>> helper function as well.  This is wrong for several reasons:
>>
>>   - If that git command were to succeed or die for an unexpected
>> reason e.g. signal, then 'test_must_fail's helpful error message
>> would end up in the given file instead of on the terminal or in
>> 't/test-results/$T.out', when the test is run with '-v' or
>> '--verbose-log', respectively.
>>
>>   - If the test is run with '-x', the trace of executed commands in
>> 'test_must_fail' will go to stderr as well (except for more recent
>> Bash versions supporting $BASH_XTRACEFD), and then in turn will
>> end up in the given file.
>
> I have to admit that I'm slightly negative on this approach

Well, now I'm not just slightly negative, see the patch series I will
send out in a minute :)

> just
> because:
>
>   1. It requires every caller to do something special, rather than just
>  using normal redirection. And the feature isn't as powerful as
>  redirection. E.g., some callers do:
>
>test_must_fail foo >output 2>&1

Yeah, they do, but most of the time they shouldn't.

I've looked at uses of 'test_must_fail cmd... 2>&1', and also uses of
other test helper functions with stderr duplicated from stdout (and a
couple of cases of "plain" 'git cmd ... 2>&1' as well; that's where the
t6300-for-each-ref fix came from some days ago).

I think all but one of those cases would benefit from handling stdout
and stderr separately, and that one exception shouldn't use
'test_must_fail' for a platform command in the first place[1].

  - Many of these tests check error messages after they mingled stderr
and stdout like this:

  test_must_fail git cmd >out 2>&1 &&
  (test_i18n)grep "relevant part of the error msg" out

In these cases there is no need for 2>&1 as they don't care about
stdout at all.  They could just grep the failed command's stderr,
with the added benefit that they would ensure that the error message
indeed goes to stderr.

  - A couple of these tests check the output more strictly:

  test_must_fail git cmd >actual 2>&1 &&
  echo "error message" >expect &&
  test_(i18n)cmp expect actual

which also checks that nothing was printed on stdout, but does so
implicitly.  Considering all the sloppy uses of 2>&1, it's hard to
tell whether it was deliberate to check the empty stdout, or
accidental, because the writer of the test used 'test_cmp' instead
of 'grep' for its verbosity on failure, or simply mixed up the two.
Therefore, I think it's better when it's written more explicitly:

  test_must_fail git cmd >out 2>err &&
  echo "exact error message" >expect &&
  test_cmp expect err &&
  test_must_be_empty out

... both when reading such a test and when looking at it's output on
failure.

  - Then there are some cases that, for some reason, completely silence
a git command with:

  test_must_fail git cmd >/dev/null 2>&1

The output of a command could turn out to be useful when debugging a
failing tests, so tests shouldn't do this, unless there is a very
good reason (e.g. the command produces a lot of output).


  - Finally, a few tests check that something is surely missing from a
git command's output, both from stdout and stderr:

  test_must_fail git cmd >out 2>&1 &&
  ! grep "should not be there" out

Or that a command is completely silent:

  test_must_fail git cmd --quiet >out 2>&1 &&
  test_must_be_empty out

Checking the emptiness of stdout and stderr separately would give
more info on failure, as it would tell us right away where the
unexpected output is coming from (though arguably the presence of
words like "error" or "fatal" in the output of 'test_must_be_empty'
would be a dead giveaway).
BTW, we already have a couple of careful tests that do:

  git cmd >out 2>err &&
  test_must_be_empty out &&
  test_must_be_empty err

(Which makes me think that extending 'test_must_be_empty' to check
the emptiness of multiple files at once would be worth it: fewer
lines, fewer characters to type, and it could report all non-empty
files, not just the first.)

>  But:
>
>test_must_fail stderr=output foo >output
>
>  is not quite right (stdout and stderr outputs may clobber each
>  other, because they have independent position pointers).

Relying on the combination of the unbuffered stderr and the
block-buffered stdout could be fragile to begin 

Re: [PATCH 15/27] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
On Wed, Feb 21, 2018 at 4:35 PM, Jonathan Tan  wrote:
> On Tue, 20 Feb 2018 17:54:18 -0800
> Stefan Beller  wrote:
>
>> -void prepare_alt_odb_the_repository(void)
>> +void prepare_alt_odb(struct repository *r)
>>  {
>> - const char *alt;
>> -
>> - if (the_repository->objects.alt_odb_tail)
>> + if (r->objects.alt_odb_tail)
>>   return;
>>
>> - alt = getenv(ALTERNATE_DB_ENVIRONMENT);
>> + r->objects.alt_odb_tail = >objects.alt_odb_list;
>> +
>> + if (!r->ignore_env) {
>> + const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);
>> + if (!alt)
>> + alt = "";
>
> alt can be NULL, just like in the existing code, so these 2 lines are
> unnecessary. (link_alt_odb_entries() will just do nothing in either
> case.)
>
> (I also think that the check of absence of alt should be done by the
> caller of link_alt_odb_entries(), not by link_alt_odb_entries() itself,
> but that is much beyond the scope of this patch set.)
>

reverted to a literal translation of s/the_repository/r/, I forget
why I implemented this change in the first place. dropped for now.

Thanks,
Stefan


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Brandon Williams
On 02/23, brian m. carlson wrote:
> On Fri, Feb 23, 2018 at 04:56:40PM +0700, Nguyễn Thái Ngọc Duy wrote:
> > diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> > index 7e3e1a461c..8ee935504e 100644
> > --- a/builtin/index-pack.c
> > +++ b/builtin/index-pack.c
> > @@ -1673,6 +1673,11 @@ int cmd_index_pack(int argc, const char **argv, 
> > const char *prefix)
> > if (prefix && chdir(prefix))
> > die(_("Cannot come back to cwd"));
> >  
> > +   if (!the_hash_algo) {
> > +   warning(_("Running without a repository, assuming SHA-1 hash"));
> > +   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> > +   }
> 
> Is this warning going to be visible to users in the normal course of
> operation?  If so, people are probably going to find this bothersome or
> alarming.
> 
> > for (i = 1; i < argc; i++) {
> > const char *arg = argv[i];
> >  
> > diff --git a/common-main.c b/common-main.c
> > index 6a689007e7..12aec36794 100644
> > --- a/common-main.c
> > +++ b/common-main.c
> > @@ -1,6 +1,7 @@
> >  #include "cache.h"
> >  #include "exec_cmd.h"
> >  #include "attr.h"
> > +#include "repository.h"
> >  
> >  /*
> >   * Many parts of Git have subprograms communicate via pipe, expect the
> > @@ -40,5 +41,8 @@ int main(int argc, const char **argv)
> >  
> > restore_sigpipe_to_default();
> >  
> > +   if (getenv("GIT_HASH_FIXUP"))
> > +   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> 
> I'm lukewarm on adding this environment variable, but considering our
> history here, we had probably better.  We can always remove it after a
> few releases.

Yeah I don't know what the best thing to do here would be.  I mean we
could always just init the hash_algo for the_repository here so that we
don't ever have to worry about it, but I don't know if even that is the
right approach.

> 
> > return cmd_main(argc, argv);
> >  }
> > diff --git a/diff-no-index.c b/diff-no-index.c
> > index 0ed5f0f496..f038f665bc 100644
> > --- a/diff-no-index.c
> > +++ b/diff-no-index.c
> > @@ -241,6 +241,11 @@ void diff_no_index(struct rev_info *revs,
> > struct strbuf replacement = STRBUF_INIT;
> > const char *prefix = revs->prefix;
> >  
> > +   if (!the_hash_algo) {
> > +   warning(_("Running without a repository, assuming SHA-1 hash"));
> > +   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> > +   }
> 
> Again, same concern.  I can imagine scripts that will blow up loudly if
> git diff --no-index spews things to standard error.
> 
> I'm not opposed to making this more visible, but I wonder if maybe it
> should only be visible to developers or such.  The only way I can think
> of doing is that is with an advice options, but maybe there's a better
> way.
> -- 
> brian m. carlson / brian with sandals: Houston, Texas, US
> https://www.crustytoothpaste.net/~bmc | My opinion only
> OpenPGP: https://keybase.io/bk2204



-- 
Brandon Williams


Re: [PATCH 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-23 Thread Brandon Williams
On 02/23, Stefan Beller wrote:
> On Wed, Feb 21, 2018 at 4:51 PM, Brandon Williams  wrote:
> > On 02/20, Stefan Beller wrote:
> >> Add a repository argument to allow sha1_file_name callers to be more
> >> specific about which repository to handle. This is a small mechanical
> >> change; it doesn't change the implementation to handle repositories
> >> other than the_repository yet.
> >>
> >> As with the previous commits, use a macro to catch callers passing a
> >> repository other than the_repository at compile time.
> >>
> >> While at it, move the declaration to object-store.h, where it should
> >> be easier to find.
> >
> > Seems like we may want to make a sha1-file.h or an oid-file.h or
> > something like that at some point as that seems like a better place for
> > the function than in the object-store.h file?
> 
> It depends what our long term goal is.
> Do we want header and source file name to match for each function?
> Or do we want a coarser set of headers, such that we have a broad
> object-store.h, but that is implemented in sha1_file.c, packfile.c
> for the parts of the raw_objectstore and other .c files for the higher
> levels of the object store?
> 
> For now I'd just keep it in object-store.h as moving out just a couple
> functions seems less consistent?

Fair enough :)

-- 
Brandon Williams


Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader

2018-02-23 Thread Brandon Williams
On 02/23, Stefan Beller wrote:
> On Fri, Feb 23, 2018 at 1:30 PM, Brandon Williams  wrote:
> > On 02/22, Stefan Beller wrote:
> >> > +static enum protocol_version discover_version(struct packet_reader 
> >> > *reader)
> >> > +{
> >> ...
> >> > +
> >> > +   /* Maybe process capabilities here, at least for v2 */
> >> > +   switch (version) {
> >> > +   case protocol_v1:
> >> > +   /* Read the peeked version line */
> >> > +   packet_reader_read(reader);
> >> > +   break;
> >> > +   case protocol_v0:
> >> > +   break;
> >> > +   case protocol_unknown_version:
> >> > +   die("unknown protocol version: '%s'\n", reader->line);
> >>
> >> The following patches introduce more of the switch(version) cases.
> >> And there it actually is a
> >> BUG("protocol version unknown? should have been set in 
> >> discover_version")
> >> but here it is a mere
> >>   die (_("The server uses a different protocol version than we can
> >> speak: %s\n"),
> >>   reader->line);
> >> so I would think here it is reasonable to add _(translation).
> >
> > This should be a BUG as it shouldn't ever be unknown at this point.  And
> > I'll also drop that comment.
> 
> Huh?
> Then I miss-understood the flow of code. When the server announces its
> answer is version 42, but the client cannot handle it, which die call is
> responsible for reporting it to the user?
> (That is technically a BUG on the server side, as we probably never
> asked for v42, so I would not want to print BUG locally at the client?)

This is handled in 
`determine_protocol_version_client(const char *server_response)`,
which is just a few lines out of context here.

-- 
Brandon Williams


Re: [PATCH] strbuf_read_file(): preserve errno across close() call

2018-02-23 Thread René Scharfe
Am 23.02.2018 um 23:17 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> +#define IGNORE_ERROR(expr) do { int e_ = errno; expr; errno = e_; } while 
>> (0)
> 
> The macro certainly is a cute idea, but ...
> 
>> @@ -391,7 +393,7 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t 
>> hint)
>>   
>>  if (got < 0) {
>>  if (oldalloc == 0)
>> -strbuf_release(sb);
>> +IGNORE_ERROR(strbuf_release(sb));
>>  else
>>  strbuf_setlen(sb, oldlen);
>>  return -1;
> 
> ... ideally, I would imagine that we wish we could write this hunk
> to something that expands to:
> 
>   if (got < 0) {
>   do {
>  int e_ = errno;
>  if (oldalloc == 0)
>  strbuf_release(sb);
>  else
>  strbuf_setlen(sb, oldlen);
>  errno = e_;
>   } while (0);
>   return -1;
> 
> no?  That is (1) we do not want to rely too much on knowing that
> strbuf_setlen() is very thin and does not touch errno, and hence (2)
> we want to mark not just a single expr but a block as "we know we
> got an error and errno from that error is more precious than what we
> do in this block to clean thihngs up".

Relying on that internal knowledge should be OK in strbuf.c, but in
this specific example we could of course do:

if (oldalloc == 0)
IGNORE_ERROR(strbuf_release(sb));
else
IGNORE_ERROR(strbuf_setlen(sb, oldlen));

I guess ignoring errors of whole blocks is not that common, based on
a quick search (git grep -W int.*_errno).  And in such a case we could
factor that code out into a separate function, if really needed.  Or
continue saving errno explicitly.

Compilers should be smart enough to avoid saving and restoring errno
between multiple uses of that macro, e.g. code like this would only do
it once, from what I saw when experimenting with the Compiler Explorer
(https://godbolt.org/):

IGNORE_ERROR(close(fd1));
IGNORE_ERROR(close(fd2));

> Of course, a pair of macros
> 
>   #define IGNORE_ERROR_BEGIN do { int e_ = errno
>   #define IGNORE_ERROR_END errno = e_; } while (0)
> 
> is probably the only way to do so in C, and that is already too ugly
> to live, so we cannot achieve the ideal.
> 
> So I dunno..

*shudder*

> 
>> @@ -617,9 +619,11 @@ ssize_t strbuf_read_file(struct strbuf *sb, const char 
>> *path, size_t hint)
>>  if (fd < 0)
>>  return -1;
>>  len = strbuf_read(sb, fd, hint);
>> -close(fd);
>> -if (len < 0)
>> +if (len < 0) {
>> +IGNORE_ERROR(close(fd));
>>  return -1;
>> +}
>> +close(fd);
>>   
>>  return len;
>>   }


Re: [PATCH 22/27] sha1_file: allow sha1_file_name to handle arbitrary repositories

2018-02-23 Thread Stefan Beller
On Wed, Feb 21, 2018 at 4:44 PM, Jonathan Tan  wrote:
> On Tue, 20 Feb 2018 17:54:25 -0800
> Stefan Beller  wrote:
>
>> Signed-off-by: Stefan Beller 
>> Signed-off-by: Jonathan Nieder 
>
> Reviewed-by: Jonathan Tan 
>
>> -void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
>> *sha1)
>> +void sha1_file_name(struct repository *r, struct strbuf *buf, const 
>> unsigned char *sha1)
>>  {
>> - strbuf_addstr(buf, get_object_directory());
>> + strbuf_addstr(buf, r->objects.objectdir);
>>   strbuf_addch(buf, '/');
>>   fill_sha1_path(buf, sha1);
>>  }
>
> In the future, we should probably have:
>  - a function to get the object store out of a repo (so that it can
>lazily initialize the object store struct if necessary)
>  - when the object store is obtained, its objectdir field is guaranteed
>to be populated
>  - sha1_file_name should take the object store struct, not the repo
>struct

We had that in v2, but I reverted it as the consensus seemed that we prefer
the_repository to be passed around.

>
> but this is outside the scope of this patch.


Re: [PATCH 0/2] Fix initializing the_hash_algo

2018-02-23 Thread brian m. carlson
On Fri, Feb 23, 2018 at 04:56:38PM +0700, Nguyễn Thái Ngọc Duy wrote:
> I can certainly try! I start to remember all the hairy details in that
> setup code.
> 
> The first step may be something like this, which identifies all the
> "repo init" entry points. This is basically a revert of e26f7f19b6
> (repository: pre-initialize hash algo pointer - 2018-01-19) and doing
> things the proper way, hopefully.
> 
> This is on 'master', independent from Stefan's series. I have another
> patch on top of that series to remove the use of ignore_env in
> sha1_file.c (and things seem to work). Basically whenever you have to
> initialize the hash algorithm, there's a good chance you need to
> initialize object store as well. But I'll hold that off until
> Stefan's and this one are both merged.

I definitely think this series is an improvement over my previous patch.

My major concern is alarming users (or breaking scripts) with the
warning message.  I wonder if deferring the use of the warning until we
have multiple hash algorithms might be a better idea.  At that point,
the warning would become something people could act upon.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2] subtree: fix add and pull for GPG-signed commits

2018-02-23 Thread Junio C Hamano
Stephen R Guglielmo  writes:

> If log.showsignature is true (or --show-signature is passed) while
> performing a `subtree add` or `subtree pull`, the command fails.
>
> toptree_for_commit() calls `log` and passes the output to `commit-tree`.
> If this output shows the GPG signature data, `commit-tree` throws a
> fatal error.
>
> This commit fixes the issue by adding --no-show-signature to `log` calls
> in a few places, as well as using the more appropriate `rev-parse`
> instead where possible.
>
> Signed-off-by: Stephen R Guglielmo 
> ---
>  contrib/subtree/git-subtree.sh | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)

This was too heavily whitespace damaged so I recreated your patch
manually from scratch and queued, during which time I may have made
silly and simple mistakes.  Please double check what appears on the
'pu' branch in a few hours.

Thanks.

I am however starting to feel that

 (1) add gitlog="git log" and then do s/git log/$gitlog/; to the
 remainder of the whole script in patch 1/2; and

 (2) turn the variable definition to gitlog="git log --no-show-signature"
 in patch 2/2

may be a better approach.  After all, this script is not prepared to
be used by any group of people who use signed commits, and showing
commit signature in any of its use of 'git log', either present or
in the future, will not be useful to it, I suspect.

>
> diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
> index dec085a23..9594ca4b5 100755
> --- a/contrib/subtree/git-subtree.sh
> +++ b/contrib/subtree/git-subtree.sh
> @@ -297,7 +297,7 @@ find_latest_squash () {
>   main=
>   sub=
>   git log --grep="^git-subtree-dir: $dir/*\$" \
> - --pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
> + --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
>   while read a b junk
>   do
>   debug "$a $b $junk"
> @@ -341,7 +341,7 @@ find_existing_splits () {
>   main=
>   sub=
>   git log --grep="^git-subtree-dir: $dir/*\$" \
> - --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
> + --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
>   while read a b junk
>   do
>   case "$a" in
> @@ -382,7 +382,7 @@ copy_commit () {
>   # We're going to set some environment vars here, so
>   # do it in a subshell to get rid of them safely later
>   debug copy_commit "{$1}" "{$2}" "{$3}"
> - git log -1 --pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
> + git log --no-show-signature -1
> --pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
>   (
>   read GIT_AUTHOR_NAME
>   read GIT_AUTHOR_EMAIL
> @@ -462,8 +462,8 @@ squash_msg () {
>   oldsub_short=$(git rev-parse --short "$oldsub")
>   echo "Squashed '$dir/' changes from $oldsub_short..$newsub_short"
>   echo
> - git log --pretty=tformat:'%h %s' "$oldsub..$newsub"
> - git log --pretty=tformat:'REVERT: %h %s' "$newsub..$oldsub"
> + git log --no-show-signature --pretty=tformat:'%h %s' "$oldsub..$newsub"
> + git log --no-show-signature --pretty=tformat:'REVERT: %h %s'
> "$newsub..$oldsub"
>   else
>   echo "Squashed '$dir/' content from commit $newsub_short"
>   fi
> @@ -475,7 +475,7 @@ squash_msg () {
>
>  toptree_for_commit () {
>   commit="$1"
> - git log -1 --pretty=format:'%T' "$commit" -- || exit $?
> + git rev-parse --verify "$commit^{tree}" || exit $?
>  }
>
>  subtree_for_commit () {


Re: [PATCH 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-23 Thread Stefan Beller
On Wed, Feb 21, 2018 at 4:51 PM, Brandon Williams  wrote:
> On 02/20, Stefan Beller wrote:
>> Add a repository argument to allow sha1_file_name callers to be more
>> specific about which repository to handle. This is a small mechanical
>> change; it doesn't change the implementation to handle repositories
>> other than the_repository yet.
>>
>> As with the previous commits, use a macro to catch callers passing a
>> repository other than the_repository at compile time.
>>
>> While at it, move the declaration to object-store.h, where it should
>> be easier to find.
>
> Seems like we may want to make a sha1-file.h or an oid-file.h or
> something like that at some point as that seems like a better place for
> the function than in the object-store.h file?

It depends what our long term goal is.
Do we want header and source file name to match for each function?
Or do we want a coarser set of headers, such that we have a broad
object-store.h, but that is implemented in sha1_file.c, packfile.c
for the parts of the raw_objectstore and other .c files for the higher
levels of the object store?

For now I'd just keep it in object-store.h as moving out just a couple
functions seems less consistent?

Stefan


Re: [PATCH 13/27] sha1_file: add repository argument to prepare_alt_odb

2018-02-23 Thread Stefan Beller
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -23,6 +23,7 @@
>>  #include "sha1-lookup.h"
>>  #include "bulk-checkin.h"
>>  #include "repository.h"
>> +#include "object-store.h"
>
> Should this #include have been added in an earlier patch, since the
> file both uses and defines prepare_alt_odb, which is declared there?

This is another superflous include, dropped it. Thanks for catching.


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread brian m. carlson
On Fri, Feb 23, 2018 at 04:56:40PM +0700, Nguyễn Thái Ngọc Duy wrote:
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 7e3e1a461c..8ee935504e 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1673,6 +1673,11 @@ int cmd_index_pack(int argc, const char **argv, const 
> char *prefix)
>   if (prefix && chdir(prefix))
>   die(_("Cannot come back to cwd"));
>  
> + if (!the_hash_algo) {
> + warning(_("Running without a repository, assuming SHA-1 hash"));
> + repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> + }

Is this warning going to be visible to users in the normal course of
operation?  If so, people are probably going to find this bothersome or
alarming.

>   for (i = 1; i < argc; i++) {
>   const char *arg = argv[i];
>  
> diff --git a/common-main.c b/common-main.c
> index 6a689007e7..12aec36794 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -1,6 +1,7 @@
>  #include "cache.h"
>  #include "exec_cmd.h"
>  #include "attr.h"
> +#include "repository.h"
>  
>  /*
>   * Many parts of Git have subprograms communicate via pipe, expect the
> @@ -40,5 +41,8 @@ int main(int argc, const char **argv)
>  
>   restore_sigpipe_to_default();
>  
> + if (getenv("GIT_HASH_FIXUP"))
> + repo_set_hash_algo(the_repository, GIT_HASH_SHA1);

I'm lukewarm on adding this environment variable, but considering our
history here, we had probably better.  We can always remove it after a
few releases.

>   return cmd_main(argc, argv);
>  }
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 0ed5f0f496..f038f665bc 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -241,6 +241,11 @@ void diff_no_index(struct rev_info *revs,
>   struct strbuf replacement = STRBUF_INIT;
>   const char *prefix = revs->prefix;
>  
> + if (!the_hash_algo) {
> + warning(_("Running without a repository, assuming SHA-1 hash"));
> + repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> + }

Again, same concern.  I can imagine scripts that will blow up loudly if
git diff --no-index spews things to standard error.

I'm not opposed to making this more visible, but I wonder if maybe it
should only be visible to developers or such.  The only way I can think
of doing is that is with an advice options, but maybe there's a better
way.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] Call timegm and timelocal with 4-digit year

2018-02-23 Thread Junio C Hamano
"Bernhard M. Wiedemann"  writes:

> amazingly timegm(gmtime(0)) is only 0 before 2020
> because perl's timegm deviates from GNU timegm(3) in how it handles years.
>
> man Time::Local says
>
>  Whenever possible, use an absolute four digit year instead.
>
> with a detailed explanation about ambiguity of 2-digit years above that.

That's amusing.  Will queue; thanks.


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Feb 23, 2018 at 11:50 AM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> I wonder if there is yet another missing case in the enumeration of
>>> the previous patch:
>>> Some commands are able to operate on GIT_OBJECT_DIR instead
>>> of GIT_DIR (git repack?), which may not even explore the full git directory,
>>> and so doesn't know about the hash value.
>>
>> ... because GIT_DIR/config is not known?  "repack" is not one of
>> them, though---it needs to at least use refs as the starting point
>> so a standalone OBJECT_DIR is insufficient.
>
> Yes, I could have worded this as a question:
> Is there any command that operates on GIT_OBJECT_DIR
> without trying to discover GIT_DIR ?

If somebody finds one that would be a good argument not to pursue
the approach.  Lack of response to the question would not amount to
that much---it is possible all people who bothered to think of
overlooked something obvious, though.  "When I asked around nobody
thought of a possibly way for this to cause breakages, so let's
declare it is safe to do so and do it" is not how we want to do
things X-<.



Re: [PATCH 08/27] pack: move approximate object count to object store

2018-02-23 Thread Stefan Beller
On Wed, Feb 21, 2018 at 4:47 PM, Brandon Williams  wrote:
> On 02/20, Stefan Beller wrote:
>> The approximate_object_count() function maintains a rough count of
>> objects in a repository to estimate how long object name abbreviates
>> should be.  Object names are scoped to a repository and the
>> appropriate length may differ by repository, so the object count
>> should not be global.
>>
>> Signed-off-by: Stefan Beller 
>> Signed-off-by: Jonathan Nieder 
>> ---
>>  object-store.h | 10 +-
>>  packfile.c | 11 +--
>>  2 files changed, 14 insertions(+), 7 deletions(-)
>>
>> diff --git a/object-store.h b/object-store.h
>> index 6cecba3951..bd1e4fcd8b 100644
>> --- a/object-store.h
>> +++ b/object-store.h
>> @@ -93,6 +93,14 @@ struct raw_object_store {
>>   struct alternate_object_database *alt_odb_list;
>>   struct alternate_object_database **alt_odb_tail;
>>
>> + /*
>> +  * A fast, rough count of the number of objects in the repository.
>> +  * These two fields are not meant for direct access. Use
>> +  * approximate_object_count() instead.
>> +  */
>> + unsigned long approximate_object_count;
>> + unsigned approximate_object_count_valid : 1;
>
> Patch looks fine and is effectively a no-op, though what is the need for
> both of these variables?  Maybe it can be simplified down to just use
> one?  Just musing as its out of the scope of this patch and we probably
> shouldn't try to change that in this series.

I agree we should. It was introduced in e323de4ad7f (sha1_file:
allow sha1_loose_object_info to handle arbitrary repositories, 2017-08-30)
and I think it was seen as a clever optimization trick back then?


Re: [PATCH 1/2] setup.c: initialize the_repository correctly in all cases

2018-02-23 Thread brian m. carlson
On Fri, Feb 23, 2018 at 04:56:39PM +0700, Nguyễn Thái Ngọc Duy wrote:
> [3] the reason server side is still running ok with no hash algo
> before [2] is because the programs that use enter_repo() do very
> little then spawn a new program (like pack-objects or
> upload-archive) to do the heavy lifting. These programs already
> use setup_git_dir..()

You have "..()" here.  Did you want to say "()." instead?
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] strbuf_read_file(): preserve errno across close() call

2018-02-23 Thread Junio C Hamano
René Scharfe  writes:

> +#define IGNORE_ERROR(expr) do { int e_ = errno; expr; errno = e_; } while (0)

The macro certainly is a cute idea, but ...

> @@ -391,7 +393,7 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t 
> hint)
>  
>   if (got < 0) {
>   if (oldalloc == 0)
> - strbuf_release(sb);
> + IGNORE_ERROR(strbuf_release(sb));
>   else
>   strbuf_setlen(sb, oldlen);
>   return -1;

... ideally, I would imagine that we wish we could write this hunk
to something that expands to:

if (got < 0) {
do {
int e_ = errno;
if (oldalloc == 0)
strbuf_release(sb);
else
strbuf_setlen(sb, oldlen);
errno = e_;
} while (0);
return -1;

no?  That is (1) we do not want to rely too much on knowing that
strbuf_setlen() is very thin and does not touch errno, and hence (2)
we want to mark not just a single expr but a block as "we know we
got an error and errno from that error is more precious than what we
do in this block to clean thihngs up".

Of course, a pair of macros

#define IGNORE_ERROR_BEGIN do { int e_ = errno
#define IGNORE_ERROR_END errno = e_; } while (0)

is probably the only way to do so in C, and that is already too ugly
to live, so we cannot achieve the ideal.

So I dunno..

> @@ -617,9 +619,11 @@ ssize_t strbuf_read_file(struct strbuf *sb, const char 
> *path, size_t hint)
>   if (fd < 0)
>   return -1;
>   len = strbuf_read(sb, fd, hint);
> - close(fd);
> - if (len < 0)
> + if (len < 0) {
> + IGNORE_ERROR(close(fd));
>   return -1;
> + }
> + close(fd);
>  
>   return len;
>  }


Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader

2018-02-23 Thread Stefan Beller
On Fri, Feb 23, 2018 at 1:30 PM, Brandon Williams  wrote:
> On 02/22, Stefan Beller wrote:
>> > +static enum protocol_version discover_version(struct packet_reader 
>> > *reader)
>> > +{
>> ...
>> > +
>> > +   /* Maybe process capabilities here, at least for v2 */
>> > +   switch (version) {
>> > +   case protocol_v1:
>> > +   /* Read the peeked version line */
>> > +   packet_reader_read(reader);
>> > +   break;
>> > +   case protocol_v0:
>> > +   break;
>> > +   case protocol_unknown_version:
>> > +   die("unknown protocol version: '%s'\n", reader->line);
>>
>> The following patches introduce more of the switch(version) cases.
>> And there it actually is a
>> BUG("protocol version unknown? should have been set in discover_version")
>> but here it is a mere
>>   die (_("The server uses a different protocol version than we can
>> speak: %s\n"),
>>   reader->line);
>> so I would think here it is reasonable to add _(translation).
>
> This should be a BUG as it shouldn't ever be unknown at this point.  And
> I'll also drop that comment.

Huh?
Then I miss-understood the flow of code. When the server announces its
answer is version 42, but the client cannot handle it, which die call is
responsible for reporting it to the user?
(That is technically a BUG on the server side, as we probably never
asked for v42, so I would not want to print BUG locally at the client?)


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-23 Thread Brandon Williams
On 02/22, Jeff King wrote:
> On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote:
> 
> > +In protocol v2 communication is command oriented.  When first contacting a
> > +server a list of capabilities will advertised.  Some of these capabilities
> > +will be commands which a client can request be executed.  Once a command
> > +has completed, a client can reuse the connection and request that other
> > +commands be executed.
> 
> If I understand this correctly, we'll potentially have a lot more
> round-trips between the client and server (one per "command"). And for
> git-over-http, each one will be its own HTTP request?
> 
> We've traditionally tried to minimize HTTP requests, but I guess it's
> not too bad if we can keep the connection open in most cases. Then we
> just suffer some extra framing bytes, but we don't have to re-establish
> the TCP connection each time.
> 
> I do wonder if the extra round trips will be noticeable in high-latency
> conditions. E.g., if I'm 200ms away, converting the current
> ref-advertisement spew to "capabilities, then the client asks for refs,
> then we spew the refs" is going to cost an extra 200ms, even if the
> fetch just ends up being a noop. I'm not sure how bad that is in the
> grand scheme of things (after all, the TCP handshake involves some
> round-trips, too).

I think this is the price of extending the protocol in a backward
compatible way.  If we don't want to be backwards compatible (allowing
for graceful fallback to v1) then we could design this differently.
Even so we're not completely out of luck just yet.

Back when I introduced the GIT_PROTOCOL side-channel I was able to
demonstrate that arbitrary data could be sent to the server and it would
only respect the stuff it knows about.  This means that we can do a
follow up to v2 at some point to introduce an optimization where we can
stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
if the server supports it.

> 
> > + Capability Advertisement
> > +--
> > +
> > +A server which decides to communicate (based on a request from a client)
> > +using protocol version 2, notifies the client by sending a version string
> > +in its initial response followed by an advertisement of its capabilities.
> > +Each capability is a key with an optional value.  Clients must ignore all
> > +unknown keys.  Semantics of unknown values are left to the definition of
> > +each key.  Some capabilities will describe commands which can be requested
> > +to be executed by the client.
> > +
> > +capability-advertisement = protocol-version
> > +  capability-list
> > +  flush-pkt
> > +
> > +protocol-version = PKT-LINE("version 2" LF)
> > +capability-list = *capability
> > +capability = PKT-LINE(key[=value] LF)
> > +
> > +key = 1*CHAR
> > +value = 1*CHAR
> > +CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> > +
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +command-request = command
> > + capability-list
> > + (command-args)
> > + flush-pkt
> > +command = PKT-LINE("command=" key LF)
> > +command-args = delim-pkt
> > +  *arg
> > +arg = 1*CHAR
> 
> For a single stateful TCP connection like git:// or git-over-ssh, the
> client would get the capabilities once and then issue a series of
> commands. For git-over-http, how does it work?
> 
> The client speaks first in HTTP, so we'd first make a request to get
> just the capabilities from the server? And then proceed from there with
> a series of requests, assuming that the capabilities for each server we
> subsequently contact are the same? That's probably reasonable (and
> certainly the existing http protocol makes that capabilities
> assumption).
> 
> I don't see any documentation on how this all works with http. But

I can add in a bit for the initial request when using http, but the rest
of it should function the same.

> reading patch 34, it looks like we just do the usual
> service=git-upload-pack request (with the magic request for v2), and
> then the server would send us capabilities. Which follows my line of
> thinking in the paragraph above.

Yes this is exactly how it should work.  First we make an info/refs
request and if the server speaks v2 then instead of a refs request we
should get back a capability listing.  Then subsequent requests are made
assuming the capabilities are the same like we've done with the
existing protocol.

The great thing about this is that from the POV of the git-client, it
doesn't care if its speaking using the git://, ssh://, file://, or
http:// transport; it's all the same protocol.  In my next re-roll I'll
even drop the "# service" bit from the http server response and then 

Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-23 Thread Stefan Beller
>>  2. Applying the semantic patch
>> contrib/coccinelle/refactoring/packed_git.cocci to adjust callers.
>> This semantic patch is placed in a sub directory of the coccinelle
>> contrib dir, as this semantic patch is not expected to be of general
>> usefulness; it is only useful during developing this series and
>> merging it with other topics in flight. At a later date, just
>> delete that semantic patch.
>
> Can the semantic patch go in the commit message instead?  It is very
> brief.

done

>
> Actually, I don't see this semantic patch in the diffstat.  Is the
> commit message stale?
>
>>  3. Applying line wrapping fixes from "make style" to break the
>> resulting long lines.
>>
>>  4. Adding missing #includes of repository.h and object-store.h
>> where needed.
>
> Is there a way to automate this step?  (I'm asking for my own
> reference when writing future patches, not because of any concern
> about the correctness of this one.)

no, for several reasons.
(a) I don't know how to write it in coccinelle, because
(b) I have not figured out the order in which we include headers, apart from
  "cache.h goes first", the rest of the includes sometimes looks like a random
  order, because different patches add new includes at different places.
  I have the impression, that (1) some add the include after all other
  existing includes or (2) try to figure out where to add the include to make
  sense in the existing include order or (3) sort alphabetically or (4) put it
  randomly, so the chance for a merge conflict with other series in flight
  decreases.
(c) I did it in a semi automated fashion:
while make fails:
add another include

The mess with including repository or object-store comes from the fact that
I had v2 based on object-store, not the repository and cherry-picked this patch
over to v3. Fixed all of the includes now.

>> @@ -59,10 +83,25 @@ struct raw_object_store {
>>*/
>>   char *objectdir;
>>
>> + struct packed_git *packed_git;
>> + /*
>> +  * A most-recently-used ordered version of the packed_git list, which 
>> can
>> +  * be iterated instead of packed_git (and marked via mru_mark).
>> +  */
>> + struct list_head packed_git_mru;
>
> I don't understand the new part of the comment.  Can you explain here,
> for me?

cherrypicking error, fixed.

>> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, NULL }
>> +
>> +/*
>> + * The mru list_head is supposed to be initialized using
>> + * the LIST_HEAD macro, assigning prev/next to itself.
>> + * However this doesn't work in this case as some compilers dislike
>> + * that macro on member variables. Use NULL instead as that is defined
>> + * and accepted, deferring the real init to prepare_packed_git_mru(). */
>
> style nit: '*/' should be on its own line.
>
> More importantly, we can avoid such an issue as described by Junio. :)

See reply to Junio, I am not quite sure I like that.


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-23 Thread Brandon Williams
On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:49 -0800
> Brandon Williams  wrote:
> 
> >  .gitignore  |   1 +
> >  Documentation/technical/protocol-v2.txt | 114 +++
> >  Makefile|   2 +
> >  builtin.h   |   1 +
> >  builtin/serve.c |  30 
> >  git.c   |   1 +
> >  serve.c | 250 
> > 
> >  serve.h |  15 ++
> >  t/t5701-git-serve.sh|  60 
> >  9 files changed, 474 insertions(+)
> >  create mode 100644 Documentation/technical/protocol-v2.txt
> >  create mode 100644 builtin/serve.c
> >  create mode 100644 serve.c
> >  create mode 100644 serve.h
> >  create mode 100755 t/t5701-git-serve.sh
> 
> As someone who is implementing the server side of protocol V2 in JGit, I
> now have a bit more insight into this :-)
> 
> First of all, I used to not have a strong opinion on the existence of a
> new endpoint, but now I think that it's better to *not* have git-serve.
> As it is, as far as I can tell, upload-pack also needs to support (and
> does support, as of the end of this patch set) protocol v2 anyway, so it
> might be better to merely upgrade upload-pack.

Having it allows for easier testing and the easy ability to make it a
true endpoint when we want to.  As of right now, git-serve isn't an
endpoint as you can't issue requests there via http-backend or
git-daemon.

> 
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +command-request = command
> > + capability-list
> > + (command-args)
> 
> If you are stating that this is optional, write "*1command-args". (RFC
> 5234 also supports square brackets, but "*1" is already used in
> pack-protocol.txt and http-protocol.txt.)
> 
> > + flush-pkt
> > +command = PKT-LINE("command=" key LF)
> > +command-args = delim-pkt
> > +  *arg
> > +arg = 1*CHAR
> 
> arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

-- 
Brandon Williams


Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader

2018-02-23 Thread Brandon Williams
On 02/22, Stefan Beller wrote:
> > +static enum protocol_version discover_version(struct packet_reader *reader)
> > +{
> ...
> > +
> > +   /* Maybe process capabilities here, at least for v2 */
> > +   switch (version) {
> > +   case protocol_v1:
> > +   /* Read the peeked version line */
> > +   packet_reader_read(reader);
> > +   break;
> > +   case protocol_v0:
> > +   break;
> > +   case protocol_unknown_version:
> > +   die("unknown protocol version: '%s'\n", reader->line);
> 
> The following patches introduce more of the switch(version) cases.
> And there it actually is a
> BUG("protocol version unknown? should have been set in discover_version")
> but here it is a mere
>   die (_("The server uses a different protocol version than we can
> speak: %s\n"),
>   reader->line);
> so I would think here it is reasonable to add _(translation).

This should be a BUG as it shouldn't ever be unknown at this point.  And
I'll also drop that comment.

-- 
Brandon Williams


Re: Git "branch properties"-- thoughts?

2018-02-23 Thread Junio C Hamano
Phillip Wood  writes:

> It would certainly be nice to be able to share branch descriptions so
> that if I clone a repository I can get a bit more detail about the ideas
> behind each branch. Shared descriptions could be displayed in web uis. I

One thing to note is that there is no universal identity for
branches across repositoryes that push and pull from each other.  My
'refs/heads/master' may be copied to your 'refs/remotes/origin/master'
so in that sense, any "branch property" I give to my 'master' branch
(including the 'branch.master.description') could be copied to its
corresponding "remote-tracking branch property" you have, but that
is the easy part.

It is quite hard to figure out how these branch properties you
acquired from me on my branches affect properties of _your_ own
branches that build on top of the branches you obtained from me.

Perhaps a narrow special case of two or more people collaborating on
a single topic branch with the same focus would be helped by blindly
sharing the "branch property" across the local and remote-tracking
branches that share the same name.  I.e. the shared repository may
have 'optimize-frotz' topic branch, you and your friend both copy to
your 'refs/remotes/origin/optimize-frotz' remote-tracking branch,
and these copies will share the same "branch properties" copied from
the shared repository.  Then if you and your friend both work on the
topic by "git checkout -b optimize-frotz origin/optimize-frotz",
perhaps your and your friend's optmize-frotz branch would inherit
the same "branch properties" copied from their upstream.  Because
all of the participants are focused on a rather narrow task of
"optimizing the frotz feature" on their branches that share the same
name, pretending as if these are "logically" the same branch, and
enforcing that by sharing the "branch properties", may make sense.

But for a generic branch like 'master', that arrangement would not
work well, I am afraid.  You may have N copies of the same project,
where the 'master' branch of each is used to work on separate topic.
The focus of the 'master' branch of the overall project would be
quite different from each of these copies you have, hence it is
likely that it would be inappropriate to share the task list and
stuff you would want to add to branch descriptions and branch
properties between the shared repository's 'master' and any of your
'master' in these N copies you have.

So...


Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper

2018-02-23 Thread Brandon Williams
On 02/22, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams  wrote:
> 
> > +static void pack_line(const char *line)
> > +{
> > +   if (!strcmp(line, "") || !strcmp(line, "\n"))
> 
> From our in-office discussion:
> v1/v0 packs pktlines twice in http, which is not possible to
> construct using this test helper when using the same string
> for the packed and unpacked representation of flush and delim packets,
> i.e. test-pkt-line --pack $(test-pkt-line --pack ) would produce
> '' instead of '0009\n'.
> To fix it we'd have to replace the unpacked versions of these pkts to
> something else such as "FLUSH" "DELIM".
> 
> However as we do not anticipate the test helper to be used in further
> tests for v0, this ought to be no big issue.
> Maybe someone else cares though?

I'm going to punt and say, if someone cares enough they can update this
test-helper when they want to use it for v1/v0 stuff.

-- 
Brandon Williams


Re: [PATCH v5 00/17] document & test fetch pruning & add fetch.pruneTags

2018-02-23 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Let's see how others find it useful and/or if the changed code gets
>> in the way of others (I am not absolutely sure if the changes are
>> free of regression to existing users who do not use the new
>> feature).
>
> I think if you're on the fence about merging it down (and others don't
> chime in saying the want it / like it) it makes sense to merge down
> 1-14/17 and we could discard 15-17/17 for now for a later re-submission
> and discussion once the earlier part of the series lands in master.

Or we merge everything and let people discover glitches and
complain.  Reverting the last pieces can wait until then ;-).




Re: [PATCH v3 04/35] upload-pack: convert to a builtin

2018-02-23 Thread Brandon Williams
On 02/22, Jeff King wrote:
> On Thu, Feb 22, 2018 at 03:19:40PM -0500, Jeff King wrote:
> 
> > > To be clear, which of the following are you (most) worried about?
> > > 
> > >  1. being invoked with --help and spawning a pager
> > >  2. receiving and acting on options between 'git' and 'upload-pack'
> > >  3. repository discovery
> > >  4. pager config
> > >  5. alias discovery
> > >  6. increased code surface / unknown threats
> > 
> > My immediate concern is (4). But my greater concern is that people who
> > work on git.c should not have to worry about accidentally violating this
> > principle when they add a new feature or config option.
> > 
> > In other words, it seems like an accident waiting to happen. I'd be more
> > amenable to it if there was some compelling reason for it to be a
> > builtin, but I don't see one listed in the commit message. I see only
> > "let's make it easier to share the code", which AFAICT is equally served
> > by just lib-ifying the code and calling it from the standalone
> > upload-pack.c.
> 
> By the way, any decision here would presumably need to be extended to
> git-serve, etc. The current property is that it's safe to fetch from an
> untrusted repository, even over ssh. If we're keeping that for protocol
> v1, we'd want it to apply to protocol v2, as well.
> 
> -Peff

This may be more complicated.  Right now (for backward compatibility)
all fetches for v2 are issued to the upload-pack endpoint. So even
though I've introduced git-serve it doesn't have requests issued to it
and no requests can be issued to it currently (support isn't added to
http-backend or git-daemon).  This just means that the command already
exists to make it easy for testing specific v2 stuff and if we want to
expose it as an endpoint (like when we have a brand new server command
that is completely incompatible with v1) its already there and support
just needs to be plumbed in.

This whole notion of treating upload-pack differently from receive-pack
has bad consequences for v2 though.  The idea for v2 is to be able to
run any number of commands via the same endpoint, so at the end of the
day the endpoint you used is irrelevant.  So you could issue both fetch
and push commands via the same endpoint in v2 whether its git-serve,
receive-pack, or upload-pack.  So really, like Jonathan has said
elsewhere, we need to figure out how to be ok with having receive-pack
and upload-pack builtins, or having neither of them builtins, because it
doesn't make much sense for v2 to straddle that line.  I mean you could
do some complicated advertising of commands based on the endpoint you
hit, but then what does that mean if you're hitting the git-serve
endpoint where you should presumably be able to do any operation.

-- 
Brandon Williams


Re: What's cooking in git.git (Feb 2018, #03; Wed, 21)

2018-02-23 Thread Elijah Newren
On Wed, Feb 21, 2018 at 4:31 PM, Junio C Hamano  wrote:
> * en/rename-directory-detection (2018-02-14) 29 commits
>  - merge-recursive: ensure we write updates for directory-renamed file
>  - merge-recursive: avoid spurious rename/rename conflict from dir renames
>  - directory rename detection: new testcases showcasing a pair of bugs
>  - merge-recursive: fix remaining directory rename + dirty overwrite cases
>  - merge-recursive: fix overwriting dirty files involved in renames
>  - merge-recursive: avoid clobbering untracked files with directory renames
>  - merge-recursive: apply necessary modifications for directory renames
>  - merge-recursive: when comparing files, don't include trees
>  - merge-recursive: check for file level conflicts then get new name
>  - merge-recursive: add computation of collisions due to dir rename & merging
>  - merge-recursive: check for directory level conflicts
>  - merge-recursive: add get_directory_renames()
>  - merge-recursive: make a helper function for cleanup for handle_renames
>  - merge-recursive: split out code for determining diff_filepairs
>  - merge-recursive: make !o->detect_rename codepath more obvious
>  - merge-recursive: fix leaks of allocated renames and diff_filepairs
>  - merge-recursive: introduce new functions to handle rename logic
>  - merge-recursive: move the get_renames() function
>  - directory rename detection: tests for handling overwriting dirty files
>  - directory rename detection: tests for handling overwriting untracked files
>  - directory rename detection: miscellaneous testcases to complete coverage
>  - directory rename detection: testcases exploring possibly suboptimal merges
>  - directory rename detection: more involved edge/corner testcases
>  - directory rename detection: testcases checking which side did the rename
>  - directory rename detection: files/directories in the way of some renames
>  - directory rename detection: partially renamed directory testcase/discussion
>  - directory rename detection: testcases to avoid taking detection too far
>  - directory rename detection: directory splitting testcases
>  - directory rename detection: basic testcases
>
>  Rename detection logic in "diff" family that is used in "merge" has
>  learned to guess when all of x/a, x/b and x/c have moved to z/a,
>  z/b and z/c, it is likely that x/d added in the meantime would also
>  want to move to z/d by taking the hint that the entire directory
>  'x' moved to 'z'.  A bug causing dirty files involved in a rename
>  to be overwritten during merge has also been fixed as part of this
>  work.
>
>  Will merge to 'next'.

SZEDER Gábor requested (just over a week ago, but I was out at a
funeral) that I switch is_null_sha1() to is_null_oid() in one of these
patches.  It's a trivial change and squashes cleanly, but if you're
merging this series to next, I'm curious whether I should submit this
change as a new patch on top of the series, if you'd prefer a full
re-roll of the whole series, or something else.  It's the only change
I'm aware of anyone wanting to see; otherwise I think the series is
complete.


Re: [PATCH] strbuf_read_file(): preserve errno across close() call

2018-02-23 Thread René Scharfe
Am 23.02.2018 um 08:00 schrieb Jeff King:
> On Fri, Feb 23, 2018 at 01:49:52AM -0500, Jeff King wrote:
> Subject: [PATCH] strbuf_read_file(): preserve errno across close() call
> 
> If we encounter a read error, the user may want to report it
> by looking at errno. However, our close() call may clobber
> errno, leading to confusing results. Let's save and restore
> it in the error case.

Good idea.

> Signed-off-by: Jeff King 
> ---
>   strbuf.c | 6 +-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/strbuf.c b/strbuf.c
> index 1df674e919..5f138ed3c8 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -612,14 +612,18 @@ ssize_t strbuf_read_file(struct strbuf *sb, const char 
> *path, size_t hint)
>   {
>   int fd;
>   ssize_t len;
> + int saved_errno;
>   
>   fd = open(path, O_RDONLY);
>   if (fd < 0)
>   return -1;
>   len = strbuf_read(sb, fd, hint);
> + saved_errno = errno;
>   close(fd);
> - if (len < 0)
> + if (len < 0) {
> + errno = saved_errno;
>   return -1;
> + }
>   
>   return len;
>   }

How about adding a stealthy close_no_errno(), or do something like the
following to get shorter and more readable code?  (We could also keep
a single close() call, but would then set errno even on success.)

--- 
 strbuf.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 1df674e919..c0066b1db9 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -2,6 +2,8 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define IGNORE_ERROR(expr) do { int e_ = errno; expr; errno = e_; } while (0)
+
 int starts_with(const char *str, const char *prefix)
 {
for (; ; str++, prefix++)
@@ -391,7 +393,7 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
 
if (got < 0) {
if (oldalloc == 0)
-   strbuf_release(sb);
+   IGNORE_ERROR(strbuf_release(sb));
else
strbuf_setlen(sb, oldlen);
return -1;
@@ -617,9 +619,11 @@ ssize_t strbuf_read_file(struct strbuf *sb, const char 
*path, size_t hint)
if (fd < 0)
return -1;
len = strbuf_read(sb, fd, hint);
-   close(fd);
-   if (len < 0)
+   if (len < 0) {
+   IGNORE_ERROR(close(fd));
return -1;
+   }
+   close(fd);
 
return len;
 }


Re: [GSoC][PATCH v2] ref-filter: Make "--contains " less chatty if is invalid

2018-02-23 Thread Junio C Hamano
Paul-Sebastian Ungureanu  writes:

> Hello,
> I have made the changes after review. This is the updated patch
> based on what was discussed last time [1].
>
> In this patch, I have fixed the same issue that was also seen
> in "git branch" and "git for-reach-ref". I have also removed the
> dead code that was left and updated the patches accordingly.
>
> [1] 
> https://public-inbox.org/git/20180219212130.4217-1-ungureanupaulsebast...@gmail.com/
>
> Best regards,
> Paul Ungureanu
>
> https://public-inbox.org/git/20180219212130.4217-1-ungureanupaulsebast...@gmail.com/

You do not want all of the above, upto and including the "---" below,
to appear in the log message of the resulting commit.  One way to
tell the reading end that you have such preamble in your message is
to write "-- >8 --" instead of "---" there.

> ---
>
> Some git commands which use --contains  print the whole
> help text if  is invalid. It should only show the error
> message instead.
>
> This patch applies to "git tag", "git branch", "git for-each-ref".
>
> This bug was a side effect of looking up the commit in option
> parser callback. When a error occurs in the option parser, the
> full usage is shown. To fix this bug, the part related to
> looking up the commit was moved outside of the option parser
> to the ref-filter module.
>
> Basically, the option parser only parses strings that represent
> commits and the ref-filter performs the commit look-up. If an
> error occurs during the option parsing, then it must be an invalid
> argument and the user should be informed of usage, but if a error
> occurs during ref-filtering, then it is a problem with the
> argument.

The same problem appears for "git branch --points-at ",
doesn't it?

> diff --git a/ref-filter.c b/ref-filter.c
> index f9e25aea7..aa282a27f 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2000,6 +2000,25 @@ static void do_merge_filter(struct ref_filter_cbdata 
> *ref_cbdata)
>   free(to_clear);
>  }
>  
> +int add_str_to_commit_list(struct string_list_item *item, void *commit_list)

If this function can be static to this file (and I suspect it is),
please make it so.

> +{
> + struct object_id oid;
> + struct commit *commit;
> +
> + if (get_oid(item->string, )) {
> + error(_("malformed object name %s"), item->string);
> + exit(1);
> + }
> + commit = lookup_commit_reference();
> + if (!commit) {
> + error(_("no such commit %s"), item->string);
> + exit(1);
> + }
> + commit_list_insert(commit, commit_list);

The original (i.e. before this patch) does commit_list_insert() in
the order the commits are given on the command line.  This version
collects the command line arguments with string_list_append() that
preserves the order, and feeds them to commit_list_insert() here, so
the resulting commit_list will have the commits in the same order
before or after this patch.

Which is good.

> + return 0;
> +}

The code after this patch is a strict improvement (the current code
do not do so either), so this is outside the scope of this patch,
but we may want to give this function another "const char *" that is
used to report which option we got a malformed object name for.

> @@ -2012,6 +2031,10 @@ int filter_refs(struct ref_array *array, struct 
> ref_filter *filter, unsigned int
>   int ret = 0;
>   unsigned int broken = 0;
>  
> + /* Convert string representation and add to commit list. */
> + for_each_string_list(>with_commit_strs, add_str_to_commit_list, 
> >with_commit);
> + for_each_string_list(>no_commit_strs, add_str_to_commit_list, 
> >no_commit);
> +

As it does not use item->util in the callback helper, this should
use for_each_string_list_item() instead; then you can do

for_each_string_list_item(item, _no_commit_strs)
collect_commit(>no_commit, item->string);

which allows the other helper take a simple string, instead of
requiring a string_list_item.


[PATCH v2] subtree: fix add and pull for GPG-signed commits

2018-02-23 Thread Stephen R Guglielmo
If log.showsignature is true (or --show-signature is passed) while
performing a `subtree add` or `subtree pull`, the command fails.

toptree_for_commit() calls `log` and passes the output to `commit-tree`.
If this output shows the GPG signature data, `commit-tree` throws a
fatal error.

This commit fixes the issue by adding --no-show-signature to `log` calls
in a few places, as well as using the more appropriate `rev-parse`
instead where possible.

Signed-off-by: Stephen R Guglielmo 
---
 contrib/subtree/git-subtree.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index dec085a23..9594ca4b5 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -297,7 +297,7 @@ find_latest_squash () {
  main=
  sub=
  git log --grep="^git-subtree-dir: $dir/*\$" \
- --pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
+ --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
  while read a b junk
  do
  debug "$a $b $junk"
@@ -341,7 +341,7 @@ find_existing_splits () {
  main=
  sub=
  git log --grep="^git-subtree-dir: $dir/*\$" \
- --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
+ --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
  while read a b junk
  do
  case "$a" in
@@ -382,7 +382,7 @@ copy_commit () {
  # We're going to set some environment vars here, so
  # do it in a subshell to get rid of them safely later
  debug copy_commit "{$1}" "{$2}" "{$3}"
- git log -1 --pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
+ git log --no-show-signature -1
--pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
  (
  read GIT_AUTHOR_NAME
  read GIT_AUTHOR_EMAIL
@@ -462,8 +462,8 @@ squash_msg () {
  oldsub_short=$(git rev-parse --short "$oldsub")
  echo "Squashed '$dir/' changes from $oldsub_short..$newsub_short"
  echo
- git log --pretty=tformat:'%h %s' "$oldsub..$newsub"
- git log --pretty=tformat:'REVERT: %h %s' "$newsub..$oldsub"
+ git log --no-show-signature --pretty=tformat:'%h %s' "$oldsub..$newsub"
+ git log --no-show-signature --pretty=tformat:'REVERT: %h %s'
"$newsub..$oldsub"
  else
  echo "Squashed '$dir/' content from commit $newsub_short"
  fi
@@ -475,7 +475,7 @@ squash_msg () {

 toptree_for_commit () {
  commit="$1"
- git log -1 --pretty=format:'%T' "$commit" -- || exit $?
+ git rev-parse --verify "$commit^{tree}" || exit $?
 }

 subtree_for_commit () {
-- 
2.16.2


Re: RFC: git squash

2018-02-23 Thread Junio C Hamano
Julius Musseau  writes:

> I'm embarrassed to realise your approach matches the top-voted
> stack-overflow answer on the subject:
> https://stackoverflow.com/a/5201642

I personally do not visit stack-overflow, but I am happy to see that
there are people who remember "right" way to do this among the folks
who do.

"reset --soft" was invented exactly for this use case, and you can
use it even with dirty working tree and dirty index (they are left
intact).


Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-23 Thread Junio C Hamano
Junio C Hamano  writes:

> Lars Schneider  writes:
>
>> I still think it would be nice to see diffs for arbitrary encodings.
>> Would it be an option to read the `encoding` attribute and use it in
>> `git diff`?
>
> Reusing that gitk-only thing and suddenly start doing so would break
> gitk users, no?  The tool expects the diff to come out encoded in
> the encoding that is specified by that attribute (which is learned
> from get_path_encoding helper) and does its thing.
>
> I guess that gitk uses diff-tree plumbing and you won't be applying
> this change to the plumbing, perhaps?  If so, it might not be too
> bad, but those who decided to postprocess "git diff" output (instead
> of "git diff-tree" output) mimicking how gitk does it by thinking
> that is the safe and sane thing to do will be broken by such a
> change.  You could do "use the encoding only when a command line
> option says so", but then people will add a configuration variable
> to turn it always on and these existing scripts will be broken.
>
> I do not personally have much sympathy for the last case (i.e. those
> who scripted around 'git diff' instead of 'git diff-tree' to get
> broken), so making the new feature only work with the Porcelain "git
> diff" might be an option.  I'll need a bit more time to formulate
> the rest of my thought ;-)

So we are introducing in this series a way to say in what encoding
the things should be placed in the working tree files (i.e. the
w-t-e attribute attached to the paths).  Currently there is no
mechanism to say what encoding the in-repo contents are and UTF-8 is
assumed when conversion from/to w-t-e is required, but there is no
fundamental reason why it shouldn't be customizable (if anything, as
a piece of fact on the in-repo data, in-repo-encoding is *more*
appropriate to be an attribute than w-t-e that can merely be project
preference at best, as I mentioned earlier in this thread).  

We always use the in-repo contents when generating 'diff'.  I think
by "attribute to be used in diff", what you are reallying after is
to convert the in-repo contents to that encoding _BEFORE_ running
'diff' on it.  E.g. in-repo UTF-16 that can have NUL bytes all over
the place will not diff well with the xdiff machinery, but if you
first convert it to UTF-8 and have xdiff work on it, you can get
reasonable result out of it.  It is unclear what encoding you want
your final diff output in (it is equally valid in such a set-up to
desire your patch output in UTF-16 or UTF-8), but assuming that you
want UTF-8 in your patch output, perhaps we do not have to break
gitk users by hijacking the 'encoding' attribute.  Instead what you
want is a single bit that says between in-repo or working tree which
representation should be given to the xdiff machinery.  When that
bit is set, then we

 - First ensure that both sides of the diff input is in the working
   tree encoding by running it through convert_to_working_tree();

 - Run xdiff on it;

 - Take the xdiff result, and run it through convert_to_git(),
   before emitting (this is optional, making this a one-and-half bit
   option).

That would allow you to say "I have in-repo data in UTF-16 which I
check out as UTF-8.  xdiff machinery is unhappy.  Please do
something." perhaps?

The other way around (i.e. in-repo is UTF-8, but working tree
encoding is UTF-16) won't need xdiff issues, I would imagine.


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Stefan Beller
On Fri, Feb 23, 2018 at 11:50 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> I wonder if there is yet another missing case in the enumeration of
>> the previous patch:
>> Some commands are able to operate on GIT_OBJECT_DIR instead
>> of GIT_DIR (git repack?), which may not even explore the full git directory,
>> and so doesn't know about the hash value.
>
> ... because GIT_DIR/config is not known?  "repack" is not one of
> them, though---it needs to at least use refs as the starting point
> so a standalone OBJECT_DIR is insufficient.

Yes, I could have worded this as a question:
Is there any command that operates on GIT_OBJECT_DIR
without trying to discover GIT_DIR ?


Re: [PATCH v4 04/13] commit-graph: implement write_commit_graph()

2018-02-23 Thread Derrick Stolee



On 2/23/2018 2:30 PM, Junio C Hamano wrote:

Derrick Stolee  writes:


jt/binsearch-with-fanout introduces one when there is a 256-entry
fanout table (not the case here).

The bsearch() method in search.h (and used in
pack-write.c:need_large_offset) does not return the _position_ of a
found element.

Neither of these suit my needs, but I could just be searching for the
wrong strings. Also, I could divert my energies in this area to create
a generic search in the style of jt/binsearch-with-fanout.

... me goes and digs ...

What I had in mind was the one in sha1-lookup.c, actually.  Having
said that, hand-rolling another one is not too huge a deal;
eventually people will notice and consolidate code after the series
stabilizes anyway ;-)


Ah, sha1_pos(). That definitely satisfies my use case. Thanks! My local 
branch has this replacement.





+   num_large_edges++;
+   parent = parent->next;
+   } while (parent);

It feels somewhat wasteful to traverse the commit's parents list
only to count, without populating the octopus table (which I
understand is assumed to be minority case under this design).

Since we are writing the commit graph file in-order, we cannot write
the octopus table until after the chunk lengths are known.

Oh, my "minority case" comment was me wondering "since we expect
there are only a few, why not keep them in memory while we discover
them here, so that the writing phase that come later do not have to
go through all commits again counting their parents?  would it be
more performant and a better trade-off?"  We can certainly do such
an optimization later (iow, it is not all that crucial issue and
certainly I didn't mention the above as something that needs to be
"fixed"--there is nothing broken).


store the octopus table in memory and then dump it into the file
later, but walking the parents is quite fast after all the commits are
loaded. I'm not sure the time optimization merits the extra complexity
here. (I'm happy to revisit this if we do see this performance
lacking.)

P.S. I really like the name "octopus table" and will use that for
informal discussions of this format.

I actually do not mind that name used as the official term.  I find
it far more descriptive and understandable than "long edge" / "large
edge" at least to a Git person.


I will consider using this in the format and design documents.




You're right that int_id isn't great, and your more-specific
"oid_table_pos" shows an extra reason why it isn't great: when we add
the GRAPH_LAST_EDGE bit or set it to GRAPH_PARENT_MISSING, the value
is NOT a table position.

Perhaps I am somewhat biased, but it is quite natural for our
codebase and internal API to say something like this:

 x_pos(table, key) function's return value is the non-negative
 position for the key in the table when the key is there; when it
 returns a negative value, it is (-1 - position) where the "position"
 is the position in the table they key would have been found if
 it was in there.

and store the return value from such a function in a variable called
"pos".  Surely, sometimes "pos" does not have _the_ position, but
that does not make it a bad name.

Saying "MISSING is a special value that denotes 'nothing is here'"
and allowing it to be set to a variable that meant to hold the
position is not such a bad thing, though.  After all, that is how
you use NULL as a special value for a pointer variable ;-).

Same for using the high bit to mean something else.  Taking these
together you would explain "low 31-bit of pos holds the position for
the item in the table.  MISSING is a special value that you can use
to denote there is nothing.  The LAST_EDGE bit denotes that one
group of positions ends there", or something like that.


I think the current name makes the following call very clear:

It is still a strange name nevertheless.


+char *write_commit_graph(const char *obj_dir)
+{
+   struct packed_oid_list oids;
+   struct packed_commit_list commits;
+   struct sha1file *f;
+   int i, count_distinct = 0;
+   DIR *info_dir;
+   struct strbuf tmp_file = STRBUF_INIT;
+   struct strbuf graph_file = STRBUF_INIT;
+   unsigned char final_hash[GIT_MAX_RAWSZ];
+   char *graph_name;
+   int fd;
+   uint32_t chunk_ids[5];
+   uint64_t chunk_offsets[5];
+   int num_chunks;
+   int num_long_edges;
+   struct commit_list *parent;
+
+   oids.nr = 0;
+   oids.alloc = (int)(0.15 * approximate_object_count());

Heh, traditionalist would probably avoid unnecessary use of float
and use something like 1/4 or 1/8 ;-)  After all, it is merely a
ballpark guestimate.


+   num_long_edges = 0;

This again is about naming, but I find it a bit unnatural to call
the edge between a chind and its octopus parents "long".  Individual
edges are not long--the only thing that is long is your "list of

Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-23 Thread Stefan Beller
On Wed, Feb 21, 2018 at 1:51 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> +
>> +/*
>> + * The mru list_head is supposed to be initialized using
>> + * the LIST_HEAD macro, assigning prev/next to itself.
>> + * However this doesn't work in this case as some compilers dislike
>> + * that macro on member variables. Use NULL instead as that is defined
>> + * and accepted, deferring the real init to prepare_packed_git_mru(). */
>> +#define __MRU_INIT { NULL, NULL }
>> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }
>
> I do not think it has to be this way to abuse two NULLs, if you
> faithfully mimicked how LIST_HEAD() macro is constructed.  The
> reason why it does not try to introduce
>
> struct list_head x = LIST_HEAD_INIT;
>
> and instead, uses
>
> LIST_HEAD(x);
>
> is because of the need for self referral.  If we learn from it, we
> can do the same, i.e. instead of doing
>
> struct raw_object_store x = RAW_OBJECT_STORE_INIT;
>
> we can do
>
> RAW_OBJECT_STORE(x);
>
> that expands to
>
> struct raw_object_store x = {
> ...
> { _git_mru, _git_mru },
> ...
> };
>
> no?  Then we do not need such a lengthy comment that reads only like
> an excuse ;-)

We cannot do this, because the object store is directly
embedded into the the_repository (not as a pointer),
which is a global on its own. So we'd have to do this
at the repository level, which I would not want.
I'd want to have the #defines where the structs are declared.

Any guidance on how to do that correctly with self referential
definitions without making the object store a pointer then?


Re: [PATCH v3 23/35] fetch-pack: support shallow requests

2018-02-23 Thread Brandon Williams
On 02/23, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:00 -0800
> Brandon Williams  wrote:
> 
> > @@ -1090,6 +1110,10 @@ static int send_fetch_request(int fd_out, const 
> > struct fetch_pack_args *args,
> > if (prefer_ofs_delta)
> > packet_buf_write(_buf, "ofs-delta");
> >  
> > +   /* Add shallow-info and deepen request */
> > +   if (server_supports_feature("fetch", "shallow", 1))
> > +   add_shallow_requests(_buf, args);
> 
> One more thing I observed when trying to implement the server side in
> JGit - the last argument should be 0, not 1, right? I don't think that
> "shallow" should be required on the server unless we really need it.

Good catch, I'll fix that.

-- 
Brandon Williams


Re: [PATCH v4 08/13] commit-graph: implement --delete-expired

2018-02-23 Thread Junio C Hamano
Derrick Stolee  writes:

> My thought was that using a fixed name
> (e.g. .git/objects/info/commit-graph) would block making the graph
> incremental. Upon thinking again, this is not the case. That feature
> could be designed with a fixed name for the small, frequently-updated
> file and use .../info/graph-.graph names for the "base" graph
> files.

I guess that is in line with the way how split-index folks did their
thing ;-)


Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

2018-02-23 Thread Junio C Hamano
Stefan Beller  writes:

> I wonder if there is yet another missing case in the enumeration of
> the previous patch:
> Some commands are able to operate on GIT_OBJECT_DIR instead
> of GIT_DIR (git repack?), which may not even explore the full git directory,
> and so doesn't know about the hash value.

... because GIT_DIR/config is not known?  "repack" is not one of
them, though---it needs to at least use refs as the starting point
so a standalone OBJECT_DIR is insufficient.


Re: [PATCH v4 04/13] commit-graph: implement write_commit_graph()

2018-02-23 Thread Junio C Hamano
Junio C Hamano  writes:

>> I think the current name makes the following call very clear:
>
> It is still a strange name nevertheless.

Sorry for simply repeating "strange" without spelling out why in the
previous message.  This certainly is subjective and depends on your
cultural background, but in our codebase, I tried to name functions
after "what" they do and "why", rather than "how" they do so.  In a
way, it's the same kind of uneasiness I feel when I see variables
named in hungarian notation.

You would inspect the object and treat 'data' as a list and add to
the object if it is a commit, and if_packed_commit_add_to_list()
certainly is a name that describes all of that well, but does it
give readers a good answer when they wonder why the function is
doing so?  You described with the name of the function how it
collects commits that are in the pack, without explicitly saying
that you want to collect packed commits and that is why you are
inspecting for type and doing so only for commit (i.e.
"if_packed_commit" part of the name) and why you are adding to a
list.


  1   2   >