Re: [PATCH] mailmap: update brian m. carlson's email address

2018-05-07 Thread brian m. carlson
On Mon, May 07, 2018 at 12:37:05PM +0900, Junio C Hamano wrote:
> I initially reacted to "was reversed" with "yikes, did we break the
> mailmap reader and we need to update the file?", but apparently that
> is not what this patch is about.  I think what is happening here is
> that cdb6b5ac (".mailmap: Combine more (name, email) to individual
> persons", 2013-08-12) removed
> 
> -Brian M. Carlson <sand...@crustytoothpaste.ath.cx>
> 
> and then added these two lines
> 
> +brian m. carlson <sand...@crustytoothpaste.ath.cx> Brian M. Carlson 
> <sand...@crustytoothpaste.ath.cx>
> +brian m. carlson <sand...@crustytoothpaste.ath.cx> 
> <sand...@crustytoothpaste.net>
> 
> where *.net address did not come from any other entry for you in the
> file.  I guess the author of the patch saw that you were sending
> your messages from the .net address and tried to help by unifying
> the two addresses, without knowing your preference and recorded two
> reversed entries.
> 
> Will queue as-is for now, but if you want to update the log message
> I do not mind taking a reroll.

I can reroll with a less alarming commit message, sure.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] mailmap: update brian m. carlson's email address

2018-05-07 Thread Stefan Beller
On Sun, May 6, 2018 at 8:37 PM, Junio C Hamano <gits...@pobox.com> wrote:
> "brian m. carlson" <sand...@crustytoothpaste.net> writes:
>
>> The order of addresses in the mailmap file was reversed, leading to git
>> preferring the crustytoothpaste.ath.cx address, which is obsolete, over
>> the crustytoothpaste.net address, which is current.  Switch the order of
>> the addresses so that git log displays the correct address.
>>
>> Signed-off-by: brian m. carlson <sand...@crustytoothpaste.net>
>> ---
>
> I initially reacted to "was reversed" with "yikes, did we break the
> mailmap reader and we need to update the file?", but apparently that
> is not what this patch is about.  I think what is happening here is
> that cdb6b5ac (".mailmap: Combine more (name, email) to individual
> persons", 2013-08-12) removed
>
> -Brian M. Carlson <sand...@crustytoothpaste.ath.cx>
>
> and then added these two lines
>
> +brian m. carlson <sand...@crustytoothpaste.ath.cx> Brian M. Carlson 
> <sand...@crustytoothpaste.ath.cx>
> +brian m. carlson <sand...@crustytoothpaste.ath.cx> 
> <sand...@crustytoothpaste.net>
>
> where *.net address did not come from any other entry for you in the
> file.  I guess the author of the patch saw that you were sending
> your messages from the .net address and tried to help by unifying
> the two addresses, without knowing your preference and recorded two
> reversed entries.
>
> Will queue as-is for now, but if you want to update the log message
> I do not mind taking a reroll.

brian,

sorry to break you there. I was the author of the patch Junio found, organizing
the .mailmap file was one of my starter contributions. I recall asking all the
people if they were ok with it their names combined in different spellings
94b410bba86 (.mailmap: Map email addresses to names, 2013-07-12)
f4f49e2258a (.mailmap: Combine more (email, name) to individual
persons, 2013-07-14)
and I vaguley recall asking you about capitalization of your name there
and you preferred the lower case name, but apparently I never asked you
about the preferred email address.

Sorry and thanks for fixing,
Stefan


Re: [PATCH] mailmap: update brian m. carlson's email address

2018-05-06 Thread Junio C Hamano
"brian m. carlson" <sand...@crustytoothpaste.net> writes:

> The order of addresses in the mailmap file was reversed, leading to git
> preferring the crustytoothpaste.ath.cx address, which is obsolete, over
> the crustytoothpaste.net address, which is current.  Switch the order of
> the addresses so that git log displays the correct address.
>
> Signed-off-by: brian m. carlson <sand...@crustytoothpaste.net>
> ---

I initially reacted to "was reversed" with "yikes, did we break the
mailmap reader and we need to update the file?", but apparently that
is not what this patch is about.  I think what is happening here is
that cdb6b5ac (".mailmap: Combine more (name, email) to individual
persons", 2013-08-12) removed

-Brian M. Carlson <sand...@crustytoothpaste.ath.cx>

and then added these two lines

+brian m. carlson <sand...@crustytoothpaste.ath.cx> Brian M. Carlson 
<sand...@crustytoothpaste.ath.cx>
+brian m. carlson <sand...@crustytoothpaste.ath.cx> 
<sand...@crustytoothpaste.net>

where *.net address did not come from any other entry for you in the
file.  I guess the author of the patch saw that you were sending
your messages from the .net address and tried to help by unifying
the two addresses, without knowing your preference and recorded two
reversed entries.

Will queue as-is for now, but if you want to update the log message
I do not mind taking a reroll.

Thanks.


>  .mailmap | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/.mailmap b/.mailmap
> index 7c71e88ea5..df7cf6313c 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -25,8 +25,8 @@ Ben Walton <bdwal...@gmail.com> <bwal...@artsci.utoronto.ca>
>  Benoit Sigoure <tsuna...@gmail.com> <ts...@lrde.epita.fr>
>  Bernt Hansen <be...@norang.ca> <be...@alumni.uwaterloo.ca>
>  Brandon Casey <draf...@gmail.com> <ca...@nrlssc.navy.mil>
> -brian m. carlson <sand...@crustytoothpaste.ath.cx> Brian M. Carlson 
> <sand...@crustytoothpaste.ath.cx>
> -brian m. carlson <sand...@crustytoothpaste.ath.cx> 
> <sand...@crustytoothpaste.net>
> +brian m. carlson <sand...@crustytoothpaste.net> Brian M. Carlson 
> <sand...@crustytoothpaste.ath.cx>
> +brian m. carlson <sand...@crustytoothpaste.net> 
> <sand...@crustytoothpaste.ath.cx>
>  Bryan Larsen <br...@larsen.st> <bryan.lar...@gmail.com>
>  Bryan Larsen <br...@larsen.st> <bryanlar...@yahoo.com>
>  Cheng Renquan <crq...@gmail.com>


[PATCH] mailmap: update brian m. carlson's email address

2018-05-06 Thread brian m. carlson
The order of addresses in the mailmap file was reversed, leading to git
preferring the crustytoothpaste.ath.cx address, which is obsolete, over
the crustytoothpaste.net address, which is current.  Switch the order of
the addresses so that git log displays the correct address.

Signed-off-by: brian m. carlson 
---
 .mailmap | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index 7c71e88ea5..df7cf6313c 100644
--- a/.mailmap
+++ b/.mailmap
@@ -25,8 +25,8 @@ Ben Walton  
 Benoit Sigoure  
 Bernt Hansen  
 Brandon Casey  
-brian m. carlson  Brian M. Carlson 

-brian m. carlson  

+brian m. carlson  Brian M. Carlson 

+brian m. carlson  

 Bryan Larsen  
 Bryan Larsen  
 Cheng Renquan 


Re: git update-ref fails to create reference. (bug)

2018-05-05 Thread Rafael Ascensão
Thanks Martin for the quick fix.

On Fri, May 04, 2018 at 08:26:46PM +0200, Martin �gren wrote:
> Anyway, that's not where I'm stuck... Regardless of how I try to write
> tests (in t1400), they just pass beautifully even before this patch. I
> might be able to look into that more on the weekend. If anyone has
> ideas, I am all ears. Or if someone feels like picking this up and
> running with it, feel free.

In t1400 `m=refs/heads/master` is used in the majority of tests. And
this issue doesn't manifest itself if refs are being written under refs/

It also seems particular about having the "old sha" set to 40 zeros or
the empty string.

So I guess we should add some extra tests to cover the variations of
these two cases.

e.g.
 test_expect_success "create PSEUDOREF" '
 git update-ref PSEUDOREF $A
 &&
 test $A = $(cat .git/PSEUDOREF)
 '

fails/succeeds appropriately in my limited testing.

I am busy this weekend, but can try to write some if no one writes it
until after the weekend.

Cumprimentos,
Rafael Ascensão


Re: git update-ref fails to create reference. (bug)

2018-05-04 Thread Martin Ågren
Hi Rafael,

On 4 May 2018 at 18:28, Rafael Ascensão <rafa.al...@gmail.com> wrote:
> While trying to create a pseudo reference named REF pointing to the
> empty tree iff it doesn't exist, I stumbled on the following:
>
> I assume both are valid ways to create such reference:
> a) $ echo -e option no-deref\\nupdate REF $(git hash-object -t
> tree /dev/null) 0000 | git
> update-ref --stdin
> b) $ git update-ref --no-deref REF $(git hash-object -t tree
> /dev/null) 
>
> While a) works, b) will throw:
> fatal: could not read ref 'REF'


I can reproduce this and I agree with your understanding of what should
happen here. The patch below makes this work according to my and your
expectations, at least in my command-line testing.

The die("... already exists") could instead be a no-op, trusting that
the backend discovers the problem. "die" could also be strbuf_addf(...),
I'm just following 2c3aed138 here.

Anyway, that's not where I'm stuck... Regardless of how I try to write
tests (in t1400), they just pass beautifully even before this patch. I
might be able to look into that more on the weekend. If anyone has
ideas, I am all ears. Or if someone feels like picking this up and
running with it, feel free.

Martin

diff --git a/refs.c b/refs.c
index 8b7a77fe5e..cdb0a5ab29 100644
--- a/refs.c
+++ b/refs.c
@@ -666,9 +666,12 @@ static int write_pseudoref(const char *pseudoref, const 
struct object_id *oid,
if (old_oid) {
struct object_id actual_old_oid;
 
-   if (read_ref(pseudoref, _old_oid))
-   die("could not read ref '%s'", pseudoref);
-   if (oidcmp(_old_oid, old_oid)) {
+   if (read_ref(pseudoref, _old_oid)) {
+   if (!is_null_oid(old_oid))
+   die("could not read ref '%s'", pseudoref);
+   } else if (is_null_oid(old_oid)) {
+   die("reference '%s' already exists", pseudoref);
+   } else if (oidcmp(_old_oid, old_oid)) {
strbuf_addf(err, "unexpected sha1 when writing '%s'", 
pseudoref);
rollback_lock_file();
goto done;
-- 
2.17.0.392.g7fa371e468



git update-ref fails to create reference. (bug)

2018-05-04 Thread Rafael Ascensão
While trying to create a pseudo reference named REF pointing to the
empty tree iff it doesn't exist, I stumbled on the following:

I assume both are valid ways to create such reference:
a) $ echo -e option no-deref\\nupdate REF $(git hash-object -t tree 
/dev/null)  | git update-ref --stdin
b) $ git update-ref --no-deref REF $(git hash-object -t tree /dev/null) 


While a) works, b) will throw:
fatal: could not read ref 'REF'

Bisect seems to point to:
2c3aed138 (pseudoref: check return values from read_ref(), 2015-07-15)

Thanks,
Rafael Ascensão


Re: [PATCH 39/41] Update shell scripts to compute empty tree object ID

2018-05-03 Thread brian m. carlson
On Tue, May 01, 2018 at 12:42:57PM +0200, Duy Nguyen wrote:
> On Mon, Apr 23, 2018 at 11:39:49PM +, brian m. carlson wrote:
> > Several of our shell scripts hard-code the object ID of the empty tree.
> > To avoid any problems when changing hashes, compute this value on
> > startup of the script.  For performance, store the value in a variable
> > and reuse it throughout the life of the script.
> > 
> > Signed-off-by: brian m. carlson 
> > ---
> >  git-filter-branch.sh   | 4 +++-
> >  git-rebase--interactive.sh | 4 +++-
> >  templates/hooks--pre-commit.sample | 2 +-
> >  3 files changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> > index 64f21547c1..ccceaf19a7 100755
> > --- a/git-filter-branch.sh
> > +++ b/git-filter-branch.sh
> > @@ -11,6 +11,8 @@
> >  # The following functions will also be available in the commit filter:
> >  
> >  functions=$(cat << \EOF
> > +EMPTY_TREE=$(git hash-object -t tree /dev/null)
> 
> All scripts (except those example hooks) must source
> git-sh-setup. Should we define this in there instead?

I think at this point, I'm okay with special-casing these two uses, but
I would generally say that if we gain any more we should move it there.

There's a trade-off between the benefits of reuse here and the fact that
we're forking a process, which incurs a cost, especially on Windows.

I'm open to hearing other opinions, of course.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


[PATCH v2 1/6] sequencer: extract helper to update active_cache_tree

2018-05-03 Thread Johannes Schindelin
This patch extracts the code from is_index_unchanged() to initialize or
update the index' cache tree (i.e. a tree object reflecting the current
index' top-level tree).

The new helper will be used in the upcoming code to support `git rebase
-i --root` via the sequencer.

Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
---
 sequencer.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e2f83942843..90c8218aa9a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -562,9 +562,23 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
return !clean;
 }
 
+static struct object_id *get_cache_tree_oid(void)
+{
+   if (!active_cache_tree)
+   active_cache_tree = cache_tree();
+
+   if (!cache_tree_fully_valid(active_cache_tree))
+   if (cache_tree_update(_index, 0)) {
+   error(_("unable to update cache tree"));
+   return NULL;
+   }
+
+   return _cache_tree->oid;
+}
+
 static int is_index_unchanged(void)
 {
-   struct object_id head_oid;
+   struct object_id head_oid, *cache_tree_oid;
struct commit *head_commit;
 
if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, _oid, NULL))
@@ -583,15 +597,10 @@ static int is_index_unchanged(void)
if (parse_commit(head_commit))
return -1;
 
-   if (!active_cache_tree)
-   active_cache_tree = cache_tree();
-
-   if (!cache_tree_fully_valid(active_cache_tree))
-   if (cache_tree_update(_index, 0))
-   return error(_("unable to update cache tree"));
+   if (!(cache_tree_oid = get_cache_tree_oid()))
+   return -1;
 
-   return !oidcmp(_cache_tree->oid,
-  _commit->tree->object.oid);
+   return !oidcmp(cache_tree_oid, _commit->tree->object.oid);
 }
 
 static int write_author_script(const char *message)
-- 
2.17.0.windows.1.38.g05ca542f78d




[PATCH v4 3/3] submodule: add --dissociate option to add/update commands

2018-05-03 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.

Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..a85b30419 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate)
+   argv_array_push(, "--dissociate");
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1305,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1321,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissociate");
if (suc->depth)

[PATCH v4 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-05-02 Thread Ben Peart
Update the documentation to better indicate that the renameLimit setting is
ignored if rename detection is turned off via command line options or config
settings.

Signed-off-by: Ben Peart <benpe...@microsoft.com>
---
 Documentation/diff-config.txt  | 3 ++-
 Documentation/merge-config.txt | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 5ca942ab5e..77caa66c2f 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -112,7 +112,8 @@ diff.orderFile::
 
 diff.renameLimit::
The number of files to consider when performing the copy/rename
-   detection; equivalent to the 'git diff' option `-l`.
+   detection; equivalent to the 'git diff' option `-l`. This setting
+   has no effect if rename detection is turned off.
 
 diff.renames::
Whether and how Git detects renames.  If set to "false",
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 12b6bbf591..48ee3bce77 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
 merge.renameLimit::
The number of files to consider when performing rename detection
during a merge; if not specified, defaults to the value of
-   diff.renameLimit.
+   diff.renameLimit. This setting has no effect if rename detection
+   is turned off.
 
 merge.renormalize::
Tell Git that canonical representation of files in the
-- 
2.17.0.windows.1



Re: [PATCH v4 10/10] commit-graph.txt: update design document

2018-05-02 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/30/2018 7:32 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
[...]
>>>   - After computing and storing generation numbers, we must make graph
>>> walks aware of generation numbers to gain the performance benefits they
>>> enable. This will mostly be accomplished by swapping a 
>>> commit-date-ordered
>>> priority queue with one ordered by generation number. The following
>>> -  operations are important candidates:
>>> +  operation is an important candidate:
>>>   -- paint_down_to_common()
>>>   - 'log --topo-order'
>>
>> Another possible candidates:
>>
>> - remove_redundant() - see comment in previous patch
>> - still_interesting() - where Git uses date slop to stop walking
>>   too far
>
> remove_redundant() will be included in v5, thanks.

Oh.  Nice.

I'll try to review the new patch in detail soon.

> Instead of "still_interesting()" I'll add "git tag --merged" as the
> candidate to consider, as discussed in [1].
>
> [1] https://public-inbox.org/git/87fu3g67ry@lant.ki.iif.hu/t/#u
>     "branch --contains / tag --merged inconsistency"

All right.  I have mentioned still_interesting() as a hint where
possible additional generation numbers based optimization may lurk
(because that's where heuristic based on dates is used - similarly to
how it was done in this series with paint_down_to_common()).

[...]
>> One important issue left is handling features that change view of
>> project history, and their interaction with commit-graph feature.
>>
>> What would happen, if we turn on commit-graph feature, generate commit
>> graph file, and then:
>>
>>* use graft file or remove graft entries to cut history, or remove cut
>>  or join two [independent] histories.
>>* use git-replace mechanims to do the same
>>* in shallow clone, deepen or shorten the clone
>>
>> What would happen if without re-generating commit-graph file (assuming
>> tha Git wouldn't do it for us), we run some feature that makes use of
>> commit-graph data:
>>
>>- git branch --contains
>>- git tag --contains
>>- git rev-list A..B
>>
>
> The commit-graph is not supported in these scenarios (yet). grafts are
> specifically mentioned in the future work section.
>
> I'm not particularly interested in supporting these features, so they
> are good venues for other contributors to get involved in the
> commit-graph feature. Eventually, they will be blockers to making the
> commit-graph feature a "default" feature. That is when I will pay
> attention to these situations. For now, a user must opt-in to having a
> commit-graph file (and that same user has possibly opted in to these
> history modifying features).

Well, that is sensible approach.  Get commit-graph features in working
condition, and worry about beng able to make it on by default later.

Nice to have it clarified.  I'll stop nagging about that, then ;-P

One issue: 'grafts' are mentioned in the future work section of the
technical documentation, but we don't have *any* warning about
commit-graph limitations in user-facing documentation, that is
git-commit-graph(1) manpage.

Best,
-- 
Jakub Narębski


[PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.

Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..a85b30419 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate)
+   argv_array_push(, "--dissociate");
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1305,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1321,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissociate");
if (suc->depth)

Re: [PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Just noticed I missed the other 'test_must_fail'. Resubmitting in a few moments.

On Tue, May 1, 2018 at 8:27 PM, Casey Fitzpatrick <kcgh...@gmail.com> wrote:
> Add --dissociate option to add and update commands, both clone helper commands
> that already have the --reference option --dissociate pairs with.
>
> Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
> ---
>  Documentation/git-submodule.txt | 10 +-
>  builtin/submodule--helper.c | 16 +---
>  git-submodule.sh| 10 +-
>  t/t7408-submodule-reference.sh  | 17 +
>  4 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index d1ebe32e8..a75b95043 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -369,7 +369,15 @@ the submodule itself.
> this option will be passed to the linkgit:git-clone[1] command.
>  +
>  *NOTE*: Do *not* use this option unless you have read the note
> -for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
> +for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
> +options carefully.
> +
> +--dissociate::
> +   This option is only valid for add and update commands.  These
> +   commands sometimes need to clone a remote repository. In this case,
> +   this option will be passed to the linkgit:git-clone[1] command.
> ++
> +*NOTE*: see the NOTE for the `--reference` option.
>
>  --recursive::
> This option is only valid for foreach, update, status and sync 
> commands.
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 6ba8587b6..a85b30419 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
> const char *prefix)
>  }
>
>  static int clone_submodule(const char *path, const char *gitdir, const char 
> *url,
> -  const char *depth, struct string_list *reference,
> +  const char *depth, struct string_list *reference, 
> int dissociate,
>int quiet, int progress)
>  {
> struct child_process cp = CHILD_PROCESS_INIT;
> @@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
> *gitdir, const char *url
> argv_array_pushl(, "--reference",
>  item->string, NULL);
> }
> +   if (dissociate)
> +   argv_array_push(, "--dissociate");
> if (gitdir && *gitdir)
> argv_array_pushl(, "--separate-git-dir", gitdir, 
> NULL);
>
> @@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
> char *p, *path = NULL, *sm_gitdir;
> struct strbuf sb = STRBUF_INIT;
> struct string_list reference = STRING_LIST_INIT_NODUP;
> +   int dissociate = 0;
> char *sm_alternate = NULL, *error_strategy = NULL;
>
> struct option module_clone_options[] = {
> @@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
> OPT_STRING_LIST(0, "reference", ,
>N_("repo"),
>N_("reference repository")),
> +   OPT_BOOL(0, "dissociate", ,
> +  N_("use --reference only while cloning")),
> OPT_STRING(0, "depth", ,
>N_("string"),
>N_("depth for shallow clones")),
> @@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
>
> prepare_possible_alternates(name, );
>
> -   if (clone_submodule(path, sm_gitdir, url, depth, ,
> +   if (clone_submodule(path, sm_gitdir, url, depth, , 
> dissociate,
> quiet, progress))
> die(_("clone of '%s' into submodule path '%s' 
> failed"),
> url, path);
> @@ -1300,6 +1305,7 @@ struct submodule_update_clone {
> int quiet;
> int recommend_shallow;
> struct string_list references;
> +   int dissociate;
> const char *depth;
> const char *recursive_prefix;
> const char *prefix;
> @@ -1315,7 +1321,7 @@ struct submodule_update_clone {
> int failed_clones_nr, failed_clones_alloc;
>  };
>  #defin

[PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.

Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..a85b30419 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate)
+   argv_array_push(, "--dissociate");
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1305,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1321,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissociate");
if (suc->depth)

[PATCH v2 16/42] Update struct index_state to use struct object_id

2018-05-01 Thread brian m. carlson
Adjust struct index_state to use struct object_id instead of unsigned
char [20].

Signed-off-by: brian m. carlson 
---
 cache.h  |  2 +-
 read-cache.c | 16 
 t/helper/test-dump-split-index.c |  2 +-
 unpack-trees.c   |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index f06737fb78..37d081b8e4 100644
--- a/cache.h
+++ b/cache.h
@@ -324,7 +324,7 @@ struct index_state {
 drop_cache_tree : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
-   unsigned char sha1[20];
+   struct object_id oid;
struct untracked_cache *untracked;
uint64_t fsmonitor_last_update;
struct ewah_bitmap *fsmonitor_dirty;
diff --git a/read-cache.c b/read-cache.c
index f47666b975..9dbaeeec43 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1806,7 +1806,7 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
-   hashcpy(istate->sha1, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
+   hashcpy(istate->oid.hash, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1902,10 +1902,10 @@ int read_index_from(struct index_state *istate, const 
char *path,
base_oid_hex = oid_to_hex(_index->base_oid);
base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
ret = do_read_index(split_index->base, base_path, 1);
-   if (hashcmp(split_index->base_oid.hash, split_index->base->sha1))
+   if (oidcmp(_index->base_oid, _index->base->oid))
die("broken index, expect %s in %s, got %s",
base_oid_hex, base_path,
-   sha1_to_hex(split_index->base->sha1));
+   oid_to_hex(_index->base->oid));
 
freshen_shared_index(base_path, 0);
merge_base_index(istate);
@@ -2194,7 +2194,7 @@ static int verify_index_from(const struct index_state 
*istate, const char *path)
if (n != the_hash_algo->rawsz)
goto out;
 
-   if (hashcmp(istate->sha1, hash))
+   if (hashcmp(istate->oid.hash, hash))
goto out;
 
close(fd);
@@ -2373,7 +2373,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
return -1;
}
 
-   if (ce_flush(, newfd, istate->sha1))
+   if (ce_flush(, newfd, istate->oid.hash))
return -1;
if (close_tempfile_gently(tempfile)) {
error(_("could not close '%s'"), tempfile->filename.buf);
@@ -2497,10 +2497,10 @@ static int write_shared_index(struct index_state 
*istate,
return ret;
}
ret = rename_tempfile(temp,
- git_path("sharedindex.%s", 
sha1_to_hex(si->base->sha1)));
+ git_path("sharedindex.%s", 
oid_to_hex(>base->oid)));
if (!ret) {
-   hashcpy(si->base_oid.hash, si->base->sha1);
-   clean_shared_index_files(sha1_to_hex(si->base->sha1));
+   oidcpy(>base_oid, >base->oid);
+   clean_shared_index_files(oid_to_hex(>base->oid));
}
 
return ret;
diff --git a/t/helper/test-dump-split-index.c b/t/helper/test-dump-split-index.c
index 754e9bb624..63c689d6ee 100644
--- a/t/helper/test-dump-split-index.c
+++ b/t/helper/test-dump-split-index.c
@@ -14,7 +14,7 @@ int cmd__dump_split_index(int ac, const char **av)
int i;
 
do_read_index(_index, av[1], 1);
-   printf("own %s\n", sha1_to_hex(the_index.sha1));
+   printf("own %s\n", oid_to_hex(_index.oid));
si = the_index.split_index;
if (!si) {
printf("not a split index\n");
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..038ef7b926 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1287,7 +1287,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
o->result.split_index = o->src_index->split_index;
if (o->result.split_index)
o->result.split_index->refcount++;
-   hashcpy(o->result.sha1, o->src_index->sha1);
+   oidcpy(>result.oid, >src_index->oid);
o->merge_size = len;
mark_all_ce_unused(o->src_index);
 


[PATCH v2 40/42] Update shell scripts to compute empty tree object ID

2018-05-01 Thread brian m. carlson
Several of our shell scripts hard-code the object ID of the empty tree.
To avoid any problems when changing hashes, compute this value on
startup of the script.  For performance, store the value in a variable
and reuse it throughout the life of the script.

Signed-off-by: brian m. carlson 
---
 git-filter-branch.sh   | 4 +++-
 git-rebase--interactive.sh | 4 +++-
 templates/hooks--pre-commit.sample | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 64f21547c1..ccceaf19a7 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -11,6 +11,8 @@
 # The following functions will also be available in the commit filter:
 
 functions=$(cat << \EOF
+EMPTY_TREE=$(git hash-object -t tree /dev/null)
+
 warn () {
echo "$*" >&2
 }
@@ -46,7 +48,7 @@ git_commit_non_empty_tree()
 {
if test $# = 3 && test "$1" = $(git rev-parse "$3^{tree}"); then
map "$3"
-   elif test $# = 1 && test "$1" = 
4b825dc642cb6eb9a060e54bf8d69288fbee4904; then
+   elif test $# = 1 && test "$1" = $EMPTY_TREE; then
:
else
git commit-tree "$@"
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 9947e6265f..e5ae58a07e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -81,6 +81,8 @@ rewritten_pending="$state_dir"/rewritten-pending
 # and leaves CR at the end instead.
 cr=$(printf "\015")
 
+empty_tree=$(git hash-object -t tree /dev/null)
+
 strategy_args=${strategy:+--strategy=$strategy}
 test -n "$strategy_opts" &&
 eval '
@@ -238,7 +240,7 @@ is_empty_commit() {
die "$(eval_gettext "\$sha1: not a commit that can be picked")"
}
ptree=$(git rev-parse -q --verify "$1"^^{tree} 2>/dev/null) ||
-   ptree=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   ptree=$empty_tree
test "$tree" = "$ptree"
 }
 
diff --git a/templates/hooks--pre-commit.sample 
b/templates/hooks--pre-commit.sample
index 68d62d5446..6a75641638 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -12,7 +12,7 @@ then
against=HEAD
 else
# Initial commit: diff against an empty tree object
-   against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   against=$(git hash-object -t tree /dev/null)
 fi
 
 # If you want to allow non-ASCII filenames set this variable to true.


Re: [PATCH 2/2] submodule: Add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Thanks, I will clean up the braces and commit message.

I have to disagree with the 's/reference/dissociate/' comments. It
appears this section of option descriptions mostly copies from the
descriptions given by 'git clone -h', which outputs:
--reference reference repository
--reference-if-able 
  reference repository
--dissociate  use --reference only while cloning
It is perhaps not the best description, but it means to say when
--dissociate is used --reference is only in play for reducing network
transfer, not keeping alternates.

Those expansions *are* certainly off-putting, they make a long line
even more difficult to parse. I just searched that file for ":+" for
those types of expressions and I think a patch to fix them would be
fairly short. I'll look into making that cleanup patch.



On Tue, May 1, 2018 at 4:25 PM, Eric Sunshine <sunsh...@sunshineco.com> wrote:
> On Tue, May 1, 2018 at 2:09 PM, Casey Fitzpatrick <kcgh...@gmail.com> wrote:
>> submodule: Add --dissociate option to add/update commands
>
> s/Add/add/
>
>> Add --dissociate option to add and update commands, both clone helper 
>> commands
>> that already have the --reference option --dissociate pairs with.
>> Add documentation.
>> Add tests.
>>
>> Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
>> ---
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> @@ -1075,6 +1075,9 @@ static int clone_submodule(const char *path, const 
>> char *gitdir, const char *url
>> +   if (dissociate) {
>> +   argv_array_push(, "--dissociate");
>> +   }
>
> Style: drop unnecessary braces
>
>> @@ -1208,6 +1212,8 @@ static int module_clone(int argc, const char **argv, 
>> const char *prefix)
>> +   OPT_BOOL(0, "dissociate", ,
>> +  N_("use --reference only while cloning")),
>
> s/reference/dissociate/
>
>> @@ -1575,6 +1584,8 @@ static int update_clone(int argc, const char **argv, 
>> const char *prefix)
>> +   OPT_BOOL(0, "dissociate", ,
>> +  N_("use --reference only while cloning")),
>
> s/reference/dissociate/
>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> +   --dissociate)
>> +   dissociate="--dissociate"
>> @@ -258,7 +262,7 @@ or you are unsure what this means choose another name 
>> with the '--name' option."
>> -   git submodule--helper clone ${GIT_QUIET:+--quiet} 
>> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
>> "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${depth:+"$depth"} 
>> || exit
>> +   git submodule--helper clone ${GIT_QUIET:+--quiet} 
>> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
>> "$sm_name" --url "$realrepo" ${reference:+"$reference"} 
>> ${dissociate:+"$dissociate"} ${depth:+"$depth"} || exit
>
> I realize that you're just following existing practice in this script,
> but it's a bit off-putting to see expansions for the new --progress
> and --dissociate options being done via unnecessarily complex
> ${foobar:+"$foobar"} when the simpler $foobar would work just as well.
>
> Just a comment; not necessarily a request for change. (A separate
> preparatory cleanup patch which simplifies the existing complex
> expansion expressions would be welcome but could also be considered
> outside the scope of this patch series.)
>
>> @@ -493,6 +497,9 @@ cmd_update()
>> +   --dissociate)
>> +   dissociate="--dissociate"
>> +   ;;
>> @@ -550,6 +557,7 @@ cmd_update()
>> ${reference:+"$reference"} \
>> +   ${dissociate:+"$dissociate"} \
>> ${depth:+--depth "$depth"} \


Re: [PATCH 2/2] submodule: Add --dissociate option to add/update commands

2018-05-01 Thread Eric Sunshine
On Tue, May 1, 2018 at 2:09 PM, Casey Fitzpatrick <kcgh...@gmail.com> wrote:
> submodule: Add --dissociate option to add/update commands

s/Add/add/

> Add --dissociate option to add and update commands, both clone helper commands
> that already have the --reference option --dissociate pairs with.
> Add documentation.
> Add tests.
>
> Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
> ---
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> @@ -1075,6 +1075,9 @@ static int clone_submodule(const char *path, const char 
> *gitdir, const char *url
> +   if (dissociate) {
> +   argv_array_push(, "--dissociate");
> +   }

Style: drop unnecessary braces

> @@ -1208,6 +1212,8 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
> +   OPT_BOOL(0, "dissociate", ,
> +  N_("use --reference only while cloning")),

s/reference/dissociate/

> @@ -1575,6 +1584,8 @@ static int update_clone(int argc, const char **argv, 
> const char *prefix)
> +   OPT_BOOL(0, "dissociate", ,
> +  N_("use --reference only while cloning")),

s/reference/dissociate/

> diff --git a/git-submodule.sh b/git-submodule.sh
> +   --dissociate)
> +   dissociate="--dissociate"
> @@ -258,7 +262,7 @@ or you are unsure what this means choose another name 
> with the '--name' option."
> -   git submodule--helper clone ${GIT_QUIET:+--quiet} 
> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
> "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${depth:+"$depth"} || 
> exit
> +   git submodule--helper clone ${GIT_QUIET:+--quiet} 
> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
> "$sm_name" --url "$realrepo" ${reference:+"$reference"} 
> ${dissociate:+"$dissociate"} ${depth:+"$depth"} || exit

I realize that you're just following existing practice in this script,
but it's a bit off-putting to see expansions for the new --progress
and --dissociate options being done via unnecessarily complex
${foobar:+"$foobar"} when the simpler $foobar would work just as well.

Just a comment; not necessarily a request for change. (A separate
preparatory cleanup patch which simplifies the existing complex
expansion expressions would be welcome but could also be considered
outside the scope of this patch series.)

> @@ -493,6 +497,9 @@ cmd_update()
> +   --dissociate)
> +   dissociate="--dissociate"
> +   ;;
> @@ -550,6 +557,7 @@ cmd_update()
> ${reference:+"$reference"} \
> +   ${dissociate:+"$dissociate"} \
> ${depth:+--depth "$depth"} \


Re: [PATCH 2/2] submodule: Add --dissociate option to add/update commands

2018-05-01 Thread Stefan Beller
On Tue, May 1, 2018 at 11:09 AM, Casey Fitzpatrick  wrote:
>
> +test_expect_success 'submodule add --reference with --dissociate doesnt use 
> alternates' '
> +   (
> +   cd super &&
> +   git submodule add --reference ../B --dissociate 
> "file://$base_dir/A" sub-dissociate &&
> +   git commit -m B-super-added &&
> +   git repack -ad
> +   ) &&
> +   test_must_fail test_alternate_is_used 
> super/.git/modules/sub-dissociate/objects/info/alternates super/sub-dissociate

We do already have the unfortunate precedent of using "test_must_fail
test_alternate_is_used"
(and it was partially included by me in 31224cbdc72 (clone: recursive
and reference option
triggers submodule alternates, 2016-08-17), further used in
bf03b790471 (submodule--helper:
set alternateLocation for cloned submodules, 2016-12-08).

I think it is the wrong approach though, as the test_must_fail only ensure that
*something* doesn't return success. Usually we use it with
"test_must_fail git ..."
to ensure we properly error out on user requests, that are impossible
to fulfill in
the current implementation.

For this test case I'd suggest we rather test that the no alternate is setup,

test_path_is_missing
super/.git/modules/sub-dissociate/objects/info/alternates

as that tests the correctness of the --dissociate option?

Thanks,
Stefan


[PATCH 2/2] submodule: Add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.
Add documentation.
Add tests.

Signed-off-by: Casey Fitzpatrick <kcgh...@gmail.com>
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 17 ++---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..655f84f3c 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,9 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate) {
+   argv_array_push(, "--dissociate");
+   }
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1193,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1212,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1253,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1306,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1322,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1449,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissoc

[PATCH v5 11/11] commit-graph.txt: update design document

2018-05-01 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Expand the section on generation numbers to discuss how the three
special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
_MAX interact with other generation numbers.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 29 
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..e1a883eb46 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
"infinite"
 generation number and walk until reaching commits with known generation
 number.
 
+We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+If A and B are commits with generation numbers N amd M, respectively,
+and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits, but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.
+
+We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
+generation numbers are computed to be at least this value. We limit at
+this value since it is the largest value that can be stored in the
+commit-graph file using the 30 bits available to generation numbers. This
+presents another case where a commit can have generation number equal to
+that of a parent.
+
 Design Details
 --
 
@@ -98,18 +121,14 @@ Future Work
 - The 'commit-graph' subcommand does not have a "verify" mode that is
   necessary for integration with fsck.
 
-- The file format includes room for precomputed generation numbers. These
-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
   priority queue with one ordered by generation number. The following
   operations are important candidates:
 
-- paint_down_to_common()
 - 'log --topo-order'
+- 'tag --merged'
 
 - Currently, parse_commit_gently() requires filling in the root tree
   object for a commit. This passes through lookup_tree() and consequently
-- 
2.17.0.39.g685157f7fb



Re: [PATCH v4 10/10] commit-graph.txt: update design document

2018-05-01 Thread Derrick Stolee

On 4/30/2018 7:32 PM, Jakub Narebski wrote:

Derrick Stolee  writes:


We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Expand the section on generation numbers to discuss how the three
special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
_MAX interact with other generation numbers.

Signed-off-by: Derrick Stolee 

Looks good.


---
  Documentation/technical/commit-graph.txt | 30 +++-
  1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..d9f2713efa 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
"infinite"
  generation number and walk until reaching commits with known generation
  number.
  
+We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not

+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+If A and B are commits with generation numbers N amd M, respectively,
+and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits,

The linux kernel commit graph has maximum of 513 commits sharing the
same generation number, but is is 5.43 commits sharing the same
generation number on average, with standard deviation 10.70; median is
even lower: it is 2, with 5.35 median absolute deviation (MAD).

So on average it would be a few extra commits.  Right.


   but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.

As I wrote before, handling those corner cases in more complicated, but
not that complicated.  We could simply use stronger condition if both
generation numbers are ordinary generation numbers, and weaker condition
when at least one generation number has one of those special values.


+
+We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
+generation numbers are computed to be at least this value. We limit at
+this value since it is the largest value that can be stored in the
+commit-graph file using the 30 bits available to generation numbers. This
+presents another case where a commit can have generation number equal to
+that of a parent.

Ordinary generation numbers, where stronger condition holds, are those
between GENERATION_NUMBER_ZERO < gen(C) < GENERATION_NUMBER_MAX.


+
  Design Details
  --
  
@@ -98,17 +121,12 @@ Future Work

  - The 'commit-graph' subcommand does not have a "verify" mode that is
necessary for integration with fsck.
  
-- The file format includes room for precomputed generation numbers. These

-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-

Good.


  - After computing and storing generation numbers, we must make graph
walks aware of generation numbers to gain the performance benefits they
enable. This will mostly be accomplished by swapping a commit-date-ordered
priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:
  
-- paint_down_to_common()

  - 'log --topo-order'

Another possible candidates:

- remove_redundant() - see comment in previous patch
- still_interesting() - where Git uses date slop to stop walking
  too far


remove_redundant() will be included in v5, thanks.

Instead of "still_interesting()" I'll add "git tag --merged" as the 
candidate to consider, as discussed in [1].


[1] https://public-inbox.org/git/87fu3g67ry@lant.ki.iif.hu/t/#u
    "branch --contains / tag --merged inconsistency"



  
  - Currently, parse_commit_gently() requires filling in the root tree

One important issue left is handling features that change view of
project history, and their interaction with commit-graph feature.

What would happen, if we turn on commit-graph feature, generate commit
graph file, and then:

   * use graft file or remove graft entries to cut history, or remove cut
 or join two [independent] histories.
   * use git-replace mechanims to do the same
   * in shallow clone, deepen or shorten the clone

What would happen if without re-generating commit-graph file (assuming
tha Git wouldn't do it for us), we run some feature that makes use of
commit-graph data:

   - git branch 

Re: [PATCH 39/41] Update shell scripts to compute empty tree object ID

2018-05-01 Thread Duy Nguyen
On Mon, Apr 23, 2018 at 11:39:49PM +, brian m. carlson wrote:
> Several of our shell scripts hard-code the object ID of the empty tree.
> To avoid any problems when changing hashes, compute this value on
> startup of the script.  For performance, store the value in a variable
> and reuse it throughout the life of the script.
> 
> Signed-off-by: brian m. carlson 
> ---
>  git-filter-branch.sh   | 4 +++-
>  git-rebase--interactive.sh | 4 +++-
>  templates/hooks--pre-commit.sample | 2 +-
>  3 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 64f21547c1..ccceaf19a7 100755
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -11,6 +11,8 @@
>  # The following functions will also be available in the commit filter:
>  
>  functions=$(cat << \EOF
> +EMPTY_TREE=$(git hash-object -t tree /dev/null)

All scripts (except those example hooks) must source
git-sh-setup. Should we define this in there instead?
--
Duy


Re: [PATCH v4 10/10] commit-graph.txt: update design document

2018-04-30 Thread Jakub Narebski
Derrick Stolee  writes:

> We now calculate generation numbers in the commit-graph file and use
> them in paint_down_to_common().
>
> Expand the section on generation numbers to discuss how the three
> special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
> _MAX interact with other generation numbers.
>
> Signed-off-by: Derrick Stolee 

Looks good.

> ---
>  Documentation/technical/commit-graph.txt | 30 +++-
>  1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt 
> b/Documentation/technical/commit-graph.txt
> index 0550c6d0dc..d9f2713efa 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
> "infinite"
>  generation number and walk until reaching commits with known generation
>  number.
>  
> +We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
> +in the commit-graph file. If a commit-graph file was written by a version
> +of Git that did not compute generation numbers, then those commits will
> +have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
> +
> +Since the commit-graph file is closed under reachability, we can guarantee
> +the following weaker condition on all commits:
> +
> +If A and B are commits with generation numbers N amd M, respectively,
> +and N < M, then A cannot reach B.
> +
> +Note how the strict inequality differs from the inequality when we have
> +fully-computed generation numbers. Using strict inequality may result in
> +walking a few extra commits,

The linux kernel commit graph has maximum of 513 commits sharing the
same generation number, but is is 5.43 commits sharing the same
generation number on average, with standard deviation 10.70; median is
even lower: it is 2, with 5.35 median absolute deviation (MAD).

So on average it would be a few extra commits.  Right.

>   but the simplicity in dealing with commits
> +with generation number *_INFINITY or *_ZERO is valuable.

As I wrote before, handling those corner cases in more complicated, but
not that complicated.  We could simply use stronger condition if both
generation numbers are ordinary generation numbers, and weaker condition
when at least one generation number has one of those special values.

> +
> +We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
> +generation numbers are computed to be at least this value. We limit at
> +this value since it is the largest value that can be stored in the
> +commit-graph file using the 30 bits available to generation numbers. This
> +presents another case where a commit can have generation number equal to
> +that of a parent.

Ordinary generation numbers, where stronger condition holds, are those
between GENERATION_NUMBER_ZERO < gen(C) < GENERATION_NUMBER_MAX.

> +
>  Design Details
>  --
>  
> @@ -98,17 +121,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>necessary for integration with fsck.
>  
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -

Good.

>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
>priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:
>  
> -- paint_down_to_common()
>  - 'log --topo-order'

Another possible candidates:

   - remove_redundant() - see comment in previous patch
   - still_interesting() - where Git uses date slop to stop walking
 too far

>  
>  - Currently, parse_commit_gently() requires filling in the root tree

One important issue left is handling features that change view of
project history, and their interaction with commit-graph feature.

What would happen, if we turn on commit-graph feature, generate commit
graph file, and then:

  * use graft file or remove graft entries to cut history, or remove cut
or join two [independent] histories.
  * use git-replace mechanims to do the same
  * in shallow clone, deepen or shorten the clone

What would happen if without re-generating commit-graph file (assuming
tha Git wouldn't do it for us), we run some feature that makes use of
commit-graph data:

  - git branch --contains
  - git tag --contains
  - git rev-list A..B

Best,
-- 
Jakub Narębski


Re: [PATCH 1/6] sequencer: extract helper to update active_cache_tree

2018-04-28 Thread Stefan Beller
Hi Johannes,

On Fri, Apr 27, 2018 at 3:30 PM, Johannes Schindelin
<johannes.schinde...@gmx.de> wrote:
> This patch extracts the code from is_index_unchanged() to initialize or
> update the index' cache tree (i.e. a tree object reflecting the current
> index' top-level tree).
>
> The new helper will be used in the upcoming code to support `git rebase
> -i --root` via the sequencer.
>
> Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
> ---
>  sequencer.c | 27 ++-
>  1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index e2f83942843..90c8218aa9a 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -562,9 +562,23 @@ static int do_recursive_merge(struct commit *base, 
> struct commit *next,
> return !clean;
>  }
>
> +static struct object_id *get_cache_tree_oid(void)
> +{
> +   if (!active_cache_tree)
> +   active_cache_tree = cache_tree();
> +
> +   if (!cache_tree_fully_valid(active_cache_tree))
> +   if (cache_tree_update(_index, 0)) {
> +   error(_("unable to update cache tree"));
> +   return NULL;
> +   }

This move is a verbatim move of the code below, except that
we need to add braces. For some reason I fantasize of using
the comma operator in C eventually (which we do not use in
our code base AFAICT), then we could leave out the braces
and have a

return error(...), NULL;

Anyway, this patch is
Reviewed-by: Stefan Beller <sbel...@google.com>


[PATCH 1/6] sequencer: extract helper to update active_cache_tree

2018-04-27 Thread Johannes Schindelin
This patch extracts the code from is_index_unchanged() to initialize or
update the index' cache tree (i.e. a tree object reflecting the current
index' top-level tree).

The new helper will be used in the upcoming code to support `git rebase
-i --root` via the sequencer.

Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
---
 sequencer.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e2f83942843..90c8218aa9a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -562,9 +562,23 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
return !clean;
 }
 
+static struct object_id *get_cache_tree_oid(void)
+{
+   if (!active_cache_tree)
+   active_cache_tree = cache_tree();
+
+   if (!cache_tree_fully_valid(active_cache_tree))
+   if (cache_tree_update(_index, 0)) {
+   error(_("unable to update cache tree"));
+   return NULL;
+   }
+
+   return _cache_tree->oid;
+}
+
 static int is_index_unchanged(void)
 {
-   struct object_id head_oid;
+   struct object_id head_oid, *cache_tree_oid;
struct commit *head_commit;
 
if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, _oid, NULL))
@@ -583,15 +597,10 @@ static int is_index_unchanged(void)
if (parse_commit(head_commit))
return -1;
 
-   if (!active_cache_tree)
-   active_cache_tree = cache_tree();
-
-   if (!cache_tree_fully_valid(active_cache_tree))
-   if (cache_tree_update(_index, 0))
-   return error(_("unable to update cache tree"));
+   if (!(cache_tree_oid = get_cache_tree_oid()))
+   return -1;
 
-   return !oidcmp(_cache_tree->oid,
-  _commit->tree->object.oid);
+   return !oidcmp(cache_tree_oid, _commit->tree->object.oid);
 }
 
 static int write_author_script(const char *message)
-- 
2.17.0.windows.1.33.gfcbb1fa0445




Re: [PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Jonathan Tan
On Thu, 26 Apr 2018 16:11:50 -0700
Elijah Newren  wrote:

> Patch looks fine, but it's hard for me not to notice a separate issue
> in this area independent of your series: I'm curious if we should
> document that the value of 0 is special here (as per Jonathan Tan's
> commit 89973554b52c ("diffcore-rename: make diff-tree -l0 mean
> -l", 2017-11-29)), and doesn't actually drop the limit to 0.
> cc'ing Jonathan Tan for his thoughts.

Documenting that the value of 0 is special does make sense to me. I
think this patch can go in as-is, though - it is already an improvement.


Re: [PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart <ben.pe...@microsoft.com> wrote:
> Update the documentation to better indicate that the renameLimit setting is
> ignored if rename detection is turned off via command line options or config
> settings.
>
> Signed-off-by: Ben Peart <benpe...@microsoft.com>
> ---
>  Documentation/diff-config.txt  | 3 ++-
>  Documentation/merge-config.txt | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
> index 5ca942ab5e..77caa66c2f 100644
> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -112,7 +112,8 @@ diff.orderFile::
>
>  diff.renameLimit::
> The number of files to consider when performing the copy/rename
> -   detection; equivalent to the 'git diff' option `-l`.
> +   detection; equivalent to the 'git diff' option `-l`. This setting
> +   has no effect if rename detection is turned off.
>
>  diff.renames::
> Whether and how Git detects renames.  If set to "false",
> diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
> index 12b6bbf591..48ee3bce77 100644
> --- a/Documentation/merge-config.txt
> +++ b/Documentation/merge-config.txt
> @@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
>  merge.renameLimit::
> The number of files to consider when performing rename detection
> during a merge; if not specified, defaults to the value of
> -   diff.renameLimit.
> +   diff.renameLimit. This setting has no effect if rename detection
> +   is turned off.
>
>  merge.renormalize::
> Tell Git that canonical representation of files in the
> --
> 2.17.0.windows.1

Patch looks fine, but it's hard for me not to notice a separate issue
in this area independent of your series: I'm curious if we should
document that the value of 0 is special here (as per Jonathan Tan's
commit 89973554b52c ("diffcore-rename: make diff-tree -l0 mean
-l", 2017-11-29)), and doesn't actually drop the limit to 0.
cc'ing Jonathan Tan for his thoughts.


[PATCH v3 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-04-26 Thread Ben Peart
Update the documentation to better indicate that the renameLimit setting is
ignored if rename detection is turned off via command line options or config
settings.

Signed-off-by: Ben Peart <benpe...@microsoft.com>
---
 Documentation/diff-config.txt  | 3 ++-
 Documentation/merge-config.txt | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 5ca942ab5e..77caa66c2f 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -112,7 +112,8 @@ diff.orderFile::
 
 diff.renameLimit::
The number of files to consider when performing the copy/rename
-   detection; equivalent to the 'git diff' option `-l`.
+   detection; equivalent to the 'git diff' option `-l`. This setting
+   has no effect if rename detection is turned off.
 
 diff.renames::
Whether and how Git detects renames.  If set to "false",
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 12b6bbf591..48ee3bce77 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
 merge.renameLimit::
The number of files to consider when performing rename detection
during a merge; if not specified, defaults to the value of
-   diff.renameLimit.
+   diff.renameLimit. This setting has no effect if rename detection
+   is turned off.
 
 merge.renormalize::
Tell Git that canonical representation of files in the
-- 
2.17.0.windows.1



[PATCH v4 10/10] commit-graph.txt: update design document

2018-04-25 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Expand the section on generation numbers to discuss how the three
special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
_MAX interact with other generation numbers.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 30 +++-
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..d9f2713efa 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
"infinite"
 generation number and walk until reaching commits with known generation
 number.
 
+We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+If A and B are commits with generation numbers N amd M, respectively,
+and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits, but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.
+
+We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
+generation numbers are computed to be at least this value. We limit at
+this value since it is the largest value that can be stored in the
+commit-graph file using the 30 bits available to generation numbers. This
+presents another case where a commit can have generation number equal to
+that of a parent.
+
 Design Details
 --
 
@@ -98,17 +121,12 @@ Future Work
 - The 'commit-graph' subcommand does not have a "verify" mode that is
   necessary for integration with fsck.
 
-- The file format includes room for precomputed generation numbers. These
-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
   priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:
 
-- paint_down_to_common()
 - 'log --topo-order'
 
 - Currently, parse_commit_gently() requires filling in the root tree
-- 
2.17.0.39.g685157f7fb



[PATCH 39/41] Update shell scripts to compute empty tree object ID

2018-04-23 Thread brian m. carlson
Several of our shell scripts hard-code the object ID of the empty tree.
To avoid any problems when changing hashes, compute this value on
startup of the script.  For performance, store the value in a variable
and reuse it throughout the life of the script.

Signed-off-by: brian m. carlson 
---
 git-filter-branch.sh   | 4 +++-
 git-rebase--interactive.sh | 4 +++-
 templates/hooks--pre-commit.sample | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 64f21547c1..ccceaf19a7 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -11,6 +11,8 @@
 # The following functions will also be available in the commit filter:
 
 functions=$(cat << \EOF
+EMPTY_TREE=$(git hash-object -t tree /dev/null)
+
 warn () {
echo "$*" >&2
 }
@@ -46,7 +48,7 @@ git_commit_non_empty_tree()
 {
if test $# = 3 && test "$1" = $(git rev-parse "$3^{tree}"); then
map "$3"
-   elif test $# = 1 && test "$1" = 
4b825dc642cb6eb9a060e54bf8d69288fbee4904; then
+   elif test $# = 1 && test "$1" = $EMPTY_TREE; then
:
else
git commit-tree "$@"
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 50323fc273..cc873d630d 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -81,6 +81,8 @@ rewritten_pending="$state_dir"/rewritten-pending
 # and leaves CR at the end instead.
 cr=$(printf "\015")
 
+empty_tree=$(git hash-object -t tree /dev/null)
+
 strategy_args=${strategy:+--strategy=$strategy}
 test -n "$strategy_opts" &&
 eval '
@@ -238,7 +240,7 @@ is_empty_commit() {
die "$(eval_gettext "\$sha1: not a commit that can be picked")"
}
ptree=$(git rev-parse -q --verify "$1"^^{tree} 2>/dev/null) ||
-   ptree=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   ptree=$empty_tree
test "$tree" = "$ptree"
 }
 
diff --git a/templates/hooks--pre-commit.sample 
b/templates/hooks--pre-commit.sample
index 68d62d5446..6a75641638 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -12,7 +12,7 @@ then
against=HEAD
 else
# Initial commit: diff against an empty tree object
-   against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   against=$(git hash-object -t tree /dev/null)
 fi
 
 # If you want to allow non-ASCII filenames set this variable to true.


[PATCH 16/41] Update struct index_state to use struct object_id

2018-04-23 Thread brian m. carlson
Adjust struct index_state to use struct object_id instead of unsigned
char [20].

Signed-off-by: brian m. carlson 
---
 cache.h  |  2 +-
 read-cache.c | 16 
 t/helper/test-dump-split-index.c |  2 +-
 unpack-trees.c   |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index e03a0d4d23..9ad1dd2ddc 100644
--- a/cache.h
+++ b/cache.h
@@ -324,7 +324,7 @@ struct index_state {
 drop_cache_tree : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
-   unsigned char sha1[20];
+   struct object_id oid;
struct untracked_cache *untracked;
uint64_t fsmonitor_last_update;
struct ewah_bitmap *fsmonitor_dirty;
diff --git a/read-cache.c b/read-cache.c
index f47666b975..9dbaeeec43 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1806,7 +1806,7 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
-   hashcpy(istate->sha1, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
+   hashcpy(istate->oid.hash, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1902,10 +1902,10 @@ int read_index_from(struct index_state *istate, const 
char *path,
base_oid_hex = oid_to_hex(_index->base_oid);
base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
ret = do_read_index(split_index->base, base_path, 1);
-   if (hashcmp(split_index->base_oid.hash, split_index->base->sha1))
+   if (oidcmp(_index->base_oid, _index->base->oid))
die("broken index, expect %s in %s, got %s",
base_oid_hex, base_path,
-   sha1_to_hex(split_index->base->sha1));
+   oid_to_hex(_index->base->oid));
 
freshen_shared_index(base_path, 0);
merge_base_index(istate);
@@ -2194,7 +2194,7 @@ static int verify_index_from(const struct index_state 
*istate, const char *path)
if (n != the_hash_algo->rawsz)
goto out;
 
-   if (hashcmp(istate->sha1, hash))
+   if (hashcmp(istate->oid.hash, hash))
goto out;
 
close(fd);
@@ -2373,7 +2373,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
return -1;
}
 
-   if (ce_flush(, newfd, istate->sha1))
+   if (ce_flush(, newfd, istate->oid.hash))
return -1;
if (close_tempfile_gently(tempfile)) {
error(_("could not close '%s'"), tempfile->filename.buf);
@@ -2497,10 +2497,10 @@ static int write_shared_index(struct index_state 
*istate,
return ret;
}
ret = rename_tempfile(temp,
- git_path("sharedindex.%s", 
sha1_to_hex(si->base->sha1)));
+ git_path("sharedindex.%s", 
oid_to_hex(>base->oid)));
if (!ret) {
-   hashcpy(si->base_oid.hash, si->base->sha1);
-   clean_shared_index_files(sha1_to_hex(si->base->sha1));
+   oidcpy(>base_oid, >base->oid);
+   clean_shared_index_files(oid_to_hex(>base->oid));
}
 
return ret;
diff --git a/t/helper/test-dump-split-index.c b/t/helper/test-dump-split-index.c
index 754e9bb624..63c689d6ee 100644
--- a/t/helper/test-dump-split-index.c
+++ b/t/helper/test-dump-split-index.c
@@ -14,7 +14,7 @@ int cmd__dump_split_index(int ac, const char **av)
int i;
 
do_read_index(_index, av[1], 1);
-   printf("own %s\n", sha1_to_hex(the_index.sha1));
+   printf("own %s\n", oid_to_hex(_index.oid));
si = the_index.split_index;
if (!si) {
printf("not a split index\n");
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..038ef7b926 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1287,7 +1287,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
o->result.split_index = o->src_index->split_index;
if (o->result.split_index)
o->result.split_index->refcount++;
-   hashcpy(o->result.sha1, o->src_index->sha1);
+   oidcpy(>result.oid, >src_index->oid);
o->merge_size = len;
mark_all_ce_unused(o->src_index);
 


Re: [RFC PATCH 12/12] commit-graph: update design document

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> The commit-graph feature is now integrated with 'fsck' and 'gc',
> so remove those items from the "Future Work" section of the
> commit-graph design document.

See comments below, but this looks good to me.

What is missing from the "Future Work" section (and which was somewhat
implied by now removed "When this feature stabilizes enough to recommend
to most users") is safety against history [view] changing features:
git-replace, shallow clone and grafts file.  I wrote about this in
"Re: [PATCH v8 04/14] graph: add commit graph design document"
https://public-inbox.org/git/86vacsjdcg@gmail.com/

JN> The problem in my opinion lies in different direction, namely that
JN> commit grafts can change, changing the view of the history.  If we want
JN> commit-graph file to follow user-visible view of the history of the
JN> project, it needs to respect current version of commit grafts - but what
JN> if commit grafts changed since commit-graph file was generated?
JN> 
JN> Actually, there are currently three ways to affect the view of the
JN> history:
JN> 
JN> a. legacy commit grafts mechanism; it was first, but it is not safe,
JN>cannot be transferred on fetch / push, and is now deprecated.
JN> 
JN> b. shallow clones, which are kind of specialized and limited grafts;
JN>they used to limit available functionality, but restrictions are
JN>being lifted (or perhaps even got lifted)
JN> 
JN> c. git-replace mechanism, where we can create an "overlay" of any
JN>object, and is intended to be among others replacement for commit
JN>grafts; safe, transferable, can be turned off with "git
JN>--no-replace-objects "
JN> 
JN> All those can change; some more likely than others.  The problem is if
JN> they change between writing commit-graph file (respecting old view of
JN> the history) and reading it (where we expect to see the new view).
JN> 
JN> a. grafts file can change: lines can be added, removed or changed
JN> 
JN> b. shallow clones can be deepened or shortened, or even make
JN>not shallow
JN> 
JN> c. new replacements can be added, old removed, and existing edited
JN> 
JN> 
JN> There are, as far as I can see, two ways of handling the issue of Git
JN> features that can change the view of the project's history, namely:
JN> 
JN>  * Disable commit-graph reading when any of this features are used, and
JN>always write full graph info.
JN> 
JN>This may not matter much for shallow clones, where commit count
JN>should be small anyway, but when using git-replace to stitch together
JN>current repository with historical one, commit-graph would be
JN>certainly useful.  Also, git-replace does not need to change history.
JN> 
JN>On the other hand I think it is the easier solution.
JN> 
JN> Or
JN> 
JN>  * Detect somehow that the view of the history changed, and invalidate
JN>commit-graph (perhaps even automatically regenerate it).
JN> 
JN>For shallow clone changes I think one can still use the old
JN>commit-graph file to generate the new one.  For other cases, the
JN>metadata is simple to modify, but indices such as generation number
JN>would need to be at least partially calculated anew.

Note that in all cases one can simply discard generation number
information (treating it as if it was ZERO), and use commit-graph as
values before applying history-changing feature: replacements, grafts or
shallow.

Well, at least for shallow you can do that for generation numbers: using
grafts (deprecated) or replacements to join repository with historical
one would mean that we are no longer have commit-graph transitively
closed under reachability.

>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph.txt | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt 
> b/Documentation/technical/commit-graph.txt
> index d9f2713efa..d04657b781 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -118,9 +118,6 @@ Future Work
>  - The commit graph feature currently does not honor commit grafts. This can
>be remedied by duplicating or refactoring the current graft logic.
>  
> -- The 'commit-graph' subcommand does not have a "verify" mode that is
> -  necessary for integration with fsck.
> -

All right (though "verify" mode is actually done via "check" command).

>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
> @@ -142,12 +139,6 @@ Future Work
>such as "ensure_tree_loaded(commit)" that fully loads a tree before
>using commit->tree.
>  
> -- The current design uses the 'commit-graph' subcommand to generate the 
> graph.
> -  When this feature stabilizes enough to recommend to most users, we should

[PATCH] doc/clone: update caption for GIT URLS cross-reference

2018-04-19 Thread Todd Zullinger
The description of the  argument directs readers to "See the
URLS section below".  When generating HTML this becomes a link to the
"GIT URLS" section.  When reading the man page in a terminal, the
caption is slightly misleading.  Use "GIT URLS" as the caption to avoid
any confusion.

Signed-off-by: Todd Zullinger <t...@pobox.com>
---
Martin Ågren wrote:
> On 18 April 2018 at 22:56, Todd Zullinger <t...@pobox.com> wrote:
>> Subject: [PATCH] doc/clone: update caption for GIT URLS cross-reference
>>
>> The description of the  argument directs readers to "See the
>> URLS section below".  When generating HTML this becomes a link to the
>> "GIT URLS" section.  When reading the man page in a terminal, the
>> caption is slightly misleading.  Use "GIT URLS" as the caption to avoid
>> an confusion.
> 
> s/an/any/?

Indeed, thanks.

>> The man page produced by asciidoc doesn't include hyperlinks.  The
>> description of the  argument simply
> 
> Abandoned first attempt at log message? ;-)

That or it was when a squirrel ran by my window. ;)

Thanks for catching both of these mistakes.

 Documentation/git-clone.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 42ca7b5095..b844b9957c 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -260,7 +260,7 @@ or `--mirror` is given)
 
 ::
The (possibly remote) repository to clone from.  See the
-   <<URLS,URLS>> section below for more information on specifying
+   <<URLS,GIT URLS>> section below for more information on specifying
repositories.
 
 ::
-- 
2.17.0

-- 
Todd
~~
Whenever you find yourself on the side of the majority, it is time to
pause and reflect.
-- Mark Twain



Re: [PATCH v3 4/9] commit-graph.txt: update design document

2018-04-18 Thread Jakub Narebski
Derrick Stolee  writes:

> We now calculate generation numbers in the commit-graph file and use
> them in paint_down_to_common().

All right.

>
> Expand the section on generation numbers to discuss how the three
> special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
> _MAX interact with other generation numbers.

Very good.

>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph.txt | 30 +++-
>  1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt 
> b/Documentation/technical/commit-graph.txt
> index 0550c6d0dc..d9f2713efa 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
> "infinite"
>  generation number and walk until reaching commits with known generation
>  number.
>
> +We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
> +in the commit-graph file. If a commit-graph file was written by a version
> +of Git that did not compute generation numbers, then those commits will
> +have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.

I have to wonder if there would be any relesed Git that do not compute
generation numbers...

On the other hand in case the user-visible view of the project history
changes, be it because shallow clone is shortened or deepened, or grafts
file is edited, or a commit object is replaced with another with
different parents - we can still use "commit-graph" data, just pretend
that generation numbers (which are invalid in altered history) are all
zero.  (I'll write about this idea in comments to later series.)

On the other hand with GENERATION_NUMBER_ZERO these series of patches
are self-contained and bisectable.

> +
> +Since the commit-graph file is closed under reachability, we can guarantee
> +the following weaker condition on all commits:

I have had to look up the contents of the whole file, but it turns out
that it is all right: "weaker condition" refers to earlier "N <= M".

Minor sidenote: if one would be extremly pedantic, one could say that
previous condition is incorrect, because it doesn't state explicitely
that commit A != commit B. ;-)

> +
> +If A and B are commits with generation numbers N amd M, respectively,
> +and N < M, then A cannot reach B.
> +
> +Note how the strict inequality differs from the inequality when we have
> +fully-computed generation numbers. Using strict inequality may result in
> +walking a few extra commits, but the simplicity in dealing with commits
> +with generation number *_INFINITY or *_ZERO is valuable.
> +
> +We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
> +generation numbers are computed to be at least this value. We limit at
> +this value since it is the largest value that can be stored in the
> +commit-graph file using the 30 bits available to generation numbers. This
> +presents another case where a commit can have generation number equal to
> +that of a parent.

I wonder if something like the table I have proposed in v2 version of
this patch [1] would make it easier or harder to understand.

[1]: https://public-inbox.org/git/86a7u7mnzi@gmail.com/

Something like the following:

 |  gen(B)
 |
gen(A)   | _INFINITY | _MAX | larger   | smaller  | _ZERO
-+---+--+--+--+
_INFINITY| = | >| >| >| >
_MAX | < N   | =| >| >| >
larger   | < N   | < N  | = n  | >| >
smaller  | < N   | < N  | < N  | = n  | >
_ZERO| < N   | < N  | < N  | < N  | =

Here "n" and "N" denotes stronger condition, and "N" denotes weaker
condition.  We have _INFINITY > _MAX > larger > smaller > _ZERO.

> +
>  Design Details
>  --
>
> @@ -98,17 +121,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>necessary for integration with fsck.
>
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -
>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
>priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:
>
> -- paint_down_to_common()
>  - 'log --topo-order'
>
>  - Currently, parse_commit_gently() requires filling in the root tree

Looks good.


[RFC PATCH 12/12] commit-graph: update design document

2018-04-17 Thread Derrick Stolee
The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 9 -
 1 file changed, 9 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index d9f2713efa..d04657b781 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -118,9 +118,6 @@ Future Work
 - The commit graph feature currently does not honor commit grafts. This can
   be remedied by duplicating or refactoring the current graft logic.
 
-- The 'commit-graph' subcommand does not have a "verify" mode that is
-  necessary for integration with fsck.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
@@ -142,12 +139,6 @@ Future Work
   such as "ensure_tree_loaded(commit)" that fully loads a tree before
   using commit->tree.
 
-- The current design uses the 'commit-graph' subcommand to generate the graph.
-  When this feature stabilizes enough to recommend to most users, we should
-  add automatic graph writes to common operations that create many commits.
-  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
-  commands.
-
 - A server could provide a commit graph file as part of the network protocol
   to avoid extra calculations by clients. This feature is only of benefit if
   the user is willing to trust the file, because verifying the file is correct
-- 
2.17.0.39.g685157f7fb



[PATCH v3 4/9] commit-graph.txt: update design document

2018-04-17 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Expand the section on generation numbers to discuss how the three
special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
_MAX interact with other generation numbers.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 30 +++-
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..d9f2713efa 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having 
"infinite"
 generation number and walk until reaching commits with known generation
 number.
 
+We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+If A and B are commits with generation numbers N amd M, respectively,
+and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits, but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.
+
+We use the macro GENERATION_NUMBER_MAX = 0x3FFF to for commits whose
+generation numbers are computed to be at least this value. We limit at
+this value since it is the largest value that can be stored in the
+commit-graph file using the 30 bits available to generation numbers. This
+presents another case where a commit can have generation number equal to
+that of a parent.
+
 Design Details
 --
 
@@ -98,17 +121,12 @@ Future Work
 - The 'commit-graph' subcommand does not have a "verify" mode that is
   necessary for integration with fsck.
 
-- The file format includes room for precomputed generation numbers. These
-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
   priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:
 
-- paint_down_to_common()
 - 'log --topo-order'
 
 - Currently, parse_commit_gently() requires filling in the root tree
-- 
2.17.0.39.g685157f7fb



Re: [PATCH v2 0/7] nd/repack-keep-pack update

2018-04-15 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> On Sat, Apr 14, 2018 at 9:47 PM, Ævar Arnfjörð Bjarmason  
> wrote:
>> The only (trivial) issue I found in the patches themselves was that
>> between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
>> just to erase it in 5/7, better not to add it to begin with, but
>> hopefully Junio can fix that up (if he cares).
>
> Fixed (and is the only change in v2) in case Junio has not picked up
> the last round and fixed it himself yet.

Thanks.


[PATCH v2 0/7] nd/repack-keep-pack update

2018-04-15 Thread Nguyễn Thái Ngọc Duy
On Sat, Apr 14, 2018 at 9:47 PM, Ævar Arnfjörð Bjarmason  
wrote:
> The only (trivial) issue I found in the patches themselves was that
> between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
> just to erase it in 5/7, better not to add it to begin with, but
> hopefully Junio can fix that up (if he cares).

Fixed (and is the only change in v2) in case Junio has not picked up
the last round and fixed it himself yet.

Nguyễn Thái Ngọc Duy (7):
  t7700: have closing quote of a test at the beginning of line
  repack: add --keep-pack option
  gc: add --keep-largest-pack option
  gc: add gc.bigPackThreshold config
  gc: handle a corner case in gc.bigPackThreshold
  gc --auto: exclude base pack if not enough mem to "repack -ad"
  pack-objects: show some progress when counting kept objects

 Documentation/config.txt   |  12 +++
 Documentation/git-gc.txt   |  19 +++-
 Documentation/git-pack-objects.txt |   9 +-
 Documentation/git-repack.txt   |   9 +-
 builtin/gc.c   | 165 +++--
 builtin/pack-objects.c |  83 +++
 builtin/repack.c   |  21 +++-
 config.mak.uname   |   1 +
 git-compat-util.h  |   4 +
 object-store.h |   1 +
 pack-objects.h |   2 +
 t/t6500-gc.sh  |  32 ++
 t/t7700-repack.sh  |  27 -
 13 files changed, 349 insertions(+), 36 deletions(-)

-- 
2.17.0.367.g5dd2e386c3



Re: [PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Junio C Hamano
On Sun, Apr 15, 2018 at 4:47 AM, Ævar Arnfjörð Bjarmason
 wrote:
>
> The only (trivial) issue I found in the patches themselves was that
> between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
> just to erase it in 5/7, better not to add it to begin with, but
> hopefully Junio can fix that up (if he cares).

I do care but I'd wish I do not have to waste time and concentration to spend
on doing so, even though I would be fully capable of skipping this round and
remembering to queue a rerolled one.

I've seen mentions like the above one a few times on the list recently, so let
me make it clear. Some things are easier to tweak locally than others, and
I'd rather not to waste my time on cleaning other people's mess. A simple
typofix that does not cascade through to later steps in a series is one thing.
A tweak that changes number of lines in a hunk that forces a later step to
compensate is more involved.

Don't expect your traffic cop to wash your care while you're stopping at a
red signal.


Re: [PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Ævar Arnfjörð Bjarmason

On Sat, Apr 14 2018, Nguyễn Thái Ngọc Duy wrote:

> This is basically a resend from the last round but rebased on
> 'master'. The old base results in some conflicts with packfile and oid
> conversion series. This one should reduce merge conflicts on 'pu' to
> trivial ones.

Thanks. I've been running this at work and as noted in
https://public-inbox.org/git/87vadpxv27@evledraar.gmail.com/ it's
had big performance impact to the better, users even started noticing it
(they'd previously get noticeable slowdowns while doing other task on
GC).

I also tried to see just how much worse this was making performance, my
hunch was that the difference should be trivial but noticeable since
we'll produce a less efficient pack.

What I found was the opposite, under real-world conditions it seems to
be making things 1-2% better on common git operations, which I suspect
is because once we've done a few pulls and coalesced those into their
own pack(s) there's more cache locality for the data we're actually
looking at.

I.e. once you've got a repo has a big pack you're not touching, and a
few weeks of updates from upstream that you've coalesced into another
pack there's a higher density of stuff you care about near HEAD per FS
page in the recent smaller pack, which if you're pressed for memory and
parts of your pack are getting paged out of the FS cache is a win. I
haven't confirmed that, it's just a hypothesis.

The only (trivial) issue I found in the patches themselves was that
between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
just to erase it in 5/7, better not to add it to begin with, but
hopefully Junio can fix that up (if he cares).


[PATCH 00/15] nd/pack-objects-pack-struct update

2018-04-14 Thread Nguyễn Thái Ngọc Duy
This is basically a resend from the last round but rebased on latest
master to take advantage of new object-store and oid changes.
One minor change is t/README now mentions about git-config when a
variable accepts a boolean value.

Nguyễn Thái Ngọc Duy (15):
  read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean
  pack-objects: a bit of document about struct object_entry
  pack-objects: turn type and in_pack_type to bitfields
  pack-objects: use bitfield for object_entry::dfs_state
  pack-objects: use bitfield for object_entry::depth
  pack-objects: move in_pack_pos out of struct object_entry
  pack-objects: move in_pack out of struct object_entry
  pack-objects: refer to delta objects by index instead of pointer
  pack-objects: shrink z_delta_size field in struct object_entry
  pack-objects: don't check size when the object is bad
  pack-objects: clarify the use of object_entry::size
  pack-objects: shrink size field in struct object_entry
  pack-objects: shrink delta_size field in struct object_entry
  pack-objects: reorder members to shrink struct object_entry
  ci: exercise the whole test suite with uncommon code in pack-objects

 Documentation/config.txt   |   4 +-
 Documentation/git-pack-objects.txt |   4 +-
 Documentation/git-repack.txt   |   4 +-
 builtin/pack-objects.c | 364 +++--
 cache.h|   2 +
 ci/run-build-and-tests.sh  |   5 +-
 object-store.h |   1 +
 object.h   |   1 -
 pack-bitmap-write.c|  14 +-
 pack-bitmap.c  |   2 +-
 pack-bitmap.h  |   4 +-
 pack-objects.c |  68 ++
 pack-objects.h | 314 +++--
 read-cache.c   |   4 +-
 t/README   |  22 ++
 t/t5300-pack-object.sh |   5 +
 16 files changed, 654 insertions(+), 164 deletions(-)

-- 
2.17.0.367.g5dd2e386c3



[PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Nguyễn Thái Ngọc Duy
This is basically a resend from the last round but rebased on
'master'. The old base results in some conflicts with packfile and oid
conversion series. This one should reduce merge conflicts on 'pu' to
trivial ones.

Nguyễn Thái Ngọc Duy (7):
  t7700: have closing quote of a test at the beginning of line
  repack: add --keep-pack option
  gc: add --keep-largest-pack option
  gc: add gc.bigPackThreshold config
  gc: handle a corner case in gc.bigPackThreshold
  gc --auto: exclude base pack if not enough mem to "repack -ad"
  pack-objects: show some progress when counting kept objects

 Documentation/config.txt   |  12 +++
 Documentation/git-gc.txt   |  19 +++-
 Documentation/git-pack-objects.txt |   9 +-
 Documentation/git-repack.txt   |   9 +-
 builtin/gc.c   | 165 +++--
 builtin/pack-objects.c |  83 +++
 builtin/repack.c   |  21 +++-
 config.mak.uname   |   1 +
 git-compat-util.h  |   4 +
 object-store.h |   1 +
 pack-objects.h |   2 +
 t/t6500-gc.sh  |  32 ++
 t/t7700-repack.sh  |  27 -
 13 files changed, 349 insertions(+), 36 deletions(-)

-- 
2.17.0.367.g5dd2e386c3



Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-13 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/12/2018 5:12 AM, Junio C Hamano wrote:
>> Derrick Stolee  writes:
>>
>>> +Here is a diagram to visualize the shape of the full commit graph, and
>>> +how different generation numbers relate:
>>> +
>>> ++-+
>>> +| GENERATION_NUMBER_INFINITY = 0x |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   | [gen(A) = gen(B)]
>>> +   V
>>> ++-+
>>> +| 0 < commit->generation < 0x4000 |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   |[gen(A) > gen(B)]
>>> +   V
>>> ++-+
>>> +| GENERATION_NUMBER_ZERO = 0  |
>>> ++-+
>>> +|  ^
>>> +|  |
>>> ++--+
>>> +[gen(A) = gen(B)]
>>
>> It may be just me but all I can read out of the above is that

It's not just you.

>> commit->generation may store 0x, a value between 0 and
>> 0x4000, or 0.  I cannot quite tell what the notation [gen(A)
>>  gen(B)] is trying to say.  I am guessing "Two generation
>> numbers within the 'valid' range can be compared" is what the second
>> one is trying to say, but it is much less interesting to know that
>> two infinities compare equal than how generation numbers from
>> different classes compare, which cannot be depicted in the above
>> notation, I am afraid.  For example, don't we want to say that a
>> commit with INF can never be reached by a commit with a valid
>> generation number, or something like that?
>
> My intention with the arrows was to demonstrate where parent
> relationships can go, and the generation-number relation between a
> commit A with parent B. Clearly, this diagram is less than helpful.

Perhaps the following table would make the information clearer (perhaps
in addition to the above graph, but without "gen(A) {cmp} gen(B)"
arrows).

I assume that it is possible to have both GENERATION_NUMBER_ZERO and non
zero generation numbers in one repo, perhaps via alternates.  I also
assume that A != B, and that generation numbers (both set, and 0s) are
transitivelu closed under reachability.

gen(A) \   commit B ->   | gen(B)
\-\  |
commit A   \ | 0x | larger   | smaller | 0x
\++--+-+
0x   | =>  > >
0 < larger  < 0x4000 | < N  = n> >
0 < smaller < 0x4000 | < N  < N= n   >
0x   | < N  < N< N   =

The "<", "=", ">" denotes result of comparison between gen(A) and gen(B).

Generation numbers create a negative-cut filter: "N" and "n" denote
situation where we know from gen(A) and gen(B) that B is not reachable
from A.

As can be seen if we use gen(A) < gen(B) as cutoff, we don't need to
treat "infinity" and "zero" in a special way.


Best,
-- 
Jakub Narębski


Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Derrick Stolee

On 4/12/2018 5:12 AM, Junio C Hamano wrote:

Derrick Stolee  writes:


+Here is a diagram to visualize the shape of the full commit graph, and
+how different generation numbers relate:
+
++-+
+| GENERATION_NUMBER_INFINITY = 0x |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   | [gen(A) = gen(B)]
+   V
++-+
+| 0 < commit->generation < 0x4000 |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   |[gen(A) > gen(B)]
+   V
++-+
+| GENERATION_NUMBER_ZERO = 0  |
++-+
+|  ^
+|  |
++--+
+[gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?


My intention with the arrows was to demonstrate where parent 
relationships can go, and the generation-number relation between a 
commit A with parent B. Clearly, this diagram is less than helpful.





  Design Details
  --
  
@@ -98,17 +141,12 @@ Future Work

  - The 'commit-graph' subcommand does not have a "verify" mode that is
necessary for integration with fsck.
  
-- The file format includes room for precomputed generation numbers. These

-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
  - After computing and storing generation numbers, we must make graph
walks aware of generation numbers to gain the performance benefits they
enable. This will mostly be accomplished by swapping a commit-date-ordered
priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:

Good.




Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Junio C Hamano
Derrick Stolee  writes:

> +Here is a diagram to visualize the shape of the full commit graph, and
> +how different generation numbers relate:
> +
> ++-+
> +| GENERATION_NUMBER_INFINITY = 0x |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + | [gen(A) = gen(B)]
> + V
> ++-+
> +| 0 < commit->generation < 0x4000 |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + |[gen(A) > gen(B)]
> + V
> ++-+
> +| GENERATION_NUMBER_ZERO = 0  |
> ++-+
> +  |  ^
> +  |  |
> +  +--+
> +  [gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?

>  Design Details
>  --
>  
> @@ -98,17 +141,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>necessary for integration with fsck.
>  
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -
>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
>priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:

Good.


[PATCH v2 07/10] commit-graph.txt: update future work

2018-04-09 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Expand the section on generation numbers to discuss how the two
"special" generation numbers GENERATION_NUMBER_INFINITY and *_ZERO
interact with other generation numbers.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 50 +---
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..a8df0ae9db 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -77,6 +77,49 @@ in the commit graph. We can treat these commits as having 
"infinite"
 generation number and walk until reaching commits with known generation
 number.
 
+We use the macro GENERATION_NUMBER_INFINITY = 0x to mark commits not
+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+If A and B are commits with generation numbers N amd M, respectively,
+and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits, but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.
+
+Here is a diagram to visualize the shape of the full commit graph, and
+how different generation numbers relate:
+
++-+
+| GENERATION_NUMBER_INFINITY = 0x |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   | [gen(A) = gen(B)]
+   V
++-+
+| 0 < commit->generation < 0x4000 |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   |[gen(A) > gen(B)]
+   V
++-+
+| GENERATION_NUMBER_ZERO = 0  |
++-+
+|  ^
+|  |
++--+
+[gen(A) = gen(B)]
+
 Design Details
 --
 
@@ -98,17 +141,12 @@ Future Work
 - The 'commit-graph' subcommand does not have a "verify" mode that is
   necessary for integration with fsck.
 
-- The file format includes room for precomputed generation numbers. These
-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
   priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:
 
-- paint_down_to_common()
 - 'log --topo-order'
 
 - Currently, parse_commit_gently() requires filling in the root tree
-- 
2.17.0



Re: [PATCH 6/6] commit-graph.txt: update future work

2018-04-03 Thread Jonathan Tan
On Tue,  3 Apr 2018 12:51:43 -0400
Derrick Stolee  wrote:

> We now calculate generation numbers in the commit-graph file and use
> them in paint_down_to_common().

For completeness, I'll mention that I don't see any issues with this
patch, of course.

Thanks for this series.


[PATCH 6/6] commit-graph.txt: update future work

2018-04-03 Thread Derrick Stolee
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
index 0550c6d0dc..be68bee43d 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -98,17 +98,12 @@ Future Work
 - The 'commit-graph' subcommand does not have a "verify" mode that is
   necessary for integration with fsck.
 
-- The file format includes room for precomputed generation numbers. These
-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
 - After computing and storing generation numbers, we must make graph
   walks aware of generation numbers to gain the performance benefits they
   enable. This will mostly be accomplished by swapping a commit-date-ordered
   priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:
 
-- paint_down_to_common()
 - 'log --topo-order'
 
 - Currently, parse_commit_gently() requires filling in the root tree
-- 
2.17.0.rc0



[PATCH v2 5/5] refs: use chdir_notify to update cached relative paths

2018-03-30 Thread Jeff King
Commit f57f37e2e1 (files-backend: remove the use of
git_path(), 2017-03-26) introduced a regression when a
relative $GIT_DIR is used in a working tree:

  - when we initialize the ref backend, we make a copy of
get_git_dir(), which may be relative

  - later, we may call setup_work_tree() and chdir to the
root of the working tree

  - further calls to the ref code will use the stored git
directory, but relative paths will now point to the
wrong place

The new test in t1501 demonstrates one such instance (the
bug causes us to write the ref update to the nonsense
"relative/relative/.git").

Since setup_work_tree() now uses chdir_notify, we can just
ask it update our relative paths when necessary.

Reported-by: Rafael Ascensao <rafa.al...@gmail.com>
Helped-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
Signed-off-by: Jeff King <p...@peff.net>
---
 refs/files-backend.c  |  6 ++
 refs/packed-backend.c |  3 +++
 t/t1501-work-tree.sh  | 12 
 3 files changed, 21 insertions(+)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bec8e30e9e..a92a2aa821 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -9,6 +9,7 @@
 #include "../lockfile.h"
 #include "../object.h"
 #include "../dir.h"
+#include "../chdir-notify.h"
 
 /*
  * This backend uses the following flags in `ref_update::flags` for
@@ -106,6 +107,11 @@ static struct ref_store *files_ref_store_create(const char 
*gitdir,
refs->packed_ref_store = packed_ref_store_create(sb.buf, flags);
strbuf_release();
 
+   chdir_notify_reparent("files-backend $GIT_DIR",
+ >gitdir);
+   chdir_notify_reparent("files-backend $GIT_COMMONDIR",
+ >gitcommondir);
+
return ref_store;
 }
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 65288c6472..369c34f886 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -5,6 +5,7 @@
 #include "packed-backend.h"
 #include "../iterator.h"
 #include "../lockfile.h"
+#include "../chdir-notify.h"
 
 enum mmap_strategy {
/*
@@ -202,6 +203,8 @@ struct ref_store *packed_ref_store_create(const char *path,
refs->store_flags = store_flags;
 
refs->path = xstrdup(path);
+   chdir_notify_reparent("packed-refs", >path);
+
return ref_store;
 }
 
diff --git a/t/t1501-work-tree.sh b/t/t1501-work-tree.sh
index b06210ec5e..a5db53337b 100755
--- a/t/t1501-work-tree.sh
+++ b/t/t1501-work-tree.sh
@@ -431,4 +431,16 @@ test_expect_success 'error out gracefully on invalid 
$GIT_WORK_TREE' '
)
 '
 
+test_expect_success 'refs work with relative gitdir and work tree' '
+   git init relative &&
+   git -C relative commit --allow-empty -m one &&
+   git -C relative commit --allow-empty -m two &&
+
+   GIT_DIR=relative/.git GIT_WORK_TREE=relative git reset HEAD^ &&
+
+   git -C relative log -1 --format=%s >actual &&
+   echo one >expect &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.17.0.rc2.594.gdb94a0ce02


[PATCH 4/4] refs: use chdir_notify to update cached relative paths

2018-03-28 Thread Jeff King
Commit f57f37e2e1 (files-backend: remove the use of
git_path(), 2017-03-26) introduced a regression when a
relative $GIT_DIR is used in a working tree:

  - when we initialize the ref backend, we make a copy of
get_git_dir(), which may be relative

  - later, we may call setup_work_tree() and chdir to the
root of the working tree

  - further calls to the ref code will use the stored git
directory, but relative paths will now point to the
wrong place

The new test in t1501 demonstrates one such instance (the
bug causes us to write the ref update to the nonsense
"relative/relative/.git").

Since setup_work_tree() now uses chdir_notify, we can just
ask it update our relative paths when necessary.

Reported-by: Rafael Ascensao <rafa.al...@gmail.com>
Helped-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
Signed-off-by: Jeff King <p...@peff.net>
---
 refs/files-backend.c  |  4 
 refs/packed-backend.c |  3 +++
 t/t1501-work-tree.sh  | 12 
 3 files changed, 19 insertions(+)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bec8e30e9e..ab3e00ffa0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -9,6 +9,7 @@
 #include "../lockfile.h"
 #include "../object.h"
 #include "../dir.h"
+#include "../chdir-notify.h"
 
 /*
  * This backend uses the following flags in `ref_update::flags` for
@@ -106,6 +107,9 @@ static struct ref_store *files_ref_store_create(const char 
*gitdir,
refs->packed_ref_store = packed_ref_store_create(sb.buf, flags);
strbuf_release();
 
+   chdir_notify_reparent(>gitdir);
+   chdir_notify_reparent(>gitcommondir);
+
return ref_store;
 }
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 65288c6472..6725742f00 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -5,6 +5,7 @@
 #include "packed-backend.h"
 #include "../iterator.h"
 #include "../lockfile.h"
+#include "../chdir-notify.h"
 
 enum mmap_strategy {
/*
@@ -202,6 +203,8 @@ struct ref_store *packed_ref_store_create(const char *path,
refs->store_flags = store_flags;
 
refs->path = xstrdup(path);
+   chdir_notify_reparent(>path);
+
return ref_store;
 }
 
diff --git a/t/t1501-work-tree.sh b/t/t1501-work-tree.sh
index b06210ec5e..a5db53337b 100755
--- a/t/t1501-work-tree.sh
+++ b/t/t1501-work-tree.sh
@@ -431,4 +431,16 @@ test_expect_success 'error out gracefully on invalid 
$GIT_WORK_TREE' '
)
 '
 
+test_expect_success 'refs work with relative gitdir and work tree' '
+   git init relative &&
+   git -C relative commit --allow-empty -m one &&
+   git -C relative commit --allow-empty -m two &&
+
+   GIT_DIR=relative/.git GIT_WORK_TREE=relative git reset HEAD^ &&
+
+   git -C relative log -1 --format=%s >actual &&
+   echo one >expect &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.17.0.rc1.515.g3cc84c0ca4


Re: git submodule update - reset instead of checkout?

2018-03-26 Thread Stefan Beller
On Fri, Mar 9, 2018 at 7:07 AM Andreas Krey <a.k...@gmx.de> wrote:

> Hi everyone,

> I've been reading up on the current state of git submodules
> (our customer's customers want to use them, causing a slight
> groan from our customer).

> The usability thing I wonder about - with git submodule update,
> is it necessary to detach the head to the current (or upstream)
> head, instead of resetting the current branch to that state?

Try "git checkout --recurse-submodules" or
"git reset --recurse-submodules"  (there is also the
submodule.recurse option in case you don't want to type
the option all the time)


> Primary interest in the question: Seeing 'detached head' scares
> almost everybody. To brainstorm:

I agree on that. That is what we are trying to work out
eventually, too.

One idea is to "reattach the submodule branch if it fits"
another idea would be a submodule ref store that is
(partially) tied to the superproject, such that the HEAD
of the submodule is non-existent for most of the time.
https://public-inbox.org/git/cover.1512168087.git.jonathanta...@google.com/

> - as we can already use 'submodule update --remote' to update
>to the remote head of a branch, it would be logical to have
>that branch checked out locally (and unfortunately, potentially
>have that branch's name conflict with the remote branch setup).

> - when developers more or less accidentally commit on the detached
>head, all is not lost yet (I remember this being differently),
>but if the next 'submodule update' just resets again, the commit
>just made is still dropped, just as in the detached head state.

> - So, we'd need to have 'submodule update' act closer to the pull or
>rebase counterparts and refuse to just lose commits (or uncommitted
>changes).

> Having a checked-out branch in the submodule would have the advantage
> that I can 'just' push local commits. At the moment, doing that requires
> a relatively intricate dance, not at all similar to working in the
> base (parent) module.

> I'm working on hooks that automatically update the submodules after
> a commit change (merge, pull, checkout) in the parent module, but
> with the additional feature of (a) keeping a branch checked out
> and (b) avoid discarding local changes. Probably means I shouldn't
> invoke 'submodule update' at all, and deal with everyting myself.

> Any thoughs/comments/helpful hints?

Our plan is to deprecate "git submodule" just like "git remote" is not
a well known tool any more. (e.g. Instead of git remote update, use
git fetch, that learned about updating the remote tracking branches
on its own)


> (Addional complexity: egit/jgit is in use as well, and the work model
> we will arrive at probabaly needs to work with the current egit.)

That sounds familiar, we also have JGit/Gerrit in the setup.

Stefan


[RFC PATCH v5 2/8] rebase: update invocation of rebase dot-sourced scripts

2018-03-23 Thread Wink Saville
Due to historical reasons, the backend scriptlets for "git rebase"
are structured a bit unusually. As originally designed,
dot-sourcing them from "git rebase" was sufficient to invoke the
specific backend.

However, it was later discovered that some shell implementations
(e.g. FreeBSD 9.x) misbehaved by continuing to execute statements
following a top-level "return" rather than returning control to
the next statement in "git rebase" after dot-sourcing the
scriptlet. To work around this shortcoming, the whole body of
git-rebase--$backend.sh was made into a shell function
git_rebase__$backend, and then the very last line of the scriptlet
called that function.

A more normal architecture is for a dot-sourced scriptlet merely
to define functions (thus acting as a function library), and for
those functions to be called by the script doing the dot-sourcing.
Migrate to this arrangement by moving the git_rebase__$backend
call from the end of a scriptlet into "git rebase" itself.

While at it, remove the large comment block from each scriptlet
explaining this historic anomaly since it serves no purpose under
the new normalized architecture in which a scriptlet is merely a
function library.

Signed-off-by: Wink Saville 
Helped-by: Junio C Hamano 
Helped-by: Eric Sunshine 
---
 git-rebase--am.sh  | 11 ---
 git-rebase--interactive.sh | 11 ---
 git-rebase--merge.sh   | 11 ---
 git-rebase.sh  |  1 +
 4 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index be3f06892..e5fd6101d 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -4,15 +4,6 @@
 # Copyright (c) 2010 Junio C Hamano.
 #
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__am () {
 
 case "$action" in
@@ -105,5 +96,3 @@ fi
 move_to_original_branch
 
 }
-# ... and then we call the whole thing.
-git_rebase__am
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 561e2660e..213d75f43 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,15 +740,6 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__interactive () {
 
 case "$action" in
@@ -1029,5 +1020,3 @@ fi
 do_rest
 
 }
-# ... and then we call the whole thing.
-git_rebase__interactive
diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh
index ceb715453..685f48ca4 100644
--- a/git-rebase--merge.sh
+++ b/git-rebase--merge.sh
@@ -104,15 +104,6 @@ finish_rb_merge () {
say All done.
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__merge () {
 
 case "$action" in
@@ -171,5 +162,3 @@ done
 finish_rb_merge
 
 }
-# ... and then we call the whole thing.
-git_rebase__merge
diff --git a/git-rebase.sh b/git-rebase.sh
index a1f6e5de6..6edf8c5b1 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -197,6 +197,7 @@ run_specific_rebase () {
autosquash=
fi
. git-rebase--$type
+   git_rebase__$type
ret=$?
if test $ret -eq 0
then
-- 
2.16.2



Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Eric Sunshine
On Fri, Mar 23, 2018 at 5:01 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>> (despite the run-on sentence in the first paragraph and the apparent
>> incorrect explanation of top-level "return" misbehavior -- the in-code
>> comment says top-level "return" was essentially a no-op in broken
>> shells, whereas he said it exited the shell).
>
> My bad, almost entirely.  Sorry.

No apology necessary. That minor error aside, your proposed commit
message gave just the right amount of detail for a person (me) not at
all familiar with the topic to be able to understand it fully and
intuit the patch content before even reading the patch proper. That's
a good commit message.


Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Wink Saville
On Fri, Mar 23, 2018 at 1:51 PM, Junio C Hamano  wrote:
> Wink Saville  writes:
>
>> Here is one possibility:
>>
>> git format-patch --cover-letter --rfc --thread -v 5
>> --to=git@vger.kernel.org --cc=sunsh...@sunshineco.com
>> --cc=johannes.schinde...@gmx.de -o patches/v5 master..v5-2
>
> Sounds sensible.
>
>> If this was the first version then the above would seem to be a
>> reasonable choice.
>
> My personal preference (both as a reviewer and an occasional
> multi-patch series submitter) is to use a cover letter for a larger
> series (e.g. more than 3-5 patches), regardless of the iteration.
> In fact, a submitter tends to have _more_ things to say in the cover
> letter for v2 and subsequent iteration than the original iteration.
>
> The motivation behind the series may not change so greatly but will
> be refined as iterations go on, and you want help those who missed
> the earlier iteration understand what you are doing with the updated
> cover letter.  Also cover letter is the ideal place to outline where
> to find older iterations and their discussion and summarize what
> changed since these earlier attempts in this round.
>
>> But this is version 5 and maybe I don't need --cover-letter which, I
>> think means I
>> don't want to use --thread. If that's the case should I add --in-reply-to? 
>> But
>> that leads to the question. from which message should I get the Message-Id?
>
> The most typical practice I've seen around here is that v5's cover
> is made in-reply-to v4's cover.
>

Make sense


Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Junio C Hamano
Eric Sunshine  writes:

>> When it was discovered that some shell implementations
> ...
> ECANTPARSE: This paragraph is grammatically corrupt.
>
> ECANTPARSE: Grammatically corrupt.
> ...
> (despite the run-on sentence in the first paragraph and the apparent
> incorrect explanation of top-level "return" misbehavior -- the in-code
> comment says top-level "return" was essentially a no-op in broken
> shells, whereas he said it exited the shell).

My bad, almost entirely.  Sorry.


Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Junio C Hamano
Wink Saville  writes:

> Here is one possibility:
>
> git format-patch --cover-letter --rfc --thread -v 5
> --to=git@vger.kernel.org --cc=sunsh...@sunshineco.com
> --cc=johannes.schinde...@gmx.de -o patches/v5 master..v5-2

Sounds sensible.

> If this was the first version then the above would seem to be a
> reasonable choice.

My personal preference (both as a reviewer and an occasional
multi-patch series submitter) is to use a cover letter for a larger
series (e.g. more than 3-5 patches), regardless of the iteration.
In fact, a submitter tends to have _more_ things to say in the cover
letter for v2 and subsequent iteration than the original iteration.

The motivation behind the series may not change so greatly but will
be refined as iterations go on, and you want help those who missed
the earlier iteration understand what you are doing with the updated
cover letter.  Also cover letter is the ideal place to outline where
to find older iterations and their discussion and summarize what
changed since these earlier attempts in this round.

> But this is version 5 and maybe I don't need --cover-letter which, I
> think means I
> don't want to use --thread. If that's the case should I add --in-reply-to? But
> that leads to the question. from which message should I get the Message-Id?

The most typical practice I've seen around here is that v5's cover
is made in-reply-to v4's cover.



Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Wink Saville
On Fri, Mar 23, 2018 at 10:12 AM, Johannes Schindelin
 wrote:
> Hi Wink,
>
> On Thu, 22 Mar 2018, Wink Saville wrote:
>
>> The backend scriptlets for "git rebase" were structured in a
>> bit unusual way for historical reasons.  Originally, it was
>> designed in such a way that dot-sourcing them from "git
>> rebase" would be sufficient to invoke the specific backend.
>>
>> When it was discovered that some shell implementations
>> (e.g. FreeBSD 9.x) misbehaved when exiting with a "return"
>> is executed at the top level of a dot-sourced script (the
>> original was expecting that the control returns to the next
>> command in "git rebase" after dot-sourcing the scriptlet).
>>
>> To fix this issue the whole body of git-rebase--$backend.sh
>> was made into a shell function git_rebase__$backend and then
>> the last statement of the scriptlet would invoke the function.
>>
>> Here the call is moved to "git rebase" side, instead of at the
>> end of each scriptlet.  This give us a more normal arrangement
>> where the scriptlet function library and allows multiple functions
>> to be implemented in a scriptlet.
>>
>> Signed-off-by: Wink Saville 
>> Reviewed-by: Junio C Hamano 
>> Reviewed-by: Eric Sunsine 
>> ---
>>  git-rebase--am.sh  | 11 ---
>>  git-rebase--interactive.sh | 11 ---
>>  git-rebase--merge.sh   | 11 ---
>>  git-rebase.sh  |  2 ++
>
> The patch makes sense to me.
>
> Thanks,
> Johannes

Junio, Eric and Johannes, thanks for the help!!!

I've created v5 with the two patches, what is the suggested
format-patch/send-email command(s)?

Here is one possibility:

git format-patch --cover-letter --rfc --thread -v 5
--to=git@vger.kernel.org --cc=sunsh...@sunshineco.com
--cc=johannes.schinde...@gmx.de -o patches/v5 master..v5-2

If this was the first version then the above would seem to be a
reasonable choice.
But this is version 5 and maybe I don't need --cover-letter which, I
think means I
don't want to use --thread. If that's the case should I add --in-reply-to? But
that leads to the question. from which message should I get the Message-Id?

More likely I'm totally wrong and should do something completely different,
advice appreciated.

-- Wink


Re: [PATCH v2 3/5] fast-import: update pool_* functions to work on local pool

2018-03-23 Thread Junio C Hamano
Jameson Miller <jam...@microsoft.com> writes:

> Update memory pool functions to work on a passed in mem_pool instead of
> the global mem_pool type. This is in preparation for making the memory
> pool logic reusable.

This should be part of 2/5; the same comments to 2/5 I just sent out
would still apply to a patch that is a combined result between 2/5
and 3/5.

Also, if mem_pool_alloc_block() etc. are the desired final names,
using them from the beginning would make the result easier to read
through (both during review and later "log -p").

Thanks.  This round, especially 2/5 (with anticipation of 3/5
already in reader's head), was easier to grok than the previous.



Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Johannes Schindelin
Hi Wink,

On Thu, 22 Mar 2018, Wink Saville wrote:

> The backend scriptlets for "git rebase" were structured in a
> bit unusual way for historical reasons.  Originally, it was
> designed in such a way that dot-sourcing them from "git
> rebase" would be sufficient to invoke the specific backend.
> 
> When it was discovered that some shell implementations
> (e.g. FreeBSD 9.x) misbehaved when exiting with a "return"
> is executed at the top level of a dot-sourced script (the
> original was expecting that the control returns to the next
> command in "git rebase" after dot-sourcing the scriptlet).
> 
> To fix this issue the whole body of git-rebase--$backend.sh
> was made into a shell function git_rebase__$backend and then
> the last statement of the scriptlet would invoke the function.
> 
> Here the call is moved to "git rebase" side, instead of at the
> end of each scriptlet.  This give us a more normal arrangement
> where the scriptlet function library and allows multiple functions
> to be implemented in a scriptlet.
> 
> Signed-off-by: Wink Saville 
> Reviewed-by: Junio C Hamano 
> Reviewed-by: Eric Sunsine 
> ---
>  git-rebase--am.sh  | 11 ---
>  git-rebase--interactive.sh | 11 ---
>  git-rebase--merge.sh   | 11 ---
>  git-rebase.sh  |  2 ++

The patch makes sense to me.

Thanks,
Johannes


[PATCH v2 3/5] fast-import: update pool_* functions to work on local pool

2018-03-23 Thread Jameson Miller
Update memory pool functions to work on a passed in mem_pool instead of
the global mem_pool type. This is in preparation for making the memory
pool logic reusable.

Signed-off-by: Jameson Miller <jam...@microsoft.com>
---
 fast-import.c | 52 ++--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 1262d9e6be..519e1cbd7f 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -646,16 +646,16 @@ static unsigned int hc_str(const char *s, size_t len)
return r;
 }
 
-static struct mp_block *pool_alloc_block()
+static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool)
 {
struct mp_block *p;
 
-   fi_mem_pool.pool_alloc += sizeof(struct mp_block) + 
fi_mem_pool.block_alloc;
-   p = xmalloc(st_add(sizeof(struct mp_block), fi_mem_pool.block_alloc));
-   p->next_block = fi_mem_pool.mp_block;
+   mem_pool->pool_alloc += sizeof(struct mp_block) + mem_pool->block_alloc;
+   p = xmalloc(st_add(sizeof(struct mp_block), mem_pool->block_alloc));
+   p->next_block = mem_pool->mp_block;
p->next_free = (char *)p->space;
-   p->end = p->next_free + fi_mem_pool.block_alloc;
-   fi_mem_pool.mp_block = p;
+   p->end = p->next_free + mem_pool->block_alloc;
+   mem_pool->mp_block = p;
 
return p;
 }
@@ -676,21 +676,21 @@ static struct mp_block *pool_alloc_block()
  * the end of the list so that it will be the last block searched
  * for available space.
  */
-static struct mp_block *pool_alloc_block_with_size(size_t block_alloc)
+static struct mp_block *mem_pool_alloc_block_with_size(struct mem_pool 
*mem_pool, size_t block_alloc)
 {
struct mp_block *p, *block;
 
-   fi_mem_pool.pool_alloc += sizeof(struct mp_block) + block_alloc;
+   mem_pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
p = xmalloc(st_add(sizeof(struct mp_block), block_alloc));
 
-   block = fi_mem_pool.mp_block;
+   block = mem_pool->mp_block;
if (block) {
while (block->next_block)
block = block->next_block;
 
block->next_block = p;
} else {
-   fi_mem_pool.mp_block = p;
+   mem_pool->mp_block = p;
}
 
p->next_block = NULL;
@@ -700,7 +700,7 @@ static struct mp_block *pool_alloc_block_with_size(size_t 
block_alloc)
return p;
 }
 
-static void *pool_alloc(size_t len)
+static void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 {
struct mp_block *p;
void *r;
@@ -709,7 +709,7 @@ static void *pool_alloc(size_t len)
if (len & (sizeof(uintmax_t) - 1))
len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
 
-   p = fi_mem_pool.mp_block;
+   p = mem_pool->mp_block;
 
/*
 * In performance profiling, there was a minor perf benefit to
@@ -724,10 +724,10 @@ static void *pool_alloc(size_t len)
}
 
if (!p) {
-   if (len >= (fi_mem_pool.block_alloc / 2))
-   p = pool_alloc_block_with_size(len);
+   if (len >= (mem_pool->block_alloc / 2))
+   p = mem_pool_alloc_block_with_size(mem_pool, len);
else
-   p = pool_alloc_block();
+   p = mem_pool_alloc_block(mem_pool);
}
 
r = p->next_free;
@@ -735,10 +735,10 @@ static void *pool_alloc(size_t len)
return r;
 }
 
-static void *pool_calloc(size_t count, size_t size)
+static void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t 
size)
 {
size_t len = st_mult(count, size);
-   void *r = pool_alloc(len);
+   void *r = mem_pool_alloc(mem_pool, len);
memset(r, 0, len);
return r;
 }
@@ -746,7 +746,7 @@ static void *pool_calloc(size_t count, size_t size)
 static char *pool_strdup(const char *s)
 {
size_t len = strlen(s) + 1;
-   char *r = pool_alloc(len);
+   char *r = mem_pool_alloc(_mem_pool, len);
memcpy(r, s, len);
return r;
 }
@@ -755,7 +755,7 @@ static void insert_mark(uintmax_t idnum, struct 
object_entry *oe)
 {
struct mark_set *s = marks;
while ((idnum >> s->shift) >= 1024) {
-   s = pool_calloc(1, sizeof(struct mark_set));
+   s = mem_pool_calloc(_mem_pool, 1, sizeof(struct mark_set));
s->shift = marks->shift + 10;
s->data.sets[0] = marks;
marks = s;
@@ -764,7 +764,7 @@ static void insert_mark(uintmax_t idnum, struct 
object_entry *oe)
uintmax_t i = idnum >> s->shift;
idnum -= i << s->shift;
if (!s->data.sets[i]) {
-   s->data.sets[i] = pool_calloc(1, sizeof(struct 
mark_set));
+   

Re: [RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-23 Thread Eric Sunshine
Thanks for splitting these changes into smaller, more manageable
chunks. A couple non-code comments below apply to both patches in this
2-patch series even though I'm responding only to this patch. (The
actual code changes in the other patch looked fine and the patch was
easily digested.)

On Fri, Mar 23, 2018 at 12:39 AM, Wink Saville <w...@saville.com> wrote:
> rebase: Update invocation of rebase dot-sourced scripts

Nit: On this project, the summary line is not capitalized, so: s/Update/update/

> The backend scriptlets for "git rebase" were structured in a

On this project, commit messages are written in imperative mood. The
commit message Junio suggested[1] said "are structured", which makes
for a better imperative mood fit.

> bit unusual way for historical reasons.  Originally, it was
> designed in such a way that dot-sourcing them from "git
> rebase" would be sufficient to invoke the specific backend.
>
> When it was discovered that some shell implementations
> (e.g. FreeBSD 9.x) misbehaved when exiting with a "return"
> is executed at the top level of a dot-sourced script (the
> original was expecting that the control returns to the next
> command in "git rebase" after dot-sourcing the scriptlet).

ECANTPARSE: This paragraph is grammatically corrupt.

?  "When {something}..." but then what?

?  "...when exiting with a "return" is executed"

> To fix this issue the whole body of git-rebase--$backend.sh
> was made into a shell function git_rebase__$backend and then
> the last statement of the scriptlet would invoke the function.

Junio's proposed commit message[1] called this a "workaround", not a
"fix", and, indeed, "workaround" better characterizes that change. If
anything, _this_ patch is a (more correct) "fix" for that workaround.

> Here the call is moved to "git rebase" side, instead of at the

Junio's version, using imperative mood, said "Move the call...".

> end of each scriptlet.  This give us a more normal arrangement
> where the scriptlet function library and allows multiple functions
> to be implemented in a scriptlet.

ECANTPARSE: Grammatically corrupt.

?  "where the ... library and allows..."

Overall, Junio's proposed message followed project practice
(imperative mood) more closely and felt somewhat more coherent
(despite the run-on sentence in the first paragraph and the apparent
incorrect explanation of top-level "return" misbehavior -- the in-code
comment says top-level "return" was essentially a no-op in broken
shells, whereas he said it exited the shell).

Perhaps the following re-write addresses the above concerns:

Due to historical reasons, the backend scriptlets for "git rebase"
are structured a bit unusually. As originally designed,
dot-sourcing them from "git rebase" was sufficient to invoke the
specific backend.

However, it was later discovered that some shell implementations
(e.g. FreeBSD 9.x) misbehaved by continuing to execute statements
following a top-level "return" rather than returning control to
the next statement in "git rebase" after dot-sourcing the
scriptlet. To work around this shortcoming, the whole body of
git-rebase--$backend.sh was made into a shell function
git_rebase__$backend, and then the very last line of the scriptlet
called that function.

A more normal architecture is for a dot-sourced scriptlet merely
to define functions (thus acting as a function library), and for
those functions to be called by the script doing the dot-sourcing.
Migrate to this arrangement by moving the git_rebase__$backend
call from the end of a scriptlet into "git rebase" itself.

While at it, remove the large comment block from each scriptlet
explaining this historic anomaly since it serves no purpose under
the new normalized architecture in which a scriptlet is merely a
function library.

> Signed-off-by: Wink Saville <w...@saville.com>
> Reviewed-by: Junio C Hamano <gits...@pobox.com>
> Reviewed-by: Eric Sunsine <suns...@sunshineco.com>

Despite its name, on this project, a Reviewed-by: does not mean merely
that a person looked at and commented on a patch. Rather, it is a way
for a person to say "I have studied and understood the patch and feel
that it is ready for inclusion in the project." Reviewed-by:'s are
therefore always given explicitly by the reviewer and Junio adds them
to a patch when queuing. (Reviewed-by:'s are not always given, though,
even when a reviewer has not found problems with a patch. For
instance, even if I review and comment on this or subsequent patches,
I will not give a Reviewed-by: since I'm not an area expert, thus
wouldn't feel comfortable stating 

[RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-22 Thread Wink Saville
The backend scriptlets for "git rebase" were structured in a
bit unusual way for historical reasons.  Originally, it was
designed in such a way that dot-sourcing them from "git
rebase" would be sufficient to invoke the specific backend.

When it was discovered that some shell implementations
(e.g. FreeBSD 9.x) misbehaved when exiting with a "return"
is executed at the top level of a dot-sourced script (the
original was expecting that the control returns to the next
command in "git rebase" after dot-sourcing the scriptlet).

To fix this issue the whole body of git-rebase--$backend.sh
was made into a shell function git_rebase__$backend and then
the last statement of the scriptlet would invoke the function.

Here the call is moved to "git rebase" side, instead of at the
end of each scriptlet.  This give us a more normal arrangement
where the scriptlet function library and allows multiple functions
to be implemented in a scriptlet.

Signed-off-by: Wink Saville 
Reviewed-by: Junio C Hamano 
Reviewed-by: Eric Sunsine 
---
 git-rebase--am.sh  | 11 ---
 git-rebase--interactive.sh | 11 ---
 git-rebase--merge.sh   | 11 ---
 git-rebase.sh  |  2 ++
 4 files changed, 2 insertions(+), 33 deletions(-)

diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index be3f06892..e5fd6101d 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -4,15 +4,6 @@
 # Copyright (c) 2010 Junio C Hamano.
 #
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__am () {
 
 case "$action" in
@@ -105,5 +96,3 @@ fi
 move_to_original_branch
 
 }
-# ... and then we call the whole thing.
-git_rebase__am
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 561e2660e..213d75f43 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,15 +740,6 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__interactive () {
 
 case "$action" in
@@ -1029,5 +1020,3 @@ fi
 do_rest
 
 }
-# ... and then we call the whole thing.
-git_rebase__interactive
diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh
index ceb715453..685f48ca4 100644
--- a/git-rebase--merge.sh
+++ b/git-rebase--merge.sh
@@ -104,15 +104,6 @@ finish_rb_merge () {
say All done.
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__merge () {
 
 case "$action" in
@@ -171,5 +162,3 @@ done
 finish_rb_merge
 
 }
-# ... and then we call the whole thing.
-git_rebase__merge
diff --git a/git-rebase.sh b/git-rebase.sh
index a1f6e5de6..4595a316a 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -196,7 +196,9 @@ run_specific_rebase () {
export GIT_EDITOR
autosquash=
fi
+   # Source the code and invoke it
. git-rebase--$type
+   git_rebase__$type
ret=$?
if test $ret -eq 0
then
-- 
2.16.2



[GSoC][PATCH 2/2] git-ci: update .pylintrc to ignore current warning

2018-03-13 Thread Viet Hung Tran
pylint's check for unused variables, global statements,
redefined builtins, undefined loop variables, and unused imports are
disabled.

The current configuration allows git-p4.py to pass the check in Travis CI
in my forked repository.

Here is the link for the successful built: 
https://travis-ci.org/VietHTran/git/builds/353155158 

Signed-off-by: Viet Hung Tran 
---
 .pylintrc | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/.pylintrc b/.pylintrc
index 0db42d646..09a0019d6 100644
--- a/.pylintrc
+++ b/.pylintrc
@@ -60,20 +60,13 @@ enable=import-error,
used-before-assignment,
cell-var-from-loop,
global-variable-undefined,
-   redefined-builtin,
redefine-in-handler,
-   unused-import,
unused-wildcard-import,
global-variable-not-assigned,
-   undefined-loop-variable,
-   global-statement,
global-at-module-level,
bad-open-mode,
redundant-unittest-assert,
boolean-datetime,
-   # Has common issues with our style due to
-   # https://github.com/PyCQA/pylint/issues/210
-   unused-variable
 
 # Things we'd like to enable someday:
 # redefined-outer-name (requires a bunch of work to clean up our code first)
-- 
2.16.2.440.gc6284da



git submodule update - reset instead of checkout?

2018-03-09 Thread Andreas Krey
Hi everyone,

I've been reading up on the current state of git submodules
(our customer's customers want to use them, causing a slight
groan from our customer).

The usability thing I wonder about - with git submodule update,
is it necessary to detach the head to the current (or upstream)
head, instead of resetting the current branch to that state?

Primary interest in the question: Seeing 'detached head' scares
almost everybody. To brainstorm:

- as we can already use 'submodule update --remote' to update
  to the remote head of a branch, it would be logical to have
  that branch checked out locally (and unfortunately, potentially
  have that branch's name conflict with the remote branch setup).

- when developers more or less accidentally commit on the detached
  head, all is not lost yet (I remember this being differently),
  but if the next 'submodule update' just resets again, the commit
  just made is still dropped, just as in the detached head state.

- So, we'd need to have 'submodule update' act closer to the pull or
  rebase counterparts and refuse to just lose commits (or uncommitted
  changes).

Having a checked-out branch in the submodule would have the advantage
that I can 'just' push local commits. At the moment, doing that requires
a relatively intricate dance, not at all similar to working in the
base (parent) module.

I'm working on hooks that automatically update the submodules after
a commit change (merge, pull, checkout) in the parent module, but
with the additional feature of (a) keeping a branch checked out
and (b) avoid discarding local changes. Probably means I shouldn't
invoke 'submodule update' at all, and deal with everyting myself.

Any thoughs/comments/helpful hints?

(Addional complexity: egit/jgit is in use as well, and the work model
we will arrive at probabaly needs to work with the current egit.)

- Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800


I got a forced update after I run git pull --rebase

2018-03-07 Thread ZhenTian
I can't reproduct my issue, this is my first time, but my colleague
came across this issue several weeks ago.

After I pushed my commit to git server without rejection. I run git
pull --rebase, then I got a forced update, and my last commit is
missing.

I have asked a question on StackOverflow, there is more details about
this issue: 
https://stackoverflow.com/questions/49164906/why-git-pull-rebase-results-a-forced-update

I have no idea about this.

I'm using Ubuntu 16.04.4 with Git 2.16.2, PyCharm IDE.


Sincerely,
Tian Zhen


git-scm.com update

2018-03-06 Thread Jeff King
Last year I reported on the state of the git-scm.com website:

  
https://public-inbox.org/git/20170202023349.7fopb3a6pc6dk...@sigill.intra.peff.net/

There was a little bit of public discussion, and I privately got
approximately one zillion offers to host the site or otherwise help with
it. Thank you to everybody who responded.

Here's an update on what happened since then:

 - we now have a small group of maintainers able to triage incoming bug
   reports and patches, and generate some fixes. Thanks especially to
   Jean-Noël Avila, Pedro Rijo, and Samuel Lijin for their work over the
   past year.

 - we resolved most of the performance issues. These were partly due to
   some inefficient database queries, but mostly it was resolved with
   lots and lots of caching.

   The main problem, it turned out, is just that we get a _lot_ of hits.
   Like 5-6 million requests per day (that's individual HTTP requests;
   analytics report that we get close to 200,000 unique visitors on any
   given weekday).

   Details on the caching are in the architecture document below.

 - the site remains a rails app. This is probably overkill, but it was
   the path of least resistance to keep it one. Converting to a static
   site would require at least some grunt-work, but also figuring out
   some solution for the site-wide search. There's no immediate plan to
   move to anything else.

 - previously the site was being paid for by GitHub (and was tied to
   GitHub's Heroku account). We could have continued that indefinitely,
   but I wanted to move it to a community-owned account. That happened
   (details in the architecture document).

   Rather than ask for money from GitHub or elsewhere to cover hosting
   costs (which would have been easy -- we had quite a few offers!), I
   instead approached the companies whose services we're using to host
   the site and asked for direct sponsorship. Every one of them was
   happy to oblige, and Heroku in particular was helpful with migrating
   the site to the new account. So we're currently receiving free
   service from Heroku, Cloudflare, and Bonsai.

 - I've written up a few documents about how the site is organized. One
   higher-level page on the site for bug reporters, etc:

 https://github.com/git/git-scm.com/pull/1172/files

   and one more detailed architecture document:

 https://github.com/git/git-scm.com/blob/master/ARCHITECTURE.md

So at this point I think the site is in reasonably good shape going
forward. One of the long-standing issues is that the design is not very
responsive, and looks bad on phones and tablets. Jason Long, who did
much of the original site design, is working on a visual refresh that
should fix this and update some of the messier and outdated parts of the
CSS and related tooling. The plan is to have the refresh running
alongside the production site soon, so that we can gather feedback and
iterate from there.

-Peff


[PATCH 1/2] object.h: update flag allocation comment

2018-03-06 Thread Nguyễn Thái Ngọc Duy
Since the "flags" is shared, it's a good idea to keep track of who
uses what bit. When we need to use more flags in library code, we can
be sure it won't be re-used for another purpose by some caller.

While at there, fix the location of "5" (should be in a different
column than "4" two lines down)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
---
 builtin/index-pack.c | 1 +
 builtin/pack-objects.c   | 1 +
 builtin/reflog.c | 1 +
 builtin/unpack-objects.c | 1 +
 object.h | 6 +-
 5 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..b4239a633c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -49,6 +49,7 @@ struct thread_local {
    int pack_fd;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define FLAG_LINK (1u<<20)
 #define FLAG_CHECKED (1u<<21)
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..833126e671 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2549,6 +2549,7 @@ static void read_object_list_from_stdin(void)
}
 }
 
+/* Remember to update object flag allocation in object.h */
 #define OBJECT_ADDED (1u<<20)
 
 static void show_commit(struct commit *commit, void *data)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 2233725315..95becf0e7e 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -52,6 +52,7 @@ struct collect_reflog_cb {
int nr;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define INCOMPLETE (1u<<10)
 #define STUDYING   (1u<<11)
 #define REACHABLE  (1u<<12)
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 7235d2ffbf..fba9498ffe 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -158,6 +158,7 @@ struct obj_info {
struct object *obj;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define FLAG_OPEN (1u<<20)
 #define FLAG_WRITTEN (1u<<21)
 
diff --git a/object.h b/object.h
index 87563d9056..15901d2901 100644
--- a/object.h
+++ b/object.h
@@ -29,7 +29,7 @@ struct object_array {
 /*
  * object flag allocation:
  * revision.h:  0-1026
- * fetch-pack.c:0---5
+ * fetch-pack.c:05
  * walker.c:0-2
  * upload-pack.c:   4   1119
  * builtin/blame.c:   12-13
@@ -40,6 +40,10 @@ struct object_array {
  * sha1_name.c: 20
  * list-objects-filter.c: 21
  * builtin/fsck.c:  0--3
+ * builtin/index-pack.c:2021
+ * builtin/pack-objects.c:  20
+ * builtin/reflog.c:  10--12
+ * builtin/unpack-objects.c:2021
  */
 #define FLAG_BITS  27
 
-- 
2.16.2.784.gb291bd247e



[PATCH v3 07/13] perl: update our ancient copy of Error.pm

2018-03-03 Thread Ævar Arnfjörð Bjarmason
The Error.pm shipped with Git as a fallback if there was no Error.pm
on the system was released in April 2006. There's been dozens of
releases since then, the latest at August 7, 2017. Let's update to
that.

I don't know of anything we need from this new release or which this
fixes. This change is simply a matter of keeping up with
upstream. Before this users who'd install git via their package system
would get an up-to-date Error.pm, but if it's installed from source
they'd get one more than a decade old.

This undoes a local hack we'd accumulated in 96bc4de85c ("Eliminate
Scalar::Util usage from private-Error.pm", 2006-07-26), it's been
redundant since my d48b284183 ("perl: bump the required Perl version
to 5.8 from 5.6.[21]", 2010-09-24).

This also undoes 3a51467b94 ("Typo fix: replacing it's -> its",
2013-04-13). This is the Nth time I find that some upstream code of
ours (in contrib/, in sha1dc/ and now in perl/ ...) has diverged from
upstream because of some tree-wide typo fixing. Let's not do those
fixes against upstream projects, it's more valuable that we have a 1=1
mapping to upstream than to fix typos in docs we never even generate
from this code. If someone wants to fix typos in them fine, but they
should do it with a patch to upstream which git.git can then
incorporate.

The upstream code doesn't cleanly pass a --check, so I'm adding a
.gitattributes file for similar reasons as done for sha1dc in
5d184f468e ("sha1dc: ignore indent-with-non-tab whitespace
violations", 2017-06-06).

The updated source was retrieved from
https://fastapi.metacpan.org/source/SHLOMIF/Error-0.17025/lib/Error.pm

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/.gitattributes |   1 +
 perl/Git/FromCPAN/Error.pm   | 296 +--
 2 files changed, 256 insertions(+), 41 deletions(-)
 create mode 100644 perl/Git/FromCPAN/.gitattributes

diff --git a/perl/Git/FromCPAN/.gitattributes b/perl/Git/FromCPAN/.gitattributes
new file mode 100644
index 00..8b64fc5e22
--- /dev/null
+++ b/perl/Git/FromCPAN/.gitattributes
@@ -0,0 +1 @@
+/Error.pm whitespace=-blank-at-eof
diff --git a/perl/Git/FromCPAN/Error.pm b/perl/Git/FromCPAN/Error.pm
index 6098135ae2..f9c36e9e98 100644
--- a/perl/Git/FromCPAN/Error.pm
+++ b/perl/Git/FromCPAN/Error.pm
@@ -12,10 +12,12 @@
 package Error;
 
 use strict;
+use warnings;
+
 use vars qw($VERSION);
 use 5.004;
 
-$VERSION = "0.15009";
+$VERSION = "0.17025";
 
 use overload (
'""'   =>   'stringify',
@@ -32,21 +34,35 @@ $Error::THROWN = undef; # last error thrown, a 
workaround until die $ref works
 my $LAST;  # Last error created
 my %ERROR; # Last error associated with package
 
-sub throw_Error_Simple
+sub _throw_Error_Simple
 {
 my $args = shift;
 return Error::Simple->new($args->{'text'});
 }
 
-$Error::ObjectifyCallback = \_Error_Simple;
+$Error::ObjectifyCallback = \&_throw_Error_Simple;
 
 
 # Exported subs are defined in Error::subs
 
+use Scalar::Util ();
+
 sub import {
 shift;
+my @tags = @_;
 local $Exporter::ExportLevel = $Exporter::ExportLevel + 1;
-Error::subs->import(@_);
+
+@tags = grep {
+   if( $_ eq ':warndie' ) {
+  Error::WarnDie->import();
+  0;
+   }
+   else {
+  1;
+   }
+} @tags;
+
+Error::subs->import(@tags);
 }
 
 # I really want to use last for the name of this method, but it is a keyword
@@ -107,10 +123,6 @@ sub stacktrace {
 $text;
 }
 
-# Allow error propagation, ie
-#
-# $ber->encode(...) or
-#return Error->prior($ber)->associate($ldap);
 
 sub associate {
 my $err = shift;
@@ -130,6 +142,7 @@ sub associate {
 return;
 }
 
+
 sub new {
 my $self = shift;
 my($pkg,$file,$line) = caller($Error::Depth);
@@ -246,6 +259,10 @@ sub value {
 
 package Error::Simple;
 
+use vars qw($VERSION);
+
+$VERSION = "0.17025";
+
 @Error::Simple::ISA = qw(Error);
 
 sub new {
@@ -288,14 +305,6 @@ use vars qw(@EXPORT_OK @ISA %EXPORT_TAGS);
 
 @ISA = qw(Exporter);
 
-
-sub blessed {
-   my $item = shift;
-   local $@; # don't kill an outer $@
-   ref $item and eval { $item->can('can') };
-}
-
-
 sub run_clauses ($$$\@) {
 my($clauses,$err,$wantarray,$result) = @_;
 my $code = undef;
@@ -314,16 +323,17 @@ sub run_clauses ($$$\@) {
my $pkg = $catch->[$i];
unless(defined $pkg) {
#except
-   splice(@$catch,$i,2,$catch->[$i+1]->());
+   splice(@$catch,$i,2,$catch->[$i+1]->($err));
$i -= 2;
next CATCHLOOP;
}
-   elsif(blessed($err) && $err->isa($pkg)) {
+   elsif(Scalar::Util::blessed($err) && $err->isa($pkg)) {
$code = $catch-&

[PATCH v3 08/13] perl: update our copy of Mail::Address

2018-03-03 Thread Ævar Arnfjörð Bjarmason
Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
23, 2018). Like the preceding Error.pm update this is done simply to
keep up-to-date with upstream, and as can be shown from the diff
there's no functional changes.

The updated source was retrieved from
https://fastapi.metacpan.org/source/MARKOV/MailTools-2.20/lib/Mail/Address.pm

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/Mail/Address.pm | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
index 13b2ff7d05..683d490b2b 100644
--- a/perl/Git/FromCPAN/Mail/Address.pm
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -1,10 +1,14 @@
-# Copyrights 1995-2017 by [Mark Overmeer <p...@overmeer.net>].
+# Copyrights 1995-2018 by [Mark Overmeer].
 #  For other contributors see ChangeLog.
 # See the manual pages for details on the licensing terms.
 # Pod stripped from pm file by OODoc 2.02.
+# This code is part of the bundle MailTools.  Meta-POD processed with
+# OODoc into POD and HTML manual-pages.  See README.md for Copyright.
+# Licensed under the same terms as Perl itself.
+
 package Mail::Address;
 use vars '$VERSION';
-$VERSION = '2.19';
+$VERSION = '2.20';
 
 use strict;
 
-- 
2.15.1.424.g9478a66081



[PATCH v2 07/13] perl: update our ancient copy of Error.pm

2018-02-25 Thread Ævar Arnfjörð Bjarmason
The Error.pm shipped with Git as a fallback if there was no Error.pm
on the system was released in April 2006. There's been dozens of
releases since then, the latest at August 7, 2017. Let's update to
that.

I don't know of anything we need from this new release or which this
fixes. This change is simply a matter of keeping up with
upstream. Before this users who'd install git via their package system
would get an up-to-date Error.pm, but if it's installed from source
they'd get one more than a decade old.

This undoes a local hack we'd accumulated in 96bc4de85c ("Eliminate
Scalar::Util usage from private-Error.pm", 2006-07-26), it's been
redundant since my d48b284183 ("perl: bump the required Perl version
to 5.8 from 5.6.[21]", 2010-09-24).

This also undoes 3a51467b94 ("Typo fix: replacing it's -> its",
2013-04-13). This is the Nth time I find that some upstream code of
ours (in contrib/, in sha1dc/ and now in perl/ ...) has diverged from
upstream because of some tree-wide typo fixing. Let's not do those
fixes against upstream projects, it's more valuable that we have a 1=1
mapping to upstream than to fix typos in docs we never even generate
from this code. If someone wants to fix typos in them fine, but they
should do it with a patch to upstream which git.git can then
incorporate.

The upstream code doesn't cleanly pass a --check, so I'm adding a
.gitattributes file for similar reasons as done for sha1dc in
5d184f468e ("sha1dc: ignore indent-with-non-tab whitespace
violations", 2017-06-06).

The updated source was retrieved from
https://fastapi.metacpan.org/source/SHLOMIF/Error-0.17025/lib/Error.pm

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/.gitattributes |   1 +
 perl/Git/FromCPAN/Error.pm   | 296 +--
 2 files changed, 256 insertions(+), 41 deletions(-)
 create mode 100644 perl/Git/FromCPAN/.gitattributes

diff --git a/perl/Git/FromCPAN/.gitattributes b/perl/Git/FromCPAN/.gitattributes
new file mode 100644
index 00..8b64fc5e22
--- /dev/null
+++ b/perl/Git/FromCPAN/.gitattributes
@@ -0,0 +1 @@
+/Error.pm whitespace=-blank-at-eof
diff --git a/perl/Git/FromCPAN/Error.pm b/perl/Git/FromCPAN/Error.pm
index 6098135ae2..f9c36e9e98 100644
--- a/perl/Git/FromCPAN/Error.pm
+++ b/perl/Git/FromCPAN/Error.pm
@@ -12,10 +12,12 @@
 package Error;
 
 use strict;
+use warnings;
+
 use vars qw($VERSION);
 use 5.004;
 
-$VERSION = "0.15009";
+$VERSION = "0.17025";
 
 use overload (
'""'   =>   'stringify',
@@ -32,21 +34,35 @@ $Error::THROWN = undef; # last error thrown, a 
workaround until die $ref works
 my $LAST;  # Last error created
 my %ERROR; # Last error associated with package
 
-sub throw_Error_Simple
+sub _throw_Error_Simple
 {
 my $args = shift;
 return Error::Simple->new($args->{'text'});
 }
 
-$Error::ObjectifyCallback = \_Error_Simple;
+$Error::ObjectifyCallback = \&_throw_Error_Simple;
 
 
 # Exported subs are defined in Error::subs
 
+use Scalar::Util ();
+
 sub import {
 shift;
+my @tags = @_;
 local $Exporter::ExportLevel = $Exporter::ExportLevel + 1;
-Error::subs->import(@_);
+
+@tags = grep {
+   if( $_ eq ':warndie' ) {
+  Error::WarnDie->import();
+  0;
+   }
+   else {
+  1;
+   }
+} @tags;
+
+Error::subs->import(@tags);
 }
 
 # I really want to use last for the name of this method, but it is a keyword
@@ -107,10 +123,6 @@ sub stacktrace {
 $text;
 }
 
-# Allow error propagation, ie
-#
-# $ber->encode(...) or
-#return Error->prior($ber)->associate($ldap);
 
 sub associate {
 my $err = shift;
@@ -130,6 +142,7 @@ sub associate {
 return;
 }
 
+
 sub new {
 my $self = shift;
 my($pkg,$file,$line) = caller($Error::Depth);
@@ -246,6 +259,10 @@ sub value {
 
 package Error::Simple;
 
+use vars qw($VERSION);
+
+$VERSION = "0.17025";
+
 @Error::Simple::ISA = qw(Error);
 
 sub new {
@@ -288,14 +305,6 @@ use vars qw(@EXPORT_OK @ISA %EXPORT_TAGS);
 
 @ISA = qw(Exporter);
 
-
-sub blessed {
-   my $item = shift;
-   local $@; # don't kill an outer $@
-   ref $item and eval { $item->can('can') };
-}
-
-
 sub run_clauses ($$$\@) {
 my($clauses,$err,$wantarray,$result) = @_;
 my $code = undef;
@@ -314,16 +323,17 @@ sub run_clauses ($$$\@) {
my $pkg = $catch->[$i];
unless(defined $pkg) {
#except
-   splice(@$catch,$i,2,$catch->[$i+1]->());
+   splice(@$catch,$i,2,$catch->[$i+1]->($err));
$i -= 2;
next CATCHLOOP;
}
-   elsif(blessed($err) && $err->isa($pkg)) {
+   elsif(Scalar::Util::blessed($err) && $err->isa($pkg)) {
$code = $catch-&

[PATCH v2 08/13] perl: update our copy of Mail::Address

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
23, 2018). Like the preceding Error.pm update this is done simply to
keep up-to-date with upstream, and as can be shown from the diff
there's no functional changes.

The updated source was retrieved from
https://fastapi.metacpan.org/source/MARKOV/MailTools-2.20/lib/Mail/Address.pm

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/Mail/Address.pm | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
index 13b2ff7d05..683d490b2b 100644
--- a/perl/Git/FromCPAN/Mail/Address.pm
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -1,10 +1,14 @@
-# Copyrights 1995-2017 by [Mark Overmeer <p...@overmeer.net>].
+# Copyrights 1995-2018 by [Mark Overmeer].
 #  For other contributors see ChangeLog.
 # See the manual pages for details on the licensing terms.
 # Pod stripped from pm file by OODoc 2.02.
+# This code is part of the bundle MailTools.  Meta-POD processed with
+# OODoc into POD and HTML manual-pages.  See README.md for Copyright.
+# Licensed under the same terms as Perl itself.
+
 package Mail::Address;
 use vars '$VERSION';
-$VERSION = '2.19';
+$VERSION = '2.20';
 
 use strict;
 
-- 
2.15.1.424.g9478a66081



Re: [PATCH 4/8] perl: update our ancient copy of Error.pm

2018-02-15 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Ævar Arnfjörð Bjarmason wrote:
>
>> The Error.pm shipped with Git as a fallback if there was no Error.pm
>> on the system was released in April 2006, there's been dozens of
>> releases since then, the latest at August 7, 2017, let's update to
>> that.
>
> Comma splices:
>  s/, there's/. There's/
>  s/, let's/. Let's/
>
> The one piece of information I was curious about that this (quite clear,
> thank you) commit message is missing is what changed in the intervening
> time.  Is this just about keeping up with upstream to make it easy to
> keep up later, or has upstream made any useful changes?  E.g. did any
> API or behaviors get better?

Will note that in v2, nothing we need changed, this is just a matter of
keeping up, and e.g. if I run Git on Debian I get a few month old
Error.pm, but if I build Git from source I get one that's more than a
decade old.

> Related: do we have to worry about in-tree users taking advantage of
> improved API and packagers forgetting to add a dependency on the new
> version?  Do we declare the minimal required Error.pm version somewhere
> (e.g. in the INSTALL file)?

We don't, but if that situation arises we could just start doing 'use
 ' to declare it, see "perldoc -f use", but right now
we work on seemingly every version of these modules ever released.


Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-15 Thread Ævar Arnfjörð Bjarmason

On Thu, Feb 15 2018, Matthieu Moy jotted:

> "Jonathan Nieder" <jrnie...@gmail.com> wrote:
>
>> Ævar Arnfjörð Bjarmason wrote:
>>
>> > Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
>> > 23, 2018). This should be a trivial update[1] but it seems the version
>> > Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
>> > copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
>> > version found on the CPAN. From the comment at the top of the file it
>> > looks like some OS version with the POD stripped, and with different
>> > indentation.
>>
>> Were there changes other than the POD stripping?
>
> No.
>
> I should have mentionned it in the commit message, but the one I took was
> from:
>
>   http://cpansearch.perl.org/src/MARKOV/MailTools-2.19/lib/Mail/Address.pm
>
> i.e. following the "source" link from:
>
>   http://search.cpan.org/~markov/MailTools-2.19/lib/Mail/Address.pod
>
> The link name suggested it was the actual source code but indeed it's a
> pre-processed file with the POD stripped.
>
> It would make sense to indicate explicitly where this file is from in
> this commit's message to avoid having the same discussion next time someone
> upgrades the package.

I see that I'm being a complete idiot and added the *.pod file in the
distro instead of the *.pm file, I got it from metacpan and didn't check
it carefully enough (and only tested with system Error.pm removed, not
Mail::Address), sorry. So yours was the right version. Will fix in a
re-roll.


Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-15 Thread Matthieu Moy
"Jonathan Nieder" <jrnie...@gmail.com> wrote:

> Ævar Arnfjörð Bjarmason wrote:
>
> > Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
> > 23, 2018). This should be a trivial update[1] but it seems the version
> > Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
> > copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
> > version found on the CPAN. From the comment at the top of the file it
> > looks like some OS version with the POD stripped, and with different
> > indentation.
>
> Were there changes other than the POD stripping?

No.

I should have mentionned it in the commit message, but the one I took was
from:

  http://cpansearch.perl.org/src/MARKOV/MailTools-2.19/lib/Mail/Address.pm

i.e. following the "source" link from:

  http://search.cpan.org/~markov/MailTools-2.19/lib/Mail/Address.pod

The link name suggested it was the actual source code but indeed it's a
pre-processed file with the POD stripped.

It would make sense to indicate explicitly where this file is from in
this commit's message to avoid having the same discussion next time someone
upgrades the package.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v3 23/23] cat-file: update of docs

2018-02-14 Thread Jeff King
On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:

> Update the docs for cat-file command. Some new formatting atoms added
> because of reusing ref-filter code.
> We do not support cat-file atoms in general formatting logic
> (there is just the support for cat-file), that is why some of the atoms
> are still explained in cat-file docs.
> We need to move these explanations when atoms will be supported
> by other commands.

OK, so I've read through the whole series now. I still think it really
needs some squashing in the middle to turn into a more comprehensible
series.

And I think by the end we need to address the atoms here that don't work
in ref-filter. We certainly want them to work in the long term, and the
fact that they don't is I think pointing to having the wrong
architecture here in the intermediate step. We should just have a single
set of placeholders, with no special "cat_file_data". If I understand
correctly, that's what's causing those atoms not to work in
for-each-ref.

-Peff


Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-14 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
> 23, 2018). This should be a trivial update[1] but it seems the version
> Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
> copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
> version found on the CPAN. From the comment at the top of the file it
> looks like some OS version with the POD stripped, and with different
> indentation.

Were there changes other than the POD stripping?

> Let's instead use the upstream version as-is, and without copyright
> notices stripped. Like Error.pm this doesn't cleanly pass --check, so
> add a .gitattributes file to ignore the errors.
>
> 1. 
> https://metacpan.org/diff/file?target=MARKOV/MailTools-2.20/lib%2FMail%2FAddress.pod=MARKOV%2FMailTools-2.19%2Flib%2FMail%2FAddress.pod
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
> ---
>  perl/Git/FromCPAN/Mail/.gitattributes |   1 +
>  perl/Git/FromCPAN/Mail/Address.pm | 436 
> +-
>  2 files changed, 163 insertions(+), 274 deletions(-)
>  create mode 100644 perl/Git/FromCPAN/Mail/.gitattributes

Yikes re the stripped POD with license notice. Thanks for fixing it.

Reviewed-by: Jonathan Nieder <jrnie...@gmail.com>


Re: [PATCH 4/8] perl: update our ancient copy of Error.pm

2018-02-14 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> The Error.pm shipped with Git as a fallback if there was no Error.pm
> on the system was released in April 2006, there's been dozens of
> releases since then, the latest at August 7, 2017, let's update to
> that.

Comma splices:
 s/, there's/. There's/
 s/, let's/. Let's/

The one piece of information I was curious about that this (quite clear,
thank you) commit message is missing is what changed in the intervening
time.  Is this just about keeping up with upstream to make it easy to
keep up later, or has upstream made any useful changes?  E.g. did any
API or behaviors get better?

Related: do we have to worry about in-tree users taking advantage of
improved API and packagers forgetting to add a dependency on the new
version?  Do we declare the minimal required Error.pm version somewhere
(e.g. in the INSTALL file)?

[...]
>  perl/Git/FromCPAN/.gitattributes |   1 +
>  perl/Git/FromCPAN/Error.pm   | 296 
> +--
>  2 files changed, 256 insertions(+), 41 deletions(-)
>  create mode 100644 perl/Git/FromCPAN/.gitattributes

Most of the added lines are documentation, so this diffstat doesn't look
half-bad.

Thanks for writing it.

Jonathan


[PATCH 5/8] perl: update our copy of Mail::Address

2018-02-14 Thread Ævar Arnfjörð Bjarmason
Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
23, 2018). This should be a trivial update[1] but it seems the version
Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
version found on the CPAN. From the comment at the top of the file it
looks like some OS version with the POD stripped, and with different
indentation.

Let's instead use the upstream version as-is, and without copyright
notices stripped. Like Error.pm this doesn't cleanly pass --check, so
add a .gitattributes file to ignore the errors.

1. 
https://metacpan.org/diff/file?target=MARKOV/MailTools-2.20/lib%2FMail%2FAddress.pod=MARKOV%2FMailTools-2.19%2Flib%2FMail%2FAddress.pod

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/Mail/.gitattributes |   1 +
 perl/Git/FromCPAN/Mail/Address.pm | 436 +-
 2 files changed, 163 insertions(+), 274 deletions(-)
 create mode 100644 perl/Git/FromCPAN/Mail/.gitattributes

diff --git a/perl/Git/FromCPAN/Mail/.gitattributes 
b/perl/Git/FromCPAN/Mail/.gitattributes
new file mode 100644
index 00..94f3e5bb86
--- /dev/null
+++ b/perl/Git/FromCPAN/Mail/.gitattributes
@@ -0,0 +1 @@
+/Address.pm whitespace=-trailing-space
diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
index 13b2ff7d05..ee333e0f5a 100644
--- a/perl/Git/FromCPAN/Mail/Address.pm
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -1,276 +1,164 @@
-# Copyrights 1995-2017 by [Mark Overmeer <p...@overmeer.net>].
-#  For other contributors see ChangeLog.
-# See the manual pages for details on the licensing terms.
-# Pod stripped from pm file by OODoc 2.02.
-package Mail::Address;
-use vars '$VERSION';
-$VERSION = '2.19';
-
-use strict;
-
-use Carp;
-
-# use locale;   removed in version 1.78, because it causes taint problems
-
-sub Version { our $VERSION }
-
-
-
-# given a comment, attempt to extract a person's name
-sub _extract_name
-{   # This function can be called as method as well
-my $self = @_ && ref $_[0] ? shift : undef;
-
-local $_ = shift
-or return '';
-
-# Using encodings, too hard. See Mail::Message::Field::Full.
-return '' if m/\=\?.*?\?\=/;
-
-# trim whitespace
-s/^\s+//;
-s/\s+$//;
-s/\s+/ /;
-
-# Disregard numeric names (e.g. 123456.1...@compuserve.com)
-return "" if /^[\d ]+$/;
-
-s/^\((.*)\)$/$1/; # remove outermost parenthesis
-s/^"(.*)"$/$1/;   # remove outer quotation marks
-s/\(.*?\)//g; # remove minimal embedded comments
-s/\\//g;  # remove all escapes
-s/^"(.*)"$/$1/;   # remove internal quotation marks
-s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
-s/,.*//;
-
-# Change casing only when the name contains only upper or only
-# lower cased characters.
-unless( m/[A-Z]/ && m/[a-z]/ )
-{   # Set the case of the name to first char upper rest lower
-s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
-s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
-s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
-s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III 
Support'
-}
+=encoding utf8
 
-# some cleanup
-s/\[[^\]]*\]//g;
-s/(^[\s'"]+|[\s'"]+$)//g;
-s/\s{2,}/ /g;
-
-$_;
-}
-
-sub _tokenise
-{   local $_ = join ',', @_;
-my (@words,$snippet,$field);
-
-s/\A\s+//;
-s/[\r\n]+/ /g;
-
-while ($_ ne '')
-{   $field = '';
-if(s/^\s*\(/(/ )# (...)
-{   my $depth = 0;
-
- PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
-{   $field .= $1;
-$depth++;
-while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
-{   $field .= $1;
-last PAREN unless --$depth;
-   $field .= $1 if s/^(([^\(\)\\]|\\.)+)//;
-}
-}
-
-carp "Unmatched () '$field' '$_'"
-if $depth;
-
-$field =~ s/\s+\Z//;
-push @words, $field;
-
-next;
-}
-
-if( s/^("(?:[^"\\]+|\\.)*")\s*//   # "..."
- || s/^(\[(?:[^\]\\]+|\\.)*\])\s*//# [...]
- || s/^([^\s()<>\@,;:\\".[\]]+)\s*//
- || s/^([()<>\@,;:\\".[\]])\s*//
-  )
-{   push @words, $1;
-next;
-}
-
-croak "Unrecognised line: $_";
-}
-
-push @words, ",";
-\@words;
-}
-
-sub _find_next
-{   my ($idx, $tokens, $len) = @_;
-
-while($idx < $len)
-{   my $c = $tokens->[$idx];
-return $c if $c eq ',' || $c eq ';' || $c eq '<';
-$idx++;
-}
-
-"";
-}
-
-sub _complete
-{   my ($class, $phrase, $address, $c

[PATCH 4/8] perl: update our ancient copy of Error.pm

2018-02-14 Thread Ævar Arnfjörð Bjarmason
The Error.pm shipped with Git as a fallback if there was no Error.pm
on the system was released in April 2006, there's been dozens of
releases since then, the latest at August 7, 2017, let's update to
that.

This undoes a local hack we'd accumulated in 96bc4de85c ("Eliminate
Scalar::Util usage from private-Error.pm", 2006-07-26), it's been
redundant since my d48b284183 ("perl: bump the required Perl version
to 5.8 from 5.6.[21]", 2010-09-24).

This also undoes 3a51467b94 ("Typo fix: replacing it's -> its",
2013-04-13). This is the Nth time I find that some upstream code of
ours (in contrib/, in sha1dc/ and now in perl/ ...) has diverged from
upstream because of some tree-wide typo fixing. Let's not do those
fixes against upstream projects, it's more valuable that we have a 1=1
mapping to upstream than to fix typos in docs we never even generate
from this code. If someone wants to fix typos in them fine, but they
should do it with a patch to upstream which git.git can then
incorporate.

The upstream code doesn't cleanly pass a --check, so I'm adding a
.gitattributes file for similar reasons as done for sha1dc in
5d184f468e ("sha1dc: ignore indent-with-non-tab whitespace
violations", 2017-06-06).

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 perl/Git/FromCPAN/.gitattributes |   1 +
 perl/Git/FromCPAN/Error.pm   | 296 +--
 2 files changed, 256 insertions(+), 41 deletions(-)
 create mode 100644 perl/Git/FromCPAN/.gitattributes

diff --git a/perl/Git/FromCPAN/.gitattributes b/perl/Git/FromCPAN/.gitattributes
new file mode 100644
index 00..8b64fc5e22
--- /dev/null
+++ b/perl/Git/FromCPAN/.gitattributes
@@ -0,0 +1 @@
+/Error.pm whitespace=-blank-at-eof
diff --git a/perl/Git/FromCPAN/Error.pm b/perl/Git/FromCPAN/Error.pm
index 6098135ae2..f9c36e9e98 100644
--- a/perl/Git/FromCPAN/Error.pm
+++ b/perl/Git/FromCPAN/Error.pm
@@ -12,10 +12,12 @@
 package Error;
 
 use strict;
+use warnings;
+
 use vars qw($VERSION);
 use 5.004;
 
-$VERSION = "0.15009";
+$VERSION = "0.17025";
 
 use overload (
'""'   =>   'stringify',
@@ -32,21 +34,35 @@ $Error::THROWN = undef; # last error thrown, a 
workaround until die $ref works
 my $LAST;  # Last error created
 my %ERROR; # Last error associated with package
 
-sub throw_Error_Simple
+sub _throw_Error_Simple
 {
 my $args = shift;
 return Error::Simple->new($args->{'text'});
 }
 
-$Error::ObjectifyCallback = \_Error_Simple;
+$Error::ObjectifyCallback = \&_throw_Error_Simple;
 
 
 # Exported subs are defined in Error::subs
 
+use Scalar::Util ();
+
 sub import {
 shift;
+my @tags = @_;
 local $Exporter::ExportLevel = $Exporter::ExportLevel + 1;
-Error::subs->import(@_);
+
+@tags = grep {
+   if( $_ eq ':warndie' ) {
+  Error::WarnDie->import();
+  0;
+   }
+   else {
+  1;
+   }
+} @tags;
+
+Error::subs->import(@tags);
 }
 
 # I really want to use last for the name of this method, but it is a keyword
@@ -107,10 +123,6 @@ sub stacktrace {
 $text;
 }
 
-# Allow error propagation, ie
-#
-# $ber->encode(...) or
-#return Error->prior($ber)->associate($ldap);
 
 sub associate {
 my $err = shift;
@@ -130,6 +142,7 @@ sub associate {
 return;
 }
 
+
 sub new {
 my $self = shift;
 my($pkg,$file,$line) = caller($Error::Depth);
@@ -246,6 +259,10 @@ sub value {
 
 package Error::Simple;
 
+use vars qw($VERSION);
+
+$VERSION = "0.17025";
+
 @Error::Simple::ISA = qw(Error);
 
 sub new {
@@ -288,14 +305,6 @@ use vars qw(@EXPORT_OK @ISA %EXPORT_TAGS);
 
 @ISA = qw(Exporter);
 
-
-sub blessed {
-   my $item = shift;
-   local $@; # don't kill an outer $@
-   ref $item and eval { $item->can('can') };
-}
-
-
 sub run_clauses ($$$\@) {
 my($clauses,$err,$wantarray,$result) = @_;
 my $code = undef;
@@ -314,16 +323,17 @@ sub run_clauses ($$$\@) {
my $pkg = $catch->[$i];
unless(defined $pkg) {
#except
-   splice(@$catch,$i,2,$catch->[$i+1]->());
+   splice(@$catch,$i,2,$catch->[$i+1]->($err));
$i -= 2;
next CATCHLOOP;
}
-   elsif(blessed($err) && $err->isa($pkg)) {
+   elsif(Scalar::Util::blessed($err) && $err->isa($pkg)) {
$code = $catch->[$i+1];
while(1) {
my $more = 0;
-   local($Error::THROWN);
+   local($Error::THROWN, $@);
my $ok = eval {
+   $@ = $err;
if($wantarray) {
@{$result} = $code->($err,\$more);
}
@@ -341,10

Re: [PATCH v1] fsmonitor: update documentation to remove reference to invalid config settings

2018-02-14 Thread Junio C Hamano
Ben Peart <benpe...@microsoft.com> writes:

> Remove the reference to setting core.fsmonitor to `true` (or `false`) as those
> are not valid settings.
>
> Signed-off-by: Ben Peart <benpe...@microsoft.com>
> ---

Thanks.  It is a bit embarrassing that nobody caught it for this
long.  Will apply.

>
> Notes:
> Base Ref: master
> Web-Diff: https://github.com/benpeart/git/commit/4b7ec2c11e
> Checkout: git fetch https://github.com/benpeart/git fsmonitor_docs-v1 && 
> git checkout 4b7ec2c11e
>
>  Documentation/git-update-index.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-update-index.txt 
> b/Documentation/git-update-index.txt
> index bdb0342593..ad2383d7ed 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -484,8 +484,8 @@ the `core.fsmonitor` configuration variable (see
>  linkgit:git-config[1]) than using the `--fsmonitor` option to
>  `git update-index` in each repository, especially if you want to do so
>  across all repositories you use, because you can set the configuration
> -variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
> -and have it affect all repositories you touch.
> +variable in your `$HOME/.gitconfig` just once and have it affect all
> +repositories you touch.
>  
>  When the `core.fsmonitor` configuration variable is changed, the
>  file system monitor is added to or removed from the index the next time
>
> base-commit: e7e80778e705ea3f9332c634781d6d0f8c6eab64


[PATCH v1] fsmonitor: update documentation to remove reference to invalid config settings

2018-02-14 Thread Ben Peart
Remove the reference to setting core.fsmonitor to `true` (or `false`) as those
are not valid settings.

Signed-off-by: Ben Peart <benpe...@microsoft.com>
---

Notes:
Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/4b7ec2c11e
Checkout: git fetch https://github.com/benpeart/git fsmonitor_docs-v1 && 
git checkout 4b7ec2c11e

 Documentation/git-update-index.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index bdb0342593..ad2383d7ed 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -484,8 +484,8 @@ the `core.fsmonitor` configuration variable (see
 linkgit:git-config[1]) than using the `--fsmonitor` option to
 `git update-index` in each repository, especially if you want to do so
 across all repositories you use, because you can set the configuration
-variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
-and have it affect all repositories you touch.
+variable in your `$HOME/.gitconfig` just once and have it affect all
+repositories you touch.
 
 When the `core.fsmonitor` configuration variable is changed, the
 file system monitor is added to or removed from the index the next time

base-commit: e7e80778e705ea3f9332c634781d6d0f8c6eab64
-- 
2.15.0.windows.1



Re: [PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-13 Thread Jonathan Tan
On Thu,  8 Feb 2018 15:37:31 -0500
Derrick Stolee  wrote:

> It is possible to have multiple commit graph files in a pack directory,
> but only one is important at a time. Use a 'graph_head' file to point
> to the important file. Teach git-commit-graph to write 'graph_head' upon
> writing a new commit graph file.

You should probably include the rationale for a special "graph_head"
file that you describe here [1] in the commit message.

[1] https://public-inbox.org/git/99543db0-26e4-8daa-a580-b618497e4...@gmail.com/

> +char *get_graph_head_filename(const char *pack_dir)
> +{
> + struct strbuf fname = STRBUF_INIT;
> + strbuf_addstr(, pack_dir);
> + strbuf_addstr(, "/graph-head");
> + return strbuf_detach(, 0);

NULL, not 0.

> +}


Re: [PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-12 Thread Derrick Stolee

On 2/12/2018 3:37 PM, Junio C Hamano wrote:

Junio C Hamano <gits...@pobox.com> writes:


Derrick Stolee <sto...@gmail.com> writes:


It is possible to have multiple commit graph files in a pack directory,
but only one is important at a time. Use a 'graph_head' file to point
to the important file. Teach git-commit-graph to write 'graph_head' upon
writing a new commit graph file.

Why this design, instead of what "repack -a" would do, iow, if there
always is a singleton that is the only one that matters, shouldn't
the creation of that latest singleton just clear the older ones
before it returns control?

Note that I am not complaining---I am just curious why we want to
expose this "there is one relevant one but we keep irrelevant ones
we usually do not look at and need to be garbage collected" to end
users, and also expect readers of the series, resulting code and
docs would have the same puzzled feeling.



Aside: I forgot to mention in my cover letter that the experience around 
the "--delete-expired" flag for "git commit-graph write" is different 
than v2. If specified, we delete all ".graph" files in the pack 
directory other than the one referenced by "graph_head" at the beginning 
of the process or the one written by the process. If these deletes fail, 
then we ignore the failure (assuming that they are being used by another 
Git process). In usual cases, we will delete these expired files in the 
next instance. I believe this matches similar behavior in gc and repack.


-- Back to discussion about the value of "graph_head" --

The current design of using a pointer file (graph_head) is intended to 
have these benefits:


1. We do not need to rely on a directory listing and mtimes to determine 
which graph file to use.


2. If we write a new graph file while another git process is reading the 
existing graph file, we can update the graph_head pointer without 
deleting the file that is currently memory-mapped. (This is why we 
cannot just rely on a canonical file name, such as "the_graph", to store 
the data.)


3. We can atomically change the 'graph_head' file without interrupting 
concurrent git processes. I think this is different from the "repack" 
situation because a concurrent process would load all packfiles in the 
pack directory and possibly have open handles when the repack is trying 
to delete them.


4. We remain open to making the graph file incremental (as the MIDX 
feature is designed to do; see [1]). It is less crucial to have an 
incremental graph file structure (the graph file for the Windows 
repository is currently ~120MB versus a MIDX file of 1.25 GB), but the 
graph_head pattern makes this a possibility.


I tried to avoid item 1 due to personal taste, and since I am storing 
the files in the objects/pack directory (so that listing may be very 
large with a lot of wasted entries). This is less important with our 
pending change of moving the graph files to a different directory. This 
also satisfies items 2 and 3, as long as we never write graph files so 
quickly that we have a collision on mtime.


I cannot think of another design that satisfies item 4.

As for your end user concerns: My philosophy with this feature is that 
end users will never interact with the commit-graph builtin. 99% of 
users will benefit from a repack or GC automatically computing a commit 
graph (when we add that integration point). The other uses for the 
builtin are for users who want extreme control over their data, such as 
code servers and build agents.


Perhaps someone with experience managing large repositories with git in 
a server environment could chime in with some design requirements they 
would need.


Thanks,
-Stolee

[1] 
https://public-inbox.org/git/20180107181459.222909-2-dsto...@microsoft.com/

    [RFC PATCH 01/18] docs: Multi-Pack Index (MIDX) Design Notes


Re: [PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-12 Thread Junio C Hamano
Junio C Hamano  writes:

> Derrick Stolee  writes:
>
>> It is possible to have multiple commit graph files in a pack directory,
>> but only one is important at a time. Use a 'graph_head' file to point
>> to the important file. Teach git-commit-graph to write 'graph_head' upon
>> writing a new commit graph file.
>
> Why this design, instead of what "repack -a" would do, iow, if there
> always is a singleton that is the only one that matters, shouldn't
> the creation of that latest singleton just clear the older ones
> before it returns control?

Note that I am not complaining---I am just curious why we want to
expose this "there is one relevant one but we keep irrelevant ones
we usually do not look at and need to be garbage collected" to end
users, and also expect readers of the series, resulting code and
docs would have the same puzzled feeling.




Re: [PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-12 Thread Junio C Hamano
Derrick Stolee  writes:

> It is possible to have multiple commit graph files in a pack directory,
> but only one is important at a time. Use a 'graph_head' file to point
> to the important file. Teach git-commit-graph to write 'graph_head' upon
> writing a new commit graph file.

Why this design, instead of what "repack -a" would do, iow, if there
always is a singleton that is the only one that matters, shouldn't
the creation of that latest singleton just clear the older ones
before it returns control?


[PATCH v3 23/23] cat-file: update of docs

2018-02-12 Thread Olga Telezhnaya
Update the docs for cat-file command. Some new formatting atoms added
because of reusing ref-filter code.
We do not support cat-file atoms in general formatting logic
(there is just the support for cat-file), that is why some of the atoms
are still explained in cat-file docs.
We need to move these explanations when atoms will be supported
by other commands.

Signed-off-by: Olga Telezhnaia <olyatelezhn...@gmail.com>
Mentored-by: Christian Couder <christian.cou...@gmail.com>
Mentored by: Jeff King <p...@peff.net>
---
 Documentation/git-cat-file.txt | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index f90f09b03fae5..90639ac21d0e8 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -187,17 +187,8 @@ linkgit:git-rev-parse[1].
 You can specify the information shown for each object by using a custom
 ``. The `` is copied literally to stdout for each
 object, with placeholders of the form `%(atom)` expanded, followed by a
-newline. The available atoms are:
-
-`objectname`::
-   The 40-hex object name of the object.
-
-`objecttype`::
-   The type of the object (the same as `cat-file -t` reports).
-
-`objectsize`::
-   The size, in bytes, of the object (the same as `cat-file -s`
-   reports).
+newline. The available atoms are the same as that of
+linkgit:git-for-each-ref[1], but there are some additional ones:
 
 `objectsize:disk`::
The size, in bytes, that the object takes up on disk. See the

--
https://github.com/git/git/pull/452


Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason  writes:
>
>>> Will queue with the above typofix, together with the other one.  I
>>> am not sure if we want to say "Before 2.17", though.
>>
>> I'm just keeping in mind the user who later on upgrades git from say
>> 2.14 to 2.18 or something, and is able to find in the docs when/why this
>> new warning got added, which helps diagnose it.
>
> Ah, no, that is not what I meant.  I just didn't think '2.17' in
> that sentence may not be understood as "Git version 2.17" by most
> readers.

Ah, I see. Yes I agree, sorry. Do you mind fixing it up to just say
"Before Git version 2.17, ..." ?


Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Will queue with the above typofix, together with the other one.  I
>> am not sure if we want to say "Before 2.17", though.
>
> I'm just keeping in mind the user who later on upgrades git from say
> 2.14 to 2.18 or something, and is able to find in the docs when/why this
> new warning got added, which helps diagnose it.

Ah, no, that is not what I meant.  I just didn't think '2.17' in
that sentence may not be understood as "Git version 2.17" by most
readers.


Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason   writes:
>
>> +Before 2.17, the untracked cache had a bug where replacing a directory
>> +with a symlink to another directory could cause it to incorrectly show
>> +files tracked by git as untracked. See the "status: add a failing test
>> +showing a core.untrackedCache bug" commit to git.git. A workaround for
>> +that was (and this might work for other undiscoverd bugs in the
>> +future):
>
> s/undiscoverd/undiscovered/
>
> But more importantly, would it help _us_ to encourage people to
> squelch the diagnoses without telling us about potential breakage, I
> wonder, by telling them to do this for other undiscovered cases,
> too?

You mean including something like "if you see this the git ML would like
to hear about it"?

> Will queue with the above typofix, together with the other one.  I
> am not sure if we want to say "Before 2.17", though.

I'm just keeping in mind the user who later on upgrades git from say
2.14 to 2.18 or something, and is able to find in the docs when/why this
new warning got added, which helps diagnose it.

>> +
>> +
>> +$ git -c core.untrackedCache=false status
>> +
>> +
>> +This bug has also been shown to affect non-symlink cases of replacing
>> +a directory with a file when it comes to the internal structures of
>> +the untracked cache, but no case has been found where this resulted in
>> +wrong "git status" output.
>> +
>>  File System Monitor
>>  ---


Re: [PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason   writes:
>
>> When users upgrade to 2.17 they're going to have git yelling at them
>> (as my users did) on existing checkouts, with no indication whatsoever
>> that it's due to the UC or how to fix it.
>
> Wait.  Are you saying that the new warning is "warning" against a
> condition that is not an error?

No I mean the warning itself gives you no indication what the solution
is, or why it might be happening, and because it's printed on every
occurrence we had "git status" invocations on some big repos where there
would be hundreds of lines of the same warning for different
now-nonexisting directories.

Deferring it and just printing "we had N cases of..." would be better,
but would make the logic a lot more complex, I'm not sure how common
this would be in the wild, but as noted happened on a large proportion
of our checkouts, so having something mentioning this in the docs is
good.

I only had that git version out for about an hour, but had some users rm
-rfing their checkouts and re-cloning because they couldn't figure out
how to fix it.

Is noting it in the docs going to help all those users? No, but at least
we'll have something Google-able they're likely to find.

>> ... doesn't it only warn under that mode? I.e.:
>>
>> -"could not open directory '%s'")
>> +"core.untrackedCache: could not open directory '%s'")
>
> For example, if it attempts to open a directory which does *not*
> have to exist, and sees an error from opendir() due to ENOENT, then
> I do not think it should be giving a warning.  If we positively know
> that a directory should exist there and we cannot open it, of course
> we should warn.  Also, if we know a directory should be readable
> when it exists, then we should be warning when we see an error other
> than ENOENT.

*nod*, so not UC-specific, even though I've only seen it in relation to
that.


Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> +Before 2.17, the untracked cache had a bug where replacing a directory
> +with a symlink to another directory could cause it to incorrectly show
> +files tracked by git as untracked. See the "status: add a failing test
> +showing a core.untrackedCache bug" commit to git.git. A workaround for
> +that was (and this might work for other undiscoverd bugs in the
> +future):

s/undiscoverd/undiscovered/

But more importantly, would it help _us_ to encourage people to
squelch the diagnoses without telling us about potential breakage, I
wonder, by telling them to do this for other undiscovered cases,
too?

Will queue with the above typofix, together with the other one.  I
am not sure if we want to say "Before 2.17", though.

> +
> +
> +$ git -c core.untrackedCache=false status
> +
> +
> +This bug has also been shown to affect non-symlink cases of replacing
> +a directory with a file when it comes to the internal structures of
> +the untracked cache, but no case has been found where this resulted in
> +wrong "git status" output.
> +
>  File System Monitor
>  ---


Re: [PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> When users upgrade to 2.17 they're going to have git yelling at them
> (as my users did) on existing checkouts, with no indication whatsoever
> that it's due to the UC or how to fix it.

Wait.  Are you saying that the new warning is "warning" against a
condition that is not an error?

> ... doesn't it only warn under that mode? I.e.:
>
> -"could not open directory '%s'")
> +"core.untrackedCache: could not open directory '%s'")

For example, if it attempts to open a directory which does *not*
have to exist, and sees an error from opendir() due to ENOENT, then
I do not think it should be giving a warning.  If we positively know
that a directory should exist there and we cannot open it, of course
we should warn.  Also, if we know a directory should be readable
when it exists, then we should be warning when we see an error other
than ENOENT.

So...


[PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Document the bug tested for in my "status: add a failing test showing
a core.untrackedCache bug" and fixed in Duy's "dir.c: fix missing dir
invalidation in untracked code".

Since this is very likely something others will encounter in the
future on older versions, and it's not obvious how to fix it let's
document both that it exists, and how to "fix" it with a one-off
command.

As noted in that commit, even though this bug gets the untracked cache
into a bad state, we have not yet found a case where this is user
visible, and thus it makes sense for these docs to focus on the
symlink case only.

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
---
 Documentation/git-update-index.txt | 16 
 1 file changed, 16 insertions(+)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index bdb0342593..e30b185918 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -464,6 +464,22 @@ command reads the index; while when 
`--[no-|force-]untracked-cache`
 are used, the untracked cache is immediately added to or removed from
 the index.
 
+Before 2.17, the untracked cache had a bug where replacing a directory
+with a symlink to another directory could cause it to incorrectly show
+files tracked by git as untracked. See the "status: add a failing test
+showing a core.untrackedCache bug" commit to git.git. A workaround for
+that was (and this might work for other undiscoverd bugs in the
+future):
+
+
+$ git -c core.untrackedCache=false status
+
+
+This bug has also been shown to affect non-symlink cases of replacing
+a directory with a file when it comes to the internal structures of
+the untracked cache, but no case has been found where this resulted in
+wrong "git status" output.
+
 File System Monitor
 ---
 
-- 
2.15.1.424.g9478a66081



[PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Ævar Arnfjörð Bjarmason
On Fri, Feb 09 2018, Junio C. Hamano jotted:
> Ævar Arnfjörð Bjarmason <ava...@gmail.com> writes:
>> I think you / Nguyễn may not have seen my
>> <87d11omi2o@evledraar.gmail.com>
>> (https://public-inbox.org/git/87d11omi2o@evledraar.gmail.com/)
>>
>> As noted there I think it's best to proceed without the "dir.c: stop
>> ignoring opendir[...]" patch.
>>
>> It's going to be a bad regression in 2.17 if it ends up spewing pagefuls
>> of warnings in previously working repos if the UC is on.
>
> Well, I am not sure if it is a regression to diagnose problematic
> untracked cache information left by earlier version of the software
> with bugs.  After all, this is still an experimental feature, and we
> do want to see the warning to serve its purpose to diagnose possible
> remaining bugs, and new similar ones when a newer iteration of the
> feature introduces, no?

Fair enough. I just wanted to make sure it had been seen. It wasn't
obvious whether it had just been missed since there was no follow-up
there or note in WC.

We were disagreeing to the extent that some of this should be
documented in 878tcbmbqb@evledraar.gmail.com and related, and I
see you've ejected the doc patch I had in that series.

I really think we should be saying something like what this new doc
series says, it's conceptually on top of Duy's series but applies on
top of master.

I think there's room to quibble about whether to include 1/2 at all
since it's a relatively obscure edge case.

2/2 however is something I think a *lot* of users are going to hit. I
just skimmed the relevant bits of git-config and git-update-index now,
and nothing describes the UC as an experimental feature, and it's been
well-known to improve performance.

When users upgrade to 2.17 they're going to have git yelling at them
(as my users did) on existing checkouts, with no indication whatsoever
that it's due to the UC or how to fix it.

Let's at least documente it. I also wonder if the "dir.c: stop
ignoring opendir() error in open_cached_dir()" change shouldn't have
something in the warning itself about it being caused by the UC,
doesn't it only warn under that mode? I.e.:

-"could not open directory '%s'")
+"core.untrackedCache: could not open directory '%s'")

Ævar Arnfjörð Bjarmason (2):
  update-index doc: note a fixed bug in the untracked cache
  update-index doc: note the caveat with "could not open..."

 Documentation/git-update-index.txt | 26 ++
 1 file changed, 26 insertions(+)

-- 
2.15.1.424.g9478a66081



[PATCH 2/2] update-index doc: note the caveat with "could not open..."

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Note the caveat where 2.17 is stricter about index validation
potentially causing "could not open directory" warnings when git is
upgraded. See the preceding "dir.c: stop ignoring opendir() error in
open_cached_dir()" change.

This caused some mayhem when I upgraded git to a version with this
series at Booking.com, and other users have doubtless enabled the UC
extension and are in for a surprise when they upgrade. Let's give them
a headsup in the docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 Documentation/git-update-index.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index e30b185918..0c81600d8c 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -480,6 +480,16 @@ a directory with a file when it comes to the internal 
structures of
 the untracked cache, but no case has been found where this resulted in
 wrong "git status" output.
 
+There are also cases where existing indexes written by git versions
+before 2.17 will reference directories that don't exist anymore,
+potentially causing many "could not open directory" warnings to be
+printed on "git status". These are new warnings for existing issues
+that were previously silently discarded.
+
+As with the bug described above the solution is to one-off do a "git
+status" run with `core.untrackedCache=false` to flush out the leftover
+bad data.
+
 File System Monitor
 ---
 
-- 
2.15.1.424.g9478a66081



[PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-08 Thread Derrick Stolee
It is possible to have multiple commit graph files in a pack directory,
but only one is important at a time. Use a 'graph_head' file to point
to the important file. Teach git-commit-graph to write 'graph_head' upon
writing a new commit graph file.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Documentation/git-commit-graph.txt | 11 ++-
 builtin/commit-graph.c | 27 +--
 commit-graph.c |  8 
 commit-graph.h |  1 +
 t/t5318-commit-graph.sh| 25 +++--
 5 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 67e107f06a..5e32c43b27 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -33,7 +33,9 @@ COMMANDS
 Write a commit graph file based on the commits found in packfiles.
 Includes all commits from the existing commit graph file. Outputs the
 checksum hash of the written file.
-
++
+With `--update-head` option, update the graph-head file to point
+to the written graph file.
 
 'read'::
 
@@ -53,6 +55,13 @@ EXAMPLES
 $ git commit-graph write
 
 
+* Write a graph file for the packed commits in your local .git folder
+* and update graph-head.
++
+
+$ git commit-graph write --update-head
+
+
 * Read basic information from a graph file.
 +
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 3ffa7ec433..776ca087e8 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -1,12 +1,13 @@
 #include "builtin.h"
 #include "config.h"
+#include "lockfile.h"
 #include "parse-options.h"
 #include "commit-graph.h"
 
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--pack-dir ]"),
N_("git commit-graph read [--graph-hash=]"),
-   N_("git commit-graph write [--pack-dir ]"),
+   N_("git commit-graph write [--pack-dir ] [--update-head]"),
NULL
 };
 
@@ -16,13 +17,14 @@ static const char * const builtin_commit_graph_read_usage[] 
= {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--pack-dir ]"),
+   N_("git commit-graph write [--pack-dir ] [--update-head]"),
NULL
 };
 
 static struct opts_commit_graph {
const char *pack_dir;
const char *graph_hash;
+   int update_head;
 } opts;
 
 static int graph_read(int argc, const char **argv)
@@ -87,6 +89,22 @@ static int graph_read(int argc, const char **argv)
return 0;
 }
 
+static void update_head_file(const char *pack_dir, const struct object_id 
*graph_hash)
+{
+   int fd;
+   struct lock_file lk = LOCK_INIT;
+   char *head_fname = get_graph_head_filename(pack_dir);
+
+   fd = hold_lock_file_for_update(, head_fname, LOCK_DIE_ON_ERROR);
+   FREE_AND_NULL(head_fname);
+
+   if (fd < 0)
+   die_errno("unable to open graph-head");
+
+   write_in_full(fd, oid_to_hex(graph_hash), GIT_MAX_HEXSZ);
+   commit_lock_file();
+}
+
 static int graph_write(int argc, const char **argv)
 {
struct object_id *graph_hash;
@@ -95,6 +113,8 @@ static int graph_write(int argc, const char **argv)
{ OPTION_STRING, 'p', "pack-dir", _dir,
N_("dir"),
    N_("The pack directory to store the graph") },
+   OPT_BOOL('u', "update-head", _head,
+   N_("update graph-head to written graph file")),
OPT_END(),
};
 
@@ -111,6 +131,9 @@ static int graph_write(int argc, const char **argv)
 
graph_hash = write_commit_graph(opts.pack_dir);
 
+   if (opts.update_head)
+   update_head_file(opts.pack_dir, graph_hash);
+
if (graph_hash) {
printf("%s\n", oid_to_hex(graph_hash));
FREE_AND_NULL(graph_hash);
diff --git a/commit-graph.c b/commit-graph.c
index 9a337cea4d..9789fe37f9 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -38,6 +38,14 @@
 #define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
GRAPH_OID_LEN + 8)
 
+char *get_graph_head_filename(const char *pack_dir)
+{
+   struct strbuf fname = STRBUF_INIT;
+   strbuf_addstr(, pack_dir);
+   strbuf_addstr(, "/graph-head");
+   return strbuf_detach(, 0);
+}
+
 char* get_commit_graph_filename_hash(const char *pack_dir,
 struct object_id *hash)
 {
diff --git a/commit-graph.h b/commit-graph.h
index c1608976b3..726

[PATCH v2] hash: update obsolete reference to SHA1_HEADER

2018-02-07 Thread brian m. carlson
We moved away from SHA1_HEADER to a preprocessor if chain, but didn't
update the comment discussing the platform defines.  Update this comment
so it reflects the current state of our codebase.

Signed-off-by: brian m. carlson <sand...@crustytoothpaste.net>
---
 hash.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hash.h b/hash.h
index eb30f59be3..7c8238bc2e 100644
--- a/hash.h
+++ b/hash.h
@@ -18,8 +18,8 @@
 #ifndef platform_SHA_CTX
 /*
  * platform's underlying implementation of SHA-1; could be OpenSSL,
- * blk_SHA, Apple CommonCrypto, etc...  Note that including
- * SHA1_HEADER may have already defined platform_SHA_CTX for our
+ * blk_SHA, Apple CommonCrypto, etc...  Note that the relevant
+ * SHA-1 header may have already defined platform_SHA_CTX for our
  * own implementations like block-sha1 and ppc-sha1, so we list
  * the default for OpenSSL compatible SHA-1 implementations here.
  */


Re: [PATCH v2 07/14] commit-graph: implement git-commit-graph --update-head

2018-02-05 Thread Derrick Stolee

On 2/1/2018 8:35 PM, SZEDER Gábor wrote:

It is possible to have multiple commit graph files in a pack directory,
but only one is important at a time. Use a 'graph_head' file to point
to the important file.

This implies that all those other files are ignored, right?


Yes. We do not use directory listings to find graph files.




Teach git-commit-graph to write 'graph_head' upon
writing a new commit graph file.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
  Documentation/git-commit-graph.txt | 34 ++
  builtin/commit-graph.c | 38 +++---
  commit-graph.c | 25 +
  commit-graph.h |  2 ++
  t/t5318-commit-graph.sh| 12 ++--
  5 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 09aeaf6c82..99ced16ddc 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -12,15 +12,49 @@ SYNOPSIS
  'git commit-graph' --write  [--pack-dir ]
  'git commit-graph' --read  [--pack-dir ]
  
+OPTIONS

+---

Oh, look, the 'OPTIONS' section I missed in the previous patches! ;)

This should be split up and squashed into the previous patches where
the individual --options are first mentioned.


+--pack-dir::
+   Use given directory for the location of packfiles, graph-head,
+   and graph files.
+
+--read::
+   Read a graph file given by the graph-head file and output basic
+   details about the graph file. (Cannot be combined with --write.)

 From the output of 'git commit-graph --read' it seems that it's not a
generally useful option to the user.  Perhaps it should be mentioned
that it's only intended as a debugging aid?  Or maybe it doesn't
really matter, because eventually this command will become irrelevant,
as other commands (clone, fetch, gc) will invoke it automagically...


I'll add some wording to make this clear.




+--graph-id::
+   When used with --read, consider the graph file graph-.graph.
+
+--write::
+   Write a new graph file to the pack directory. (Cannot be combined
+   with --read.)

I think this should also mention that it prints the generated graph
file's checksum.


+
+--update-head::
+   When used with --write, update the graph-head file to point to
+   the written graph file.

So it should be used with '--write', noted.


  EXAMPLES
  
  
+* Output the hash of the graph file pointed to by /graph-head.

++
+
+$ git commit-graph --pack-dir=
+
+
  * Write a commit graph file for the packed commits in your local .git folder.
  +
  
  $ git commit-graph --write
  
  
+* Write a graph file for the packed commits in your local .git folder,

+* and update graph-head.
++
+
+$ git commit-graph --write --update-head
+
+
  * Read basic information from a graph file.
  +
  
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 218740b1f8..d73cbc907d 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -11,7 +11,7 @@
  static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--pack-dir ]"),
N_("git commit-graph --read [--graph-hash=]"),
-   N_("git commit-graph --write [--pack-dir ]"),
+   N_("git commit-graph --write [--pack-dir ] [--update-head]"),
NULL
  };
  
@@ -20,6 +20,9 @@ static struct opts_commit_graph {

int read;
const char *graph_hash;
int write;
+   int update_head;
+   int has_existing;
+   struct object_id old_graph_hash;
  } opts;
  
  static int graph_read(void)

@@ -30,8 +33,8 @@ static int graph_read(void)
  
  	if (opts.graph_hash && strlen(opts.graph_hash) == GIT_MAX_HEXSZ)

get_oid_hex(opts.graph_hash, _hash);
-   else
-   die("no graph hash specified");
+   else if (!get_graph_head_hash(opts.pack_dir, _hash))
+   die("no graph-head exists");
  
  	graph_file = get_commit_graph_filename_hash(opts.pack_dir, _hash);

graph = load_commit_graph_one(graph_file, opts.pack_dir);
@@ -62,10 +65,33 @@ static int graph_read(void)
return 0;
  }
  
+static void update_head_file(const char *pack_dir, const struct object_id *graph_hash)

+{
+   struct strbuf head_path = STRBUF_INIT;
+   int fd;
+   struct lock_file lk = LOCK_INIT;
+
+   strbuf_addstr(_path, pack_dir);
+   strbuf_addstr(_path, "/");
+   strbuf_addstr(_path, "graph-head");

str

<    1   2   3   4   5   6   7   8   9   10   >