Documentation Breakage at 2.5.6

2017-12-05 Thread Randall S. Becker
Hi All,

I'm trying to upgrade the NonStop port from 2.3.7 upward eventually to
2.15.1 and hit a snag on documentation. The xmlto component is a bit new to
me and I hit the following error:

XMLTO git-remote-testgit.1
xmlto: /home/git/git/Documentation/git-remote-testgit.xml does not validate
(status 3)
xmlto: Fix document syntax or use --skip-validation option
I/O error : Attempt to load network entity
http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd
/home/git/git/Documentation/git-remote-testgit.xml:2: warning: failed to
load external entity
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd;
D DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd;
 
^
I/O error : Attempt to load network entity
http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd
warning: failed to load external entity
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd;
validity error : Could not load the external subset
http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd

The -skip-validation option just takes me to a different problem validating
via sourceforge URL that appears not to exist anymore, although I had to
modify ./git/Documention/Makefile, which is vexing.

XMLTO git-remote-testgit.1
I/O error : Attempt to load network entity
http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl
warning: failed to load external entity
"http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl;
compilation error: file /tmp/xmlto-xsl.ie6J8p line 4 element import
xsl:import : unable to load
http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl
Makefile:328: recipe for target 'git-remote-testgit.1' failed

Any advice on getting past this? It would be nice to get git help to working
again. An answer of "you need to get past 2.5.6" is ok too as long as I know
where I'm going.

Thanks,
Randall

-- Brief whoami: NonStop developer since approximately
UNIX(421664400)/NonStop(2112884442) 
-- In my real life, I talk too much.





Re: [PATCH v2 0/9] rebase -i: add config to abbreviate command names

2017-12-05 Thread liam Beguin


On 05/12/17 05:21 PM, Junio C Hamano wrote:
> Liam Beguin  writes:
> 
>> This series will add the 'rebase.abbreviateCommands' configuration
>> option to allow `git rebase -i` to default to the single-letter command
>> names when generating the todo list.
>>
>> Using single-letter command names can present two benefits. First, it
>> makes it easier to change the action since you only need to replace a
>> single character (i.e.: in vim "r" instead of
>> "ciw").  Second, using this with a large enough value of
>> 'core.abbrev' enables the lines of the todo list to remain aligned
>> making the files easier to read.
>>
>> Changes in V2:
>> - Refactor and rename 'transform_todo_ids'
>> - Replace SHA-1 by OID in rebase--helper.c
>> - Update todo list related functions to take a generic 'flags' parameter
>> - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands'
>> - Rename 'add-exec' option to 'add-exec-commands'
>> - Use 'strbur_read_file' instead of rewriting it
>> - Make 'command_to_char' return 'comment_char_line' if no single-letter
>>   command name is defined
>> - Combine both tests into a single test case
>> - Update commit messages
>>
>> Changes in V2:
>> - Rename 'transform_todo_insn' to 'transform_todos'
>> - Fix flag name TODO_LIST_SHORTE{D,N}_IDS
> 
> I've replaced this series and pushed out the result.

Great! Thanks again,

> 
> Thanks.
> 

Liam


git send-email could check for no To: line

2017-12-05 Thread Matthew Wilcox

I keep coming up with new and innovative ways to send stupid-looking
emails with git send-email.  Please save me from myself.

My latest SNAFU is to spend so much time setting up the 'cc' list in
the git-format-patch step that I completely forgot to put anybody on the
'to' line, and even being prompted by git send-email that I might want
to add somebody to the 'To' line wasn't enough.

Could there be an extra confirmation, perhaps in send_message() that

if ($cc ne '' && $to eq '')
are_you_sure()

(you don't want me writing perl.  you really don't.)



Re: [PATCH v7 4/7] checkout: describe_detached_head: remove ellipsis after committish

2017-12-05 Thread Ann T Ropea
We do not want an ellipsis displayed following an (abbreviated) SHA-1
value.

The days when this was necessary to indicate the truncation to
lower-level Git commands and/or the user are bygone.

However, to ease the transition, the ellipsis will still be printed if
the user sets the environment variable GIT_PRINT_SHA1_ELLIPSIS to "yes".

Correct documentation with respect to what describe_detached_head prints
when GIT_PRINT_SHA1_ELLIPSIS is not set as indicated above.

Add tests for the old and new behaviour.

Signed-off-by: Ann T Ropea 
---
v2: rename patch series & focus on removal of ellipses
v3: env var instead of config option, use one-line comments where appropriate, 
preserve indent level
v4: improve env var handling (rename, helper func to query, docu)
v5: rewrite series to take Junio's comments in 
 aboard
v6: polish to take Junio's comments from 
 into account
v7: fix style issues (redirection, here-dox, long lines, 
setting/exporting/unsetting of env var): cf. 

 Documentation/user-manual.txt |   2 +-
 builtin/checkout.c|  10 +++-
 t/t2020-checkout-detach.sh| 123 ++
 3 files changed, 132 insertions(+), 3 deletions(-)

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index 497e82e88dd0..eff78902742a 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -319,7 +319,7 @@ do so (now or later) by using -b with the checkout command 
again. Example:
 
   git checkout -b new_branch_name
 
-HEAD is now at 427abfa... Linux v2.6.17
+HEAD is now at 427abfa Linux v2.6.17
 
 
 The HEAD then refers to the SHA-1 of the commit instead of to a branch,
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 3faae382de4f..b0499542158f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -400,10 +400,16 @@ static void show_local_changes(struct object *head,
 static void describe_detached_head(const char *msg, struct commit *commit)
 {
struct strbuf sb = STRBUF_INIT;
+
if (!parse_commit(commit))
pp_commit_easy(CMIT_FMT_ONELINE, commit, );
-   fprintf(stderr, "%s %s... %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV), 
sb.buf);
+   if (print_sha1_ellipsis()) {
+   fprintf(stderr, "%s %s... %s\n", msg,
+   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   } else {
+   fprintf(stderr, "%s %s %s\n", msg,
+   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   }
strbuf_release();
 }
 
diff --git a/t/t2020-checkout-detach.sh b/t/t2020-checkout-detach.sh
index fbb4ee9bb42d..e8e284cf9c04 100755
--- a/t/t2020-checkout-detach.sh
+++ b/t/t2020-checkout-detach.sh
@@ -186,4 +186,127 @@ test_expect_success 'no advice given for explicit 
detached head state' '
test_cmp expect.no-advice actual
 '
 
+# Detached HEAD tests for GIT_PRINT_SHA1_ELLIPSIS (new format)
+test_expect_success 'describe_detached_head prints no SHA-1 ellipsis when not 
asked to' "
+
+   # The first detach operation is more chatty than the following ones.
+   cat >1st_detach <<-'EOF' &&
+   Note: checking out 'HEAD^'.
+
+   You are in 'detached HEAD' state. You can look around, make experimental
+   changes and commit them, and you can discard any commits you make in 
this
+   state without impacting any branches by performing another checkout.
+
+   If you want to create a new branch to retain commits you create, you may
+   do so (now or later) by using -b with the checkout command again. 
Example:
+
+ git checkout -b 
+
+   HEAD is now at 7c7cd714e262 three
+   EOF
+
+   # The remaining ones just show info about previous and current HEADs.
+   cat >2nd_detach <<-'EOF' &&
+   Previous HEAD position was 7c7cd714e262 three
+   HEAD is now at 139b20d8e6c5 two
+   EOF
+
+   cat >3rd_detach <<-'EOF' &&
+   Previous HEAD position was 139b20d8e6c5 two
+   HEAD is now at d79ce1670bdc one
+   EOF
+
+   reset &&
+   check_not_detached &&
+
+   # Various ways of *not* asking for ellipses
+
+   sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+   git -c 'core.abbrev=12' checkout HEAD^ >actual 2>&1 &&
+   check_detached &&
+   test_i18ncmp 1st_detach actual &&
+
+   GIT_PRINT_SHA1_ELLIPSIS="no" git -c 'core.abbrev=12' checkout HEAD^ 
>actual 2>&1 &&
+   check_detached &&
+   test_i18ncmp 2nd_detach actual &&
+
+   GIT_PRINT_SHA1_ELLIPSIS= git -c 'core.abbrev=12' checkout HEAD^ >actual 
2>&1 &&
+   check_detached &&
+   test_i18ncmp 3rd_detach actual &&
+
+   sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   # We only 

Re: How hard would it be to implement sparse fetching/pulling?

2017-12-05 Thread Philip Oakley

From: "Jeff Hostetler" 
Sent: Monday, December 04, 2017 3:36 PM


On 12/2/2017 11:30 AM, Philip Oakley wrote:

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 2:30 PM

On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote:

I think it would be great if we high level agree on desired user
experience, so let me put a few possible use cases here.

1. Init and fetch into a new repo with a sparse list.
Preconditions: origin blah exists and has a lot of folders inside of
src including "bar".
Actions:
git init foo && cd foo
git config core.sparseAll true # New flag to activate all sparse
operations by default so you don't need to pass options to each
command.
echo "src/bar" > .git/info/sparse-checkout
git remote add origin blah
git pull origin master
Expected results: foo contains src/bar folder and nothing else,
objects that are unrelated to this tree are not fetched.
Notes: This should work same when fetch/merge/checkout operations are
used in the right order.


With the current patches (parts 1,2,3) we can pass a blob-ish
to the server during a clone that refers to a sparse-checkout
specification.


I hadn't appreciated this capability. I see it as important, and should 
be available both ways, so that a .gitNarrow spec can be imposed from the 
server side, as well as by the requester.


It could also be used to assist in the 'precious/secret' blob problem, so 
that AWS keys are never pushed, nor available for fetching!


To be honest, I've always considered partial clone/fetch as
a client-side request as a performance feature to minimize
download times and disk space requirements on the client.


Mine was a two way view where one side or other specified an extent for the 
narrow clone to achieve either the speed/space improvement or partitioning 
capability.



I've not thought of it from the "server has secrets" point
of view.


My potential for "secrets" was a little softer that some of the 'hard' 
security that is often discussed. I'm for the layered risk approach (swiss 
cheese model)


We can talk about it, but I'd like to keep it outside the
scope of the current effort.


Agreed.


 My concerns are that that is
not the appropriate mechanism to enforce MAC/DAC like security
mechanisms.  For example:
[a] The client will still receive the containing trees that
refer to the sensitive blobs, so the user can tell when
the secret blobs change -- they wouldn't have either blob,
but can tell when they are changed.  This event by itself
may or may not leak sensitive information depending on the
terms of the security policy in place.
[b] The existence of such missing blobs would tell the client
which blobs are significant and secret and allow them to
focus their attack.  It would be better if those assets
were completely hidden and not in the tree at all.
[c] The client could push a fake secret blob to replace the
valid one on the server.  You would have to audit the
server to ensure that it never accepts a push containing
a change to any secret blob.  And the server would need
an infrastructure to know about all secrets in the tree.
[d] When a secret blob does change, any local merges by the
user lack information to complete the merge -- they can't
merge the secrets and they can't be trusted to correctly
pick-ours or pick-theirs -- so their workflows are broken.
I'm not trying to blindly spread FUD here, but it is arguments
like these that make me suggest that the partial clone mechanism
is not the right vehicle for such "secret" blobs.


I'm on the 'a little security is better than no security' side, but all the 
points are valid.






There's a bit of a chicken-n-egg problem getting
things set up. So if we assume your team would create a series
of "known enlistments" under version control, then you could


s/enlistments/entitlements/ I presume?


Within my org we speak of "enlistments" as subset of the tree
that you plan to work on.  For example, you might enlist in the
"file system" portion of the tree or in the "device drivers"
portion.  If the Makefiles have good partitioning, you should
only need one of the above portions to do productive work within
a feature area.
Ah, so it's the things that have been requested by the client (I'd like to 
the enlist..)




I'm not sure what you mean by "entitlements".


It is like having the title deeds to a house - a list things you have, or 
can have. (e.g. a father saying: you can have the car on Saturday 6pm -11pm)


At the end of the day the particular lists would be the same, they guide 
what is sent.







just reference one by : during your clone. The
server can lookup that blob and just use it.

git clone --filter=sparse:oid=master:templates/bar URL

And then the server will filter-out the unwanted blobs during
the clone. (The current version only filters blobs; you still
get full commits and trees. That will be revisited later.)


I'm for the idea that 

[PATCH v4 2/2] git-prompt: fix reading files with windows line endings

2017-12-05 Thread Robert Abel
If any of the files read by __git_eread have \r\n line endings, read
will only strip \n, leaving \r. This results in an ugly prompt, where
instead of

user@pc MINGW64 /path/to/repo (BARE:master)

the last parenthesis is printed over the beginning of the prompt like

)ser@pc MINGW64 /path/to/repo (BARE:master

This patch fixes the issue by changing the internal field separator
variable IFS to $'\r\n' before using the read builtin command.

Note that ANSI-C Quoting/POSIX Quoting ($'...') is supported by bash
as well as zsh, which are the current targets of git-prompt, cf.
contrib/completion/git-prompt.sh.

Signed-off-by: Robert Abel 
---
 contrib/completion/git-prompt.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 41a471957a..983e419d2b 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -283,7 +283,7 @@ __git_ps1_colorize_gitstring ()
 # variable, in that order.
 __git_eread ()
 {
-   test -r "$1" && read "$2" <"$1"
+   test -r "$1" && IFS=$'\r\n' read "$2" <"$1"
 }
 
 # __git_ps1 accepts 0 or 1 arguments (i.e., format string)
-- 
2.13.0.windows.1



[PATCH v4 1/2] git-prompt: make __git_eread intended use explicit

2017-12-05 Thread Robert Abel
__git_eread is used to read a single line of a given file (if it exists)
into a single variable stripping the EOL.
This patch removes the unused capability to split file contents into tokens
by passing multiple variable names. Add a comment and explicitly use $2
instead of misleading $@ as argument to the read builtin command.

Signed-off-by: Robert Abel 
---
 contrib/completion/git-prompt.sh | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index c6cbef38c2..41a471957a 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -278,11 +278,12 @@ __git_ps1_colorize_gitstring ()
r="$c_clear$r"
 }
 
+# Helper function to read the first line of a file into a variable.
+# __git_eread requires 2 arguments, the file path and the name of the
+# variable, in that order.
 __git_eread ()
 {
-   local f="$1"
-   shift
-   test -r "$f" && read "$@" <"$f"
+   test -r "$1" && read "$2" <"$1"
 }
 
 # __git_ps1 accepts 0 or 1 arguments (i.e., format string)
-- 
2.13.0.windows.1



Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-05 Thread Robert Abel
Dear Junio,

I'm amazed at how much time and energy you spend on correcting these
essentially non-issues in my git commit messages for a quadruple-liner
code change.

I'll resend both patches one last time addressing the grave issue of the
informative mention of multi-line files.

Regards,

Robert


Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Eric Sunshine
On Tue, Dec 5, 2017 at 5:13 PM, Brandon Williams  wrote:
> A regression was introduced in 557a5998d (submodule: remove
> gitmodules_config, 2017-08-03) to how attribute processing was handled
> in bare repositories when running the diff-tree command.
>
> By default the attribute system will first try to read ".gitattribute"
> files from the working tree and then falls back to reading them from the
> index if there isn't a copy checked out in the worktree.  Prior to
> 557a5998d the index was read as a side effect of the call to
> 'gitmodules_config()' which ensured that the index was already populated
> before entering the attribute subsystem.
>
> Since the call to 'gitmodules_config()' was removed the index is no
> longer being read so when the attribute system tries to read from the
> in-memory index it doesn't find any ".gitattribute" entries effectively
> ignoring any configured attributes.
>
> Fix this by explicitly reading the index during the setup of diff-tree.

This commit message does a good job of explaining the issue, so
someone who hasn't followed the thread (or has not followed it
closely, like me) can understand the problem and solution. Thanks.

A few comments below...

> Reported-by: Ben Boeckel 
> Signed-off-by: Brandon Williams 
> ---
> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
> @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char 
> *prefix)
>
> git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
> init_revisions(opt, prefix);
> +   read_cache();

557a5998d (submodule: remove gitmodules_config, 2017-08-03) touched a
fair number of built-in commands. It's not clear from the current
patch's commit message if diff-tree is the only command which
regressed. Is it? Or are other commands also likely to have regressed?
Perhaps the commit message could say something about this. For
instance: "All other commands touched by 557a5998d have been audited
and were found to be regression-free" or "Other commands may regress
in the same way, but we will take a wait-and-see attitude and fix them
as needed because ".

> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> @@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in 
> indent (diff-tree)' '
> +test_expect_success 'check with ignored trailing whitespace attr 
> (diff-tree)' '
> +   test_when_finished "git reset --hard HEAD^" &&

A few style nits...

> +   # Create a whitespace error that should be ignored.

Comments in nearby tests are not capitalized and do not end with period.

> +   echo "* -whitespace" > ".gitattributes" &&

Please drop unnecessary quotes around the filename, as the extra noise
makes it a bit harder to read. Also, lose space after redirection
operator:

echo "* -whitespace" >.gitattributes &&

> +   git add ".gitattributes" &&
> +   echo "trailing space -> " > "trailing-space" &&

All the nearby tests use some variation of:

echo "foo ();  " >x &&

which differs from the "trailing space ->" and filename
'trailing-space' used in this test. Lack of consistency makes this new
test stick out like a sore thumb.

> +   git add "trailing-space" &&
> +   git commit -m "trailing space" &&
> +
> +   # With a worktree diff-tree ignores the whitespace error
> +   git diff-tree --root --check HEAD &&
> +
> +   # Without a worktree diff-tree still ignores the whitespace error
> +   git -C .git diff-tree --root --check HEAD
> +'


Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Junio C Hamano
Brandon Williams  writes:

> A regression was introduced in 557a5998d (submodule: remove
> gitmodules_config, 2017-08-03) to how attribute processing was handled
> in bare repositories when running the diff-tree command.
>
> By default the attribute system will first try to read ".gitattribute"
> files from the working tree and then falls back to reading them from the
> index if there isn't a copy checked out in the worktree.  Prior to
> 557a5998d the index was read as a side effect of the call to
> 'gitmodules_config()' which ensured that the index was already populated
> before entering the attribute subsystem.
>
> Since the call to 'gitmodules_config()' was removed the index is no
> longer being read so when the attribute system tries to read from the
> in-memory index it doesn't find any ".gitattribute" entries effectively
> ignoring any configured attributes.
>
> Fix this by explicitly reading the index during the setup of diff-tree.

Thanks, both.  Will queue.




> Reported-by: Ben Boeckel 
> Signed-off-by: Brandon Williams 
> ---
>
> This patch should fix the regression.  Let me know if it doesn't solve the
> issue and I'll investigate some more.
>
>  builtin/diff-tree.c|  1 +
>  t/t4015-diff-whitespace.sh | 17 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
> index d66499909..cfe7d0281 100644
> --- a/builtin/diff-tree.c
> +++ b/builtin/diff-tree.c
> @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char 
> *prefix)
>  
>   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
>   init_revisions(opt, prefix);
> + read_cache();
>   opt->abbrev = 0;
>   opt->diff = 1;
>   opt->disable_stdin = 1;
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index 559a7541a..6e061a002 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in 
> indent (diff-tree)' '
>   test_must_fail git diff-tree --check HEAD^ HEAD
>  '
>  
> +test_expect_success 'check with ignored trailing whitespace attr 
> (diff-tree)' '
> + test_when_finished "git reset --hard HEAD^" &&
> +
> + # Create a whitespace error that should be ignored.
> + echo "* -whitespace" > ".gitattributes" &&
> + git add ".gitattributes" &&
> + echo "trailing space -> " > "trailing-space" &&
> + git add "trailing-space" &&
> + git commit -m "trailing space" &&
> +
> + # With a worktree diff-tree ignores the whitespace error
> + git diff-tree --root --check HEAD &&
> +
> + # Without a worktree diff-tree still ignores the whitespace error
> + git -C .git diff-tree --root --check HEAD
> +'
> +
>  test_expect_success 'check trailing whitespace (trailing-space: off)' '
>   git config core.whitespace "-trailing-space" &&
>   echo "foo ();   " >x &&


Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Stefan Beller
On Tue, Dec 5, 2017 at 2:13 PM, Brandon Williams  wrote:
> A regression was introduced in 557a5998d (submodule: remove
> gitmodules_config, 2017-08-03) to how attribute processing was handled
> in bare repositories when running the diff-tree command.
>
> By default the attribute system will first try to read ".gitattribute"
> files from the working tree and then falls back to reading them from the
> index if there isn't a copy checked out in the worktree.  Prior to
> 557a5998d the index was read as a side effect of the call to
> 'gitmodules_config()' which ensured that the index was already populated
> before entering the attribute subsystem.
>
> Since the call to 'gitmodules_config()' was removed the index is no
> longer being read so when the attribute system tries to read from the
> in-memory index it doesn't find any ".gitattribute" entries effectively
> ignoring any configured attributes.
>
> Fix this by explicitly reading the index during the setup of diff-tree.
>
> Reported-by: Ben Boeckel 
> Signed-off-by: Brandon Williams 
> ---
>
> This patch should fix the regression.  Let me know if it doesn't solve the
> issue and I'll investigate some more.
>

Thanks for fixing this bug! The commit message is helpful
to understand how this bug could slip in!

> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
> index d66499909..cfe7d0281 100644
> --- a/builtin/diff-tree.c
> +++ b/builtin/diff-tree.c
> @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char 
> *prefix)
>
> git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
> init_revisions(opt, prefix);
> +   read_cache();


Although we do have very few unchecked calls to read_cache, I'd suggest
to avoid spreading them. Most of the read_cache calls are guarded via:

if (read_cache() < 0)
die(_("index file corrupt"));

I wonder if this hints at a bad API, and we'd rather have read_cache
die() on errors, and the few callers that try to get out of trouble might
need to use read_cache_gently() instead.
(While this potentially large refactoring may be deferred, I'd ask for
an if at least)

Thanks,
Stefan


Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Brandon Williams
On 12/05, Ben Boeckel wrote:
> On Tue, Dec 05, 2017 at 14:13:37 -0800, Brandon Williams wrote:
> > This patch should fix the regression.  Let me know if it doesn't solve the
> > issue and I'll investigate some more.
> 
> Our test suite passes again. Thanks!

Of course! Glad I could help :)

> 
> Acked-by: Ben Boeckel 
> 
> --Ben

-- 
Brandon Williams


Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Ben Boeckel
On Tue, Dec 05, 2017 at 14:13:37 -0800, Brandon Williams wrote:
> This patch should fix the regression.  Let me know if it doesn't solve the
> issue and I'll investigate some more.

Our test suite passes again. Thanks!

Acked-by: Ben Boeckel 

--Ben


Re: [WIP 2/2] submodule: read-only super-backed ref backend

2017-12-05 Thread Stefan Beller
On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan  wrote:
> Note that a few major parts are still missing:
>  - special handling of the current branch of the superproject
>  - writing (whether "refs/..." to the superproject as an index change or
>a commit, or non-"refs/..." directly to the subproject like usual)
>
> Signed-off-by: Jonathan Tan 


> --- a/refs.c
> +++ b/refs.c
> @@ -1575,12 +1575,17 @@ static struct ref_store *lookup_ref_store_map(struct 
> hashmap *map,
>  static struct ref_store *ref_store_init(const char *gitdir,
> unsigned int flags)
>  {
> -   const char *be_name = "files";
> -   struct ref_storage_be *be = find_ref_storage_backend(be_name);
> +   struct ref_storage_be *be;
> struct ref_store *refs;
>
> +   if (getenv("USE_SP")) {
> +   be = _be_sp;
> +   } else {
> +   be = _be_files;
> +   }
> +
> if (!be)
> -   die("BUG: reference backend %s is unknown", be_name);
> +   die("BUG: reference backend %s is unknown", "files");

If we pursue this approach more than just RFC-ish we would probably not use
an environment variable but have some detection logic for the
different ref backends.

For example Shawns refstable[1] proposal uses the configuration
`extensions.refStorage == reftable`, which this proposal could reuse and
set to 'indexed_superproject' or 'commit_in_superproject', depending on the
write semantics that we'd need to disucss (or do we want to offer 2 different
superproject backends having these different semantics?)

More to the code here, we could still use `find_ref_storage_backend()`
depending on the env:

const char *be_name = "files";
...
+if (getenv("USE_SP"))
+be_name = "sp-backend";
...

and then we'd have to change the 'next ' pointer in refs_be_files
(in refs/files-backend.c line ~3100) to point at the superproject backend
instead of NULL.

[1] 
https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md#Repository-format

> +static int sp_read_raw_ref(struct ref_store *ref_store,
> + const char *refname, struct object_id *oid,
> + struct strbuf *referent, unsigned int *type)
> +{
> +   struct sp_ref_store *refs;
> +
> +   refs = sp_downcast(ref_store, REF_STORE_READ, "read_raw_ref");
> +
> +   if (!starts_with(refname, "refs/")) {
> +   return refs_read_raw_ref(refs->files, refname, oid, referent, 
> type);
> +   }
> +
> +   /* read from the superproject instead */
> +   return get_superproject_gitlink_oid(refname, oid);
> +}

This function would need to get more sophisticated logic I would assume;
the current RFC is a read-only thing, such that it may be sufficient
to differentiate
between refs/* and specials such as [FETCH_]HEAD, but as seen in the
sp_ref_store struct definition we're already keeping a local reflog. I would
assume that remotes are also local to the submodule instead of reusing the
superprojects remote, for the sake of pulling in upstream changes of the
submodule instead of being fully integrated.



> diff --git a/submodule.c b/submodule.c
> index ce511180e..1ffaeec82 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -471,6 +471,7 @@ static void prepare_submodule_repo_env_no_git_dir(struct 
> argv_array *out)
>  void prepare_submodule_repo_env(struct argv_array *out)
>  {
> prepare_submodule_repo_env_no_git_dir(out);
> +   argv_array_pushf(out, "USE_SP");
> argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
>  DEFAULT_GIT_DIR_ENVIRONMENT);
>  }

This is an easy way to enforce all submodules use the superproject refs mode,
which makes it easy to make recursive commands (e.g. `git checkout --recursive`
would not need to pass down the exact gitlink sha1, but could pass
down the branch
name or even commit-ish "HEAD^^" and the submodule stays in perfect
sync (according
to superproject commit-ish). Once we take this further I would imagine
there is an option
in the submodule that specifies what mode it is using. Not sure if
we'd want to keep the
environment variable to force that mode from the superproject, though.

> @@ -1986,7 +1987,7 @@ void absorb_git_dir_into_superproject(const char 
> *prefix,
>   * Return 0 if successful, 1 if not (for example, if the parent
>   * directory is not a repo or an unrelated one).
>   */
> -int get_superproject_entry(const char **full_name)
> +static int get_superproject_entry(const char *ref, const char **full_name, 
> struct object_id *oid)

Making it static could be part of the prior patch?

When the ref is given we use `ls-tree -r` instead of `ls-files --stage`
which give similar outputs to have the parsing logic overlap below,
the difference
is whether the index is looked up or the given ref.

> @@ -2016,9 +2017,11 @@ int 

Re: [PATCH v2 0/9] rebase -i: add config to abbreviate command names

2017-12-05 Thread Junio C Hamano
Liam Beguin  writes:

> This series will add the 'rebase.abbreviateCommands' configuration
> option to allow `git rebase -i` to default to the single-letter command
> names when generating the todo list.
>
> Using single-letter command names can present two benefits. First, it
> makes it easier to change the action since you only need to replace a
> single character (i.e.: in vim "r" instead of
> "ciw").  Second, using this with a large enough value of
> 'core.abbrev' enables the lines of the todo list to remain aligned
> making the files easier to read.
>
> Changes in V2:
> - Refactor and rename 'transform_todo_ids'
> - Replace SHA-1 by OID in rebase--helper.c
> - Update todo list related functions to take a generic 'flags' parameter
> - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands'
> - Rename 'add-exec' option to 'add-exec-commands'
> - Use 'strbur_read_file' instead of rewriting it
> - Make 'command_to_char' return 'comment_char_line' if no single-letter
>   command name is defined
> - Combine both tests into a single test case
> - Update commit messages
>
> Changes in V2:
> - Rename 'transform_todo_insn' to 'transform_todos'
> - Fix flag name TODO_LIST_SHORTE{D,N}_IDS

I've replaced this series and pushed out the result.

Thanks.


Re: [RFE] Inverted sparseness (amended)

2017-12-05 Thread Philip Oakley

From: "Randall S. Becker" 
On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom:

From: "Randall S. Becker" 

[...]


If using the empty tree part doesn't pass muster (i.e. showing nothing
isn't sufficient), then the narrow clone could come into play to limit
what parts of the trees are widely visible, but mainly its using the
grafts to cover the regulatory gap, and (for the moment) using
fast-export to transfer the singleton commit / tags


Oh Just remembered, there is the newish capability to fetch random blobs, 
so that may help.


I think you hit the nail on the head pretty well. We're currently at 2.3.7, 
with a push to 2.15.1 this week, so I'm looking forward to trying this. My 
two worries are whether the empty tree is acceptable (it should be to the 
client, and might be to the vendor), and doing this reliably 
(semi-automated) so the user base does not have to worry about the gory 
details of doing this. The unit tests for it are undoubtedly going to give 
me headaches.


Thanks for the advice. Islands of shallowness are a really descriptive image 
for what this is. So identifying that there are shoals (to extend the 
metaphor somewhat), will be crucial to this adventure.


These islands of shallowness, however, are also concerns as described in the 
[Re: How hard would it be to implement sparse fetching/pulling?] thread. The 
matter of the security audit is important here also:
I'm just thinking that even if we get a *perfectly working* partial 
clone/fetch/push/etc. that it would not pass a security audit.



Philip says:
I'd totally disagree in the sense that if we had a submodule anywhere_ in 
the repo that would be an independent island of code, and we are quite happy 
with that - we use the web of trust with the auditors for them to go check, 
separately, the oid of the independent portion, which may be at another site 
or another vendor/client. That's OK, so what's the problem here...


We do the same for pinning the tips and tails of the lines of development 
that make for the shallowness and narrowness that create these shoals, and 
oxbows of development. Managing them is normal human activity, with the 
technical support that the Git chain provides - so much better than previous 
'versioning systems' that we see regularly in engineering, with backdoor 
tweaks etc.


The key is to ensure that there is a proper hand holding across the air 
gaps, such that the oids exist both sides of the gaps, and a properly built 
on, such that the hash chain is unbroken. It's a similar negotiation to 
those used for establishing web security between IP clients, so it is 
doable. But you are right to have concerns and suspisions to ensure that it 
is all tested and verified

--
Philip (sorry about the poor quoting of the reply)




Not having the capability would similarly cause a failure of a security 
audit.


Cheers,
Randall

-- Brief whoami: NonStop developer since approximately 
UNIX(421664400)/NonStop(2112884442)

-- In my real life, I talk too much.





[PATCH] diff-tree: read the index so attribute checks work in bare repositories

2017-12-05 Thread Brandon Williams
A regression was introduced in 557a5998d (submodule: remove
gitmodules_config, 2017-08-03) to how attribute processing was handled
in bare repositories when running the diff-tree command.

By default the attribute system will first try to read ".gitattribute"
files from the working tree and then falls back to reading them from the
index if there isn't a copy checked out in the worktree.  Prior to
557a5998d the index was read as a side effect of the call to
'gitmodules_config()' which ensured that the index was already populated
before entering the attribute subsystem.

Since the call to 'gitmodules_config()' was removed the index is no
longer being read so when the attribute system tries to read from the
in-memory index it doesn't find any ".gitattribute" entries effectively
ignoring any configured attributes.

Fix this by explicitly reading the index during the setup of diff-tree.

Reported-by: Ben Boeckel 
Signed-off-by: Brandon Williams 
---

This patch should fix the regression.  Let me know if it doesn't solve the
issue and I'll investigate some more.

 builtin/diff-tree.c|  1 +
 t/t4015-diff-whitespace.sh | 17 +
 2 files changed, 18 insertions(+)

diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index d66499909..cfe7d0281 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char 
*prefix)
 
git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
init_revisions(opt, prefix);
+   read_cache();
opt->abbrev = 0;
opt->diff = 1;
opt->disable_stdin = 1;
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 559a7541a..6e061a002 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in indent 
(diff-tree)' '
test_must_fail git diff-tree --check HEAD^ HEAD
 '
 
+test_expect_success 'check with ignored trailing whitespace attr (diff-tree)' '
+   test_when_finished "git reset --hard HEAD^" &&
+
+   # Create a whitespace error that should be ignored.
+   echo "* -whitespace" > ".gitattributes" &&
+   git add ".gitattributes" &&
+   echo "trailing space -> " > "trailing-space" &&
+   git add "trailing-space" &&
+   git commit -m "trailing space" &&
+
+   # With a worktree diff-tree ignores the whitespace error
+   git diff-tree --root --check HEAD &&
+
+   # Without a worktree diff-tree still ignores the whitespace error
+   git -C .git diff-tree --root --check HEAD
+'
+
 test_expect_success 'check trailing whitespace (trailing-space: off)' '
git config core.whitespace "-trailing-space" &&
echo "foo ();   " >x &&
-- 
2.15.1.424.g9478a66081-goog



Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue

2017-12-05 Thread Jeff Hostetler



On 12/5/2017 3:54 PM, Junio C Hamano wrote:

Jeff Hostetler  writes:


From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
  sha1_file.c | 40 
  1 file changed, 20 insertions(+), 20 deletions(-)


The second (i.e. this) part and the third part are not yet in
'next', so it will perfectly be fine to squash these into the
commits that introduces the issues that are corrected in this
"fixup".  The same comment applies to all the other "fixup" patches.



OK thanks.  I left this one as a separate commit so that
we could debate the merits of the while(1) vs the gotos
and keep/reject independent of the other changes.  I'll
squash it in the next version.

Thanks
Jeff



Re: [PATCH v4 1/4] Makefile: generate Perl header from template file

2017-12-05 Thread Junio C Hamano
Dan Jacques  writes:

> Thanks for checking! The patch that you quoted above looks like it's from
> this "v4" thread; however, the patch that you are diffing against in your
> latest reply seems like it is from an earlier version.
>
> I believe that the $(pathsep) changes in your proposed patch are already
> present in v4,...

You're of course right.  The patches I had in my tree are outdated.

Will replace, even though I won't be merging them to 'pu' while we
wait for Ævar's perl build procedure update to stabilize.

Thanks.


Re: [PATCH v4 1/4] Makefile: generate Perl header from template file

2017-12-05 Thread Dan Jacques
Johannes Sixt wrote:

> I don't know what I tested last week; most likely not the version of the
> patch I quoted above.
>
> Today's version, with the tip at 5d7f59c391ce, is definitely bogus
> with its quoting. It needs the patch below, otherwise an unquoted
> semicolon may be expanded from $(pathsep). This would terminate the sed
> command, of course.

Thanks for checking! The patch that you quoted above looks like it's from
this "v4" thread; however, the patch that you are diffing against in your
latest reply seems like it is from an earlier version.

I believe that the $(pathsep) changes in your proposed patch are already
present in v4, having been caught by Johannes, so perhaps you were using
an older version of the patch as the diffbase? That change only appeared in
v4, so any older version would have had the previous incorrect quoting.

I think the reason that is necessary to remove the single quotes is not
because the ";" is terminating the `sed` expression, but because it's
terminating the entire shell stanza, causing the shell to execute two
commands:

1) {"sed" "-e" "s=@@PATHSEP@@="}
2) {"=g" "-e" "s=@@INSTLIBDIR@@=..." ...}

By including the ";" within the single-tick-protected `sed` argument, the
shell receives the entire block, "s=@@PATHSEP@@=;=", as a single string.

Also see Johannes' explanation here:
https://github.com/danjacques/git/pull/1/commits/57dc265478692ea2a395fc61fa122cb73ce58dc3

Could you apply this v4 patch series and confirm that it is working for
you? Or, if I am mistaken and your patch is correctly against this "v4"
patch series, let me know and I'll try and figure out what I'm missing.

Thanks again!
-Dan


Re: [PATCH v4 1/4] Makefile: generate Perl header from template file

2017-12-05 Thread Junio C Hamano
Johannes Sixt  writes:

> Today's version, with the tip at 5d7f59c391ce, is definitely bogus
> with its quoting. It needs the patch below, otherwise an unquoted
> semicolon may be expanded from $(pathsep). This would terminate the sed
> command, of course.

Of course ;-)

Somehow I was lucky that the topic as been de-queued from 'pu' for
the past few days.

> diff --git a/Makefile b/Makefile
> index 484dc44ade..a658c8169a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2071,10 +2071,10 @@ GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) 
> GIT-PERL-DEFINES perl/perl.mak
>   INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory 
> instlibdir` && \
>   INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \
>   INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \
> - sed -e 's=@@PATHSEP@@='$(pathsep)'=g' \
> - -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \
> - -e 's=@@GITEXECDIR@@='$(gitexecdir_relative_SQ)'=g' \
> - -e 's=@@PERLLIBDIR@@='$(perllibdir_relative_SQ)'=g' \
> + sed -e 's=@@PATHSEP@@=$(pathsep)=g' \
> + -e 's=@@INSTLIBDIR@@='"$$INSTLIBDIR"'=g' \
> + -e 's=@@GITEXECDIR@@=$(gitexecdir_relative_SQ)=g' \
> + -e 's=@@PERLLIBDIR@@=$(perllibdir_relative_SQ)=g' \
>   $< >$@
>  
>  .PHONY: gitweb


Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue

2017-12-05 Thread Junio C Hamano
Jeff Hostetler  writes:

> + while (1) {
> + if (find_pack_entry(real, ))
> + break;
>  
> + /* Most likely it's a loose object. */
> + if (!sha1_loose_object_info(real, oi, flags))
> + return 0;
>  
> + /* Not a loose object; someone else may have just packed it. */
> + reprepare_packed_git();
> + if (find_pack_entry(real, ))
> + break;
>  
> + /* Check if it is a missing object */
> + if (fetch_if_missing && repository_format_partial_clone &&
> + !already_retried) {
> + fetch_object(repository_format_partial_clone, real);
> + already_retried = 1;
> + continue;
> + }
> +
> + return -1;
> + }

OK.  This "infinite" loop actually never runs more than twice,
thanks to "already-retried" thing.  It probably makes sense to
structure the code this way.

I suspect that this would also have worked better with the older
incarnation of the jk/fewer-pack-rescan topic where an optimization
used to be after that the initial queries to packs fail to find the
object, but the topic has since been updated to do its optimization
check much earlier in the function, so its interaction with this
topic would not be much of an issue anymore with or without this
rewrite.

In any case, I think this is a good change.



Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue

2017-12-05 Thread Junio C Hamano
Jeff Hostetler  writes:

> From: Jeff Hostetler 
>
> Signed-off-by: Jeff Hostetler 
> ---
>  sha1_file.c | 40 
>  1 file changed, 20 insertions(+), 20 deletions(-)

The second (i.e. this) part and the third part are not yet in
'next', so it will perfectly be fine to squash these into the
commits that introduces the issues that are corrected in this
"fixup".  The same comment applies to all the other "fixup" patches.



>
> diff --git a/sha1_file.c b/sha1_file.c
> index fc7718a..ce67f27 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1180,30 +1180,30 @@ int sha1_object_info_extended(const unsigned char 
> *sha1, struct object_info *oi,
>   }
>   }
>  
> -retry:
> - if (find_pack_entry(real, ))
> - goto found_packed;
> + while (1) {
> + if (find_pack_entry(real, ))
> + break;
>  
> - /* Most likely it's a loose object. */
> - if (!sha1_loose_object_info(real, oi, flags))
> - return 0;
> + /* Most likely it's a loose object. */
> + if (!sha1_loose_object_info(real, oi, flags))
> + return 0;
>  
> - /* Not a loose object; someone else may have just packed it. */
> - reprepare_packed_git();
> - if (find_pack_entry(real, ))
> - goto found_packed;
> -
> - /* Check if it is a missing object */
> - if (fetch_if_missing && repository_format_partial_clone &&
> - !already_retried) {
> - fetch_object(repository_format_partial_clone, real);
> - already_retried = 1;
> - goto retry;
> - }
> + /* Not a loose object; someone else may have just packed it. */
> + reprepare_packed_git();
> + if (find_pack_entry(real, ))
> + break;
>  
> - return -1;
> + /* Check if it is a missing object */
> + if (fetch_if_missing && repository_format_partial_clone &&
> + !already_retried) {
> + fetch_object(repository_format_partial_clone, real);
> + already_retried = 1;
> + continue;
> + }
> +
> + return -1;
> + }
>  
> -found_packed:
>   if (oi == _oi)
>   /*
>* We know that the caller doesn't actually need the


Re: [PATCH v4 1/4] Makefile: generate Perl header from template file

2017-12-05 Thread Johannes Sixt
Am 01.12.2017 um 18:25 schrieb Johannes Sixt:
> Am 01.12.2017 um 18:13 schrieb Johannes Schindelin:
>> Hi Hannes,
>>
>> On Fri, 1 Dec 2017, Johannes Sixt wrote:
>>
>>> Am 29.11.2017 um 16:56 schrieb Dan Jacques:
 @@ -1989,6 +1986,15 @@ GIT-PERL-DEFINES: FORCE
     echo "$$FLAGS" >$@; \
     fi
    +GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) GIT-PERL-DEFINES 
 perl/perl.mak
 Makefile
 +    $(QUIET_GEN)$(RM) $@ && \
 +    INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory
 instlibdir` && \
 +    INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \
 +    
 INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" &&
 \
 +    sed -e 's=@@PATHSEP@@=$(pathsep)=g' \
>>>
>>> This doesn't work, unfortunately. When $(pathsep) is ';', we get an 
>>> incomplete
>>> sed expression because ';' is also a command separator in the sed 
>>> language.
>>
>> Funny, I tried this also with ';' as pathsep, and it worked in the Git 
>> for
>> Windows SDK...
> 
> My ancient sed vs. your modern sed, perhaps? I can check this on Monday.

I don't know what I tested last week; most likely not the version of the
patch I quoted above.

Today's version, with the tip at 5d7f59c391ce, is definitely bogus
with its quoting. It needs the patch below, otherwise an unquoted
semicolon may be expanded from $(pathsep). This would terminate the sed
command, of course.


diff --git a/Makefile b/Makefile
index 484dc44ade..a658c8169a 100644
--- a/Makefile
+++ b/Makefile
@@ -2071,10 +2071,10 @@ GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) 
GIT-PERL-DEFINES perl/perl.mak
INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory 
instlibdir` && \
INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \
INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \
-   sed -e 's=@@PATHSEP@@='$(pathsep)'=g' \
-   -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \
-   -e 's=@@GITEXECDIR@@='$(gitexecdir_relative_SQ)'=g' \
-   -e 's=@@PERLLIBDIR@@='$(perllibdir_relative_SQ)'=g' \
+   sed -e 's=@@PATHSEP@@=$(pathsep)=g' \
+   -e 's=@@INSTLIBDIR@@='"$$INSTLIBDIR"'=g' \
+   -e 's=@@GITEXECDIR@@=$(gitexecdir_relative_SQ)=g' \
+   -e 's=@@PERLLIBDIR@@=$(perllibdir_relative_SQ)=g' \
$< >$@
 
 .PHONY: gitweb
-- 
2.14.2.808.g3bc32f2729


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-05 Thread Jonathan Nieder
Hi,

Jeff Hostetler wrote:
> On 12/2/2017 1:24 PM, Philip Oakley wrote:
>> From: "Jeff Hostetler" 
>> Sent: Friday, December 01, 2017 5:23 PM

>>> Discussing this feature in the context of the defense industry
>>> makes me a little nervous.  (I used to be in that area.)
>>
>> I'm viewing the desire for codebase partitioning from a soft layering
>> of risk view (perhaps a more UK than USA approach ;-)
>
> I'm not sure I know what this means or how the UK defense
> security models/policy/procedures are different from the US,
> so I can't say much here.  I'm just thinking that even if we
> get a *perfectly working* partial clone/fetch/push/etc. that
> it would not pass a security audit.  I might be wrong here
> (and I'm no expert on the subject), but I think they would
> push you towards a different solution architecture.

I'm pretty ignorant about the defense industry, but a few more
comments:

- gitolite implements some features on top of git's server code that I
  consider to be important for security.  So much so that I've been
  considering what it would take to remove the git-shell command from
  git.git and move it to the gitolite project where people would be
  better equipped to use it in an appropriate context

- in particular, git's reachability checking code could use some
  hardening/improvement.  In particular, think of edge cases like
  where someone pushes a pack with deltas referring to objects they
  should not be able to reach.

- Anyone willing to audit git code's security wins my approval.
  Please, please, audit git code and report the issues you find. :)

[...]
> Also omitting certain trees means you now (obviously) have both missing
> trees and blobs.  And both need to be dynamically or batch fetched as
> needed.  And certain operations will need multiple round trips to fully
> resolve -- fault in a tree and then fault in blobs referenced by it.

For omitting trees, we will need to modify the index format, since the
index has entries for all paths today.  That's on the roadmap but has
not been implemented yet.

Thanks,
Jonathan


Re: [PATCH v6 00/12] Partial clone part 2: fsck and promisors

2017-12-05 Thread Jonathan Tan
On Tue,  5 Dec 2017 16:58:42 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> This is V6 of part 2 of partial clone.  This assumes V6 of part 1
> is already present.  This version fixes a problem in fetch-pack
> observed in V5.  It also contains 2 "fixup" commits that are
> WIP responses to comments on V5.

A note on the fix of a problem in fetch-pack observed in V5: to do this,
I renamed the "no_haves" setting to "no_dependents", since this setting
now has a broader effect than merely suppressing "have" lines. This
setting is described in patch 7 ("introduce fetch-object: fetch one
promisor object").

> Part 2 is concerned with fsck, gc, initial support for dynamic
> object fetching, and tracking promisor objects.  Jonathan Tan
> originally developed this code.  I have moved it on top of
> part 1 and updated it slightly.

Thanks. I checked the diff between this and V5 and it looks as I
expected. ("git am -3" didn't work on the patches as e-mailed to the
list, though - I had to use the one hosted at GitHub [1].)

[1] https://github.com/jeffhostetler/git/tree/core/pc6_p2


Re: [WIP 1/2] submodule: refactor acquisition of superproject info

2017-12-05 Thread Stefan Beller
On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan  wrote:
> Signed-off-by: Jonathan Tan 
> ---
>  submodule.c | 76 
> +
>  submodule.h |  3 +++


This patch reads very similar to [1], which was a preparation part of the series
"[RFC PATCH 0/4] git-status reports relation to superproject"; there
we also need
the same information about the superproject. I'll take a closer look
and compare these
two patches if it turns out we want to go this way long term.

[1] https://public-inbox.org/git/20171108195509.7839-3-sbel...@google.com/


Re: [PATCH v1 1/2] t/README: remove mention of adding copyright notices

2017-12-05 Thread Jonathan Nieder
Hi,

Thomas Gummerer wrote:

> We generally no longer include copyright notices in new test scripts.
> However t/README still mentions it as something to include at the top of
> every new script.

Where can I read more about this change?  Was it a deliberate change
or something that simply happened?

Thanks,
Jonathan


Re: gitattributes not read for diff-tree anymore in 2.15?

2017-12-05 Thread Ben Boeckel
On Tue, Dec 05, 2017 at 10:16:45 -0800, Brandon Williams wrote:
> Perfect, thanks!

OK, attached is a shell script which recreates the issue. I haven't been
able to get it to happen without the `GIT_WORK_TREE` and `GIT_INDEX_FILE`
setup involved, so that seems to be important.

I reran the bisect using this script and came up with this commit:

commit be4ca290570f9173db64ea1f925b5b3831c6efed
Author: David Turner 
Date:   Thu Apr 20 16:41:18 2017 -0400

Increase core.packedGitLimit



which seems even less relevant…

Thanks,

--Ben


diff-tree-break.sh
Description: Bourne shell script


Re: [PATCH v1 2/2] t/README: document test_cmp_rev

2017-12-05 Thread Jonathan Nieder
Hi,

Thomas Gummerer wrote:

> test_cmp_rev is a useful function that's used in quite a few test
> scripts.  It is however not documented in t/README.  Document it.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  t/README | 5 +
>  1 file changed, 5 insertions(+)

Reviewed-by: Jonathan Nieder 

I admit I usually go straight to t/test-lib-functions.sh when I want
to find an appropriate helper.  I think this kind of introductory
documentation in t/README is still useful, though.

Thanks,
Jonathan


RE: [RFE] Inverted sparseness (amended)

2017-12-05 Thread Randall S. Becker
On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom: 
>From: "Randall S. Becker" 
>Sent: Friday, December 01, 2017 6:31 PM
>> On December 1, 2017 1:19 PM, Jeff Hostetler wrote:
>>>On 12/1/2017 12:21 PM, Randall S. Becker wrote:
 I recently encountered a really strange use-case relating to sparse 
 clone/fetch that is really backwards from the discussion that has 
 been going on, and well, I'm a bit embarrassed to bring it up, but 
 I have no good solution including building a separate data store 
 that will end up inconsistent with repositories (a bad solution).  
 The use-case is as
 follows:

 Given a backbone of multiple git repositories spread across an 
 organization with a server farm and upstream vendors.
 The vendor delivers code by having the client perform git pull into 
 a specific branch.
 The customer may take the code as is or merge in customizations.
 The vendor wants to know exactly what commit of theirs is installed 
 on each server, in near real time.
 The customer is willing to push the commit-ish to the vendor's 
 upstream repo but does not want, by default, to share the actual 
 commit contents for security reasons.
 Realistically, the vendor needs to know that their own commit id 
 was put somewhere (process exists to track this, so not part of the
 use-case) and whether there is a subsequent commit contributed >by 
 the customer, but the content is not relevant initially.

 After some time, the vendor may request the commit contents from 
 the customer in order to satisfy support requirements - a.k.a. a 
 defect was found but has to be resolved.
 The customer would then perform a deeper push that looks a lot like 
 a "slightly" symmetrical operation of a deep fetch following a 
 prior sparse fetch to supply the vendor with the specific commit(s).
>>
>>>Perhaps I'm not understanding the subtleties of what you're 
>>>describing, but could you do this with stock git functionality.
>>
>>>Let the vendor publish a "well known branch" for the client.
>>>Let the client pull that and build.
>>>Let the client create a branch set to the same commit that they fetched.
>>>Let the client push that branch as a client-specific branch to the 
>>>vendor to indicate that that is the official release they are based on.
>>
>>>Then the vendor would know the official commit that the client was using.
>> This is the easy part, and it doesn't require anything sparse to exist.
>>
>>>If the client makes local changes, does the vendor really need the 
>>>SHA of those -- without the actual content?
>>>I mean any SHA would do right?  Perhaps let the client create a 
>>>second client-specific branch (set to  the same commit as the first) 
>>>to indicate they had mods.
>>>Later, when the vendor needs the actual client changes, the client 
>>>does a normal push to this 2nd client-specific branch at the vendor.
>>>This would send everything that the client has done to the code since 
>>>the official release.
>>
>> What I should have added to the use-case was that there is a strong 
>> audit requirement (regulatory, actually) involved that the SHA is 
>> exact, immutable, and cannot be substitute or forged (one of the 
>> reasons git is in such high regard). So, no I can't arrange a fake 
>> SHA to represent a SHA to be named later. It SHA of the installed 
>> commit is part of the official record of what happened on the specific 
>> server, so I'm stuck with it.
>>
>>>I'm not sure what you mean about "it is inside a tree".
>>
>> m---a---b---c---H1
>>  `---d---H2
>>
>> d would be at a head. b would be inside. Determining content of c is 
>> problematic if b is sparse, so I'm really unsure that any of this is 
>> possible.

>I think I get the jist of your use case. Would I be right that you 
>don't have a true working solution yet? i.e. that it's a problem that is 
>almost sorted but falls down at the last step.

>If one pretended that this was a single development shop, and the 
>various vendors, clients and customers as being independent devolopers, 
>each of whom is over protective of their code, it may give a better view that 
>maps onto classic feature development diagrams.
>(i.e draw the answer for local devs, then mark where the splits happen)

>In particular, I think you could use a notional regulator's view that 
>the whole code base is part of a large Git heirarchy of branches and 
>merges, and that some of the feature loops are only available via the 
>particular developer that worked on that feature.

>This would mean that from a regulatory overview there is a merge commit in the 
>'main'
>(master) heirachy that has the main and feature commits listed, and the 
>feature commit is probably an --allow-empty commit (that has an empty 
>tree if they are that paranoid) that says 'function X released' (and 
>probably tagged), and that release commit 

Re: [PATCH v2] pathspec: only match across submodule boundaries when requested

2017-12-05 Thread Junio C Hamano
Brandon Williams  writes:

> Commit 74ed43711fd (grep: enable recurse-submodules to work on 
> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> match across submodule boundaries in the presence of wildcards.  This is
> done by performing literal matching up to the first wildcard and then
> punting to the submodule itself to perform more accurate pattern
> matching.  Instead of introducing a new flag to request this behavior,
> commit 74ed43711fd overloaded the already existing 'recursive' flag in
> 'struct pathspec' to request this behavior.
>
> This leads to a bug where whenever any other caller has the 'recursive'
> flag set as well as a pathspec with wildcards that all submodules will
> be indicated as matches.  One simple example of this is:
>
>   git init repo
>   cd repo
>
>   git init submodule
>   git -C submodule commit -m initial --allow-empty
>
>   touch "[bracket]"
>   git add "[bracket]"
>   git commit -m bracket
>   git add submodule
>   git commit -m submodule
>
>   git rev-list HEAD -- "[bracket]"
>
> Fix this by introducing the new flag 'recurse_submodules' in 'struct
> pathspec' and using this flag to determine if matches should be allowed
> to cross submodule boundaries.

Makes sense.

I initially misread the title

"pathspec: only match across submodule boundaries when requested"

to be saying that the match happens only at the boundary, but that
"only" is not about where the match happens.

"pathspec: match across submodule boundaries only when requested"

would have avoided such a confusion.

> diff --git a/builtin/grep.c b/builtin/grep.c
> index 5a6cfe6b4..3ca4ac80d 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
>  prefix, argv + i);
>   pathspec.max_depth = opt.max_depth;
>   pathspec.recursive = 1;
> + pathspec.recurse_submodules = !!recurse_submodules;

With the current code, recurse_submodules can only be 0 or 1 (the
only assignment is from the return value of git_config_bool()), so
the force-boolean !! is not strictly needed, but it may be a good
future-proofing measure.

Will queue; thanks.


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-05 Thread Jeff Hostetler



On 12/2/2017 1:24 PM, Philip Oakley wrote:

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 5:23 PM

On 11/30/2017 6:43 PM, Philip Oakley wrote:

[...]


Discussing this feature in the context of the defense industry
makes me a little nervous.  (I used to be in that area.)


I'm viewing the desire for codebase partitioning from a soft layering
of risk view (perhaps a more UK than USA approach ;-)


I'm not sure I know what this means or how the UK defense
security models/policy/procedures are different from the US,
so I can't say much here.  I'm just thinking that even if we
get a *perfectly working* partial clone/fetch/push/etc. that
it would not pass a security audit.  I might be wrong here
(and I'm no expert on the subject), but I think they would
push you towards a different solution architecture.





What we have in the code so far may be a nice start, but
probably doesn't have the assurances that you would need
for actual deployment.  But it's a start


True. I need to get some of my collegues more engaged...



[...]

Yes, this does tend to lead towards an always-online mentality.
However, there are 2 parts:
[a] dynamic object fetching for missing objects, such as during a
    random command like diff or blame or merge.  We need this
    regardless of usage -- because we can't always predict (or
    dry-run) every command the user might run in advance.


Making something "useful" happen here when off-line is an obvious goal.


[b] batch fetch mode, such as using partial-fetch to match your
    sparse-checkout so that you always have the blobs of interest
    to you.  And assuming you don't wander outside of this subset
    of the tree, you should be able to work offline as usual.
If you can work within the confines of [b], you wouldn't need to
always be online.


I feel this is the area that does need ensure a capability to avoid
any perception of the much maligned 'Embrace, extend, and extinguish'> by 
accidental lockout.

I don't think this should be viewed as a type of sparse checkout -
it's just a checkout of what you have (under the hood it could use
the same code though).


Right, I'm only thinking of this effort as a way to get a partial
clone and fetch that omits unneeded (or, not immediately needed)
objects for performance reasons.  There are several use scenarios
that I've discussed and sparse-checkout is one of them, but I do
not consider this to be a sparse-checkout feature.

 
[...]


The main problem with markers or other lists of missing objects is
that it has scale problems for large repos.  Suppose I have 100M
blobs in my repo.  If I do a blob:none clone, I'd have 100M missing
blobs that would need tracking.  If I then do a batch fetch of the
blobs needed to do a sparse checkout of HEAD, I'd have to remove
those entries from the tracking data.  Not impossible, but not
speedy either.


** Ahhh. I see. That's a consequence of having all the trees isn't it. **

I've always thought that limiting the trees is at the heart of the Narrow 
clone/fetch problem.

OK so if you have flat, wide structures with 10k files/directories per tree 
then it's still a fair sized problem, but it should *scale logarithmically* for 
the part of the tree structure that's not being downloaded.

You never have to add a marker for a blob that you have no containing tree for. 
Nor for the tree that contained the blob's tree, all the way up to primary line 
of descent to the tree of concern. All those trees are never down loaded, there 
are few markers (.gitNarrowTree files) for those tree stubs, certainly no 100M 
missing blob markers.


Currently, the code only omits blobs.  I want to extend the current
code to have filters that also exclude unneeded trees.  That will help
address some of these size concerns, but there are still perf issues
here.



* Marking of 'missing' objects in the local object store, and on the wire.
The missing objects are replaced by a place holder object, which used the
same oid/sha1, but has a short fixed length, with content “GitNarrowObject
”. The chance that that string would actually have such an oid clash is
the same as all other object hashes, so is a *safe* self-referential device.


Again, there is a scale problem here.  If I have 100M missing blobs,
I can't afford to create 100M loose place holder files.  Or juggle
a 2GB file of missing objects on various operations.


As above, I'm also trimming the trees, so in general, there would be no missing 
 blobs, just the content of the directory one was interested in.

That's not quite true if higher level trees have blob references in them that 
are otherwise unwanted - they may each need a marker. [Or maybe a special 
single 'tree-of-blobs' marker for them all thus only one marker per tree - 
over-thinking maybe...]


Also omitting certain trees means you now (obviously) have both missing
trees and blobs.  And both need to be dynamically or batch fetched as
needed.  And certain operations 

Re: gitattributes not read for diff-tree anymore in 2.15?

2017-12-05 Thread Brandon Williams
On 12/05, Ben Boeckel wrote:
> On Mon, Dec 04, 2017 at 15:03:55 -0800, Brandon Williams wrote:
> > Reading the attributes files should be done regardless if the gitmodules
> > file is read.  The gitmodules file should only come into play if you are
> > dealing with submodules.
> 
> Yeah, it doesn't seem to make sense why this commit breaks us, but there
> it is *shrug*.

While it doesn't make the most sense, I still wouldn't be surprised if I
missed something when writing that patch that inadvertently caused an
issue.

> 
> > Do you mind providing a reproduction recipe with expected outcome vs
> > actual outcome and I can take a closer look.
> 
> I'll work on one. It isn't as simple as I thought it was :) . The setup
> we do before running the checks is apparently involved as running it
> from the command line is not exhibiting the difference.
> 
> --Ben

Perfect, thanks!

-- 
Brandon Williams


[PATCH v3 9/9] t3404: add test case for abbreviated commands

2017-12-05 Thread Liam Beguin
Make sure the todo list ends up using single-letter command
abbreviations when the rebase.abbreviateCommands is enabled.
This configuration option should not change anything else.

Signed-off-by: Liam Beguin 
---
 t/t3404-rebase-interactive.sh | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 6a82d1ed876d..481a3500900d 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1260,6 +1260,28 @@ test_expect_success 'rebase -i respects 
rebase.missingCommitsCheck = error' '
test B = $(git cat-file commit HEAD^ | sed -ne \$p)
 '
 
+test_expect_success 'respects rebase.abbreviateCommands with fixup, squash and 
exec' '
+   rebase_setup_and_clean abbrevcmd &&
+   test_commit "first" file1.txt "first line" first &&
+   test_commit "second" file1.txt "another line" second &&
+   test_commit "fixup! first" file2.txt "first line again" first_fixup &&
+   test_commit "squash! second" file1.txt "another line here" 
second_squash &&
+   cat >expected <<-EOF &&
+   p $(git rev-list --abbrev-commit -1 first) first
+   f $(git rev-list --abbrev-commit -1 first_fixup) fixup! first
+   x git show HEAD
+   p $(git rev-list --abbrev-commit -1 second) second
+   s $(git rev-list --abbrev-commit -1 second_squash) squash! second
+   x git show HEAD
+   EOF
+   git checkout abbrevcmd &&
+   set_cat_todo_editor &&
+   test_config rebase.abbreviateCommands true &&
+   test_must_fail git rebase -i --exec "git show HEAD" \
+   --autosquash master >actual &&
+   test_cmp expected actual
+'
+
 test_expect_success 'static check of bad command' '
rebase_setup_and_clean bad-cmd &&
set_fake_editor &&
-- 
2.15.1.280.g10402c1f5b5c



[PATCH v3 5/9] rebase -i: replace reference to sha1 with oid

2017-12-05 Thread Liam Beguin
Since we are trying to abstract the hash function name elsewhere in the
code base, lets use OID instead of SHA-1 in the rebase--helper too.

Signed-off-by: Liam Beguin 
---
 builtin/rebase--helper.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 8ad4779d1650..c3b8e4d401f8 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
struct replay_opts opts = REPLAY_OPTS_INIT;
int keep_empty = 0;
enum {
-   CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
+   CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
} command = 0;
struct option options[] = {
@@ -27,9 +27,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_CMDMODE(0, "make-script", ,
N_("make rebase script"), MAKE_SCRIPT),
OPT_CMDMODE(0, "shorten-ids", ,
-   N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S),
+   N_("shorten commit ids in the todo list"), 
SHORTEN_OIDS),
OPT_CMDMODE(0, "expand-ids", ,
-   N_("expand SHA-1s in the todo list"), EXPAND_SHA1S),
+   N_("expand commit ids in the todo list"), EXPAND_OIDS),
OPT_CMDMODE(0, "check-todo-list", ,
N_("check the todo list"), CHECK_TODO_LIST),
OPT_CMDMODE(0, "skip-unnecessary-picks", ,
@@ -54,9 +54,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!sequencer_remove_state();
if (command == MAKE_SCRIPT && argc > 1)
return !!sequencer_make_script(keep_empty, stdout, argc, argv);
-   if (command == SHORTEN_SHA1S && argc == 1)
+   if (command == SHORTEN_OIDS && argc == 1)
return !!transform_todos(1);
-   if (command == EXPAND_SHA1S && argc == 1)
+   if (command == EXPAND_OIDS && argc == 1)
return !!transform_todos(0);
if (command == CHECK_TODO_LIST && argc == 1)
return !!check_todo_list();
-- 
2.15.1.280.g10402c1f5b5c



[PATCH v2 0/9] rebase -i: add config to abbreviate command names

2017-12-05 Thread Liam Beguin
Hi everyone,

This series will add the 'rebase.abbreviateCommands' configuration
option to allow `git rebase -i` to default to the single-letter command
names when generating the todo list.

Using single-letter command names can present two benefits. First, it
makes it easier to change the action since you only need to replace a
single character (i.e.: in vim "r" instead of
"ciw").  Second, using this with a large enough value of
'core.abbrev' enables the lines of the todo list to remain aligned
making the files easier to read.

Changes in V2:
- Refactor and rename 'transform_todo_ids'
- Replace SHA-1 by OID in rebase--helper.c
- Update todo list related functions to take a generic 'flags' parameter
- Rename 'add_exec_commands' function to 'sequencer_add_exec_commands'
- Rename 'add-exec' option to 'add-exec-commands'
- Use 'strbur_read_file' instead of rewriting it
- Make 'command_to_char' return 'comment_char_line' if no single-letter
  command name is defined
- Combine both tests into a single test case
- Update commit messages

Changes in V2:
- Rename 'transform_todo_insn' to 'transform_todos'
- Fix flag name TODO_LIST_SHORTE{D,N}_IDS

Liam Beguin (9):
  Documentation: move rebase.* configs to new file
  Documentation: use preferred name for the 'todo list' script
  rebase -i: set commit to null in exec commands
  rebase -i: refactor transform_todo_ids
  rebase -i: replace reference to sha1 with oid
  rebase -i: update functions to use a flags parameter
  rebase -i -x: add exec commands via the rebase--helper
  rebase -i: learn to abbreviate command names
  t3404: add test case for abbreviated commands

 Documentation/config.txt|  31 +---
 Documentation/git-rebase.txt|  19 +
 Documentation/rebase-config.txt |  52 +
 builtin/rebase--helper.c|  29 +---
 git-rebase--interactive.sh  |  23 +-
 sequencer.c | 126 +---
 sequencer.h |  10 ++-
 t/t3404-rebase-interactive.sh   |  22 ++
 8 files changed, 186 insertions(+), 126 deletions(-)
 create mode 100644 Documentation/rebase-config.txt

-- 
2.15.1.280.g10402c1f5b5c



[PATCH v3 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread Liam Beguin
Update functions used in the rebase--helper so that they take a generic
'flags' parameter instead of a growing list of options.

Signed-off-by: Liam Beguin 
---
 builtin/rebase--helper.c | 13 +++--
 sequencer.c  |  9 +
 sequencer.h  |  8 +---
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index c3b8e4d401f8..1102ecb43b67 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -12,7 +12,7 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
-   int keep_empty = 0;
+   unsigned flags = 0, keep_empty = 0;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
@@ -48,16 +48,17 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
argc = parse_options(argc, argv, NULL, options,
builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0);
 
+   flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
+   flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
+
if (command == CONTINUE && argc == 1)
return !!sequencer_continue();
if (command == ABORT && argc == 1)
return !!sequencer_remove_state();
if (command == MAKE_SCRIPT && argc > 1)
-   return !!sequencer_make_script(keep_empty, stdout, argc, argv);
-   if (command == SHORTEN_OIDS && argc == 1)
-   return !!transform_todos(1);
-   if (command == EXPAND_OIDS && argc == 1)
-   return !!transform_todos(0);
+   return !!sequencer_make_script(stdout, argc, argv, flags);
+   if ((command == SHORTEN_OIDS || command == EXPAND_OIDS) && argc == 1)
+   return !!transform_todos(flags);
if (command == CHECK_TODO_LIST && argc == 1)
return !!check_todo_list();
if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
diff --git a/sequencer.c b/sequencer.c
index c9a661a8c4bd..8b0dd610c881 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2444,14 +2444,15 @@ void append_signoff(struct strbuf *msgbuf, int 
ignore_footer, unsigned flag)
strbuf_release();
 }
 
-int sequencer_make_script(int keep_empty, FILE *out,
-   int argc, const char **argv)
+int sequencer_make_script(FILE *out, int argc, const char **argv,
+ unsigned flags)
 {
char *format = NULL;
struct pretty_print_context pp = {0};
struct strbuf buf = STRBUF_INIT;
struct rev_info revs;
struct commit *commit;
+   int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
 
init_revisions(, NULL);
revs.verbose_header = 1;
@@ -2494,7 +2495,7 @@ int sequencer_make_script(int keep_empty, FILE *out,
 }
 
 
-int transform_todos(int shorten_ids)
+int transform_todos(unsigned flags)
 {
const char *todo_file = rebase_path_todo();
struct todo_list todo_list = TODO_LIST_INIT;
@@ -2522,7 +2523,7 @@ int transform_todos(int shorten_ids)
 
/* add commit id */
if (item->commit) {
-   const char *oid = shorten_ids ?
+   const char *oid = flags & TODO_LIST_SHORTEN_IDS ?
  short_commit_name(item->commit) :
  oid_to_hex(>commit->object.oid);
 
diff --git a/sequencer.h b/sequencer.h
index 4f7f2c93f83e..68284e9762c8 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -45,10 +45,12 @@ int sequencer_continue(struct replay_opts *opts);
 int sequencer_rollback(struct replay_opts *opts);
 int sequencer_remove_state(struct replay_opts *opts);
 
-int sequencer_make_script(int keep_empty, FILE *out,
-   int argc, const char **argv);
+#define TODO_LIST_KEEP_EMPTY (1U << 0)
+#define TODO_LIST_SHORTEN_IDS (1U << 1)
+int sequencer_make_script(FILE *out, int argc, const char **argv,
+ unsigned flags);
 
-int transform_todos(int shorten_ids);
+int transform_todos(unsigned flags);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
 int rearrange_squash(void);
-- 
2.15.1.280.g10402c1f5b5c



[PATCH v3 7/9] rebase -i -x: add exec commands via the rebase--helper

2017-12-05 Thread Liam Beguin
Recent work on `git-rebase--interactive` aims to convert shell code to
C. Even if this is most likely not a big performance enhancement, let's
convert it too since a coming change to abbreviate command names
requires it to be updated.

Signed-off-by: Liam Beguin 
---
 builtin/rebase--helper.c   |  7 ++-
 git-rebase--interactive.sh | 23 +-
 sequencer.c| 39 ++
 sequencer.h|  1 +
 4 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 1102ecb43b67..4229ea0dc122 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -15,7 +15,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
unsigned flags = 0, keep_empty = 0;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
-   CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
+   CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
+   ADD_EXEC
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -36,6 +37,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
OPT_CMDMODE(0, "rearrange-squash", ,
N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
+   OPT_CMDMODE(0, "add-exec-commands", ,
+   N_("insert exec commands in todo list"), ADD_EXEC),
OPT_END()
};
 
@@ -65,5 +68,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!skip_unnecessary_picks();
if (command == REARRANGE_SQUASH && argc == 1)
return !!rearrange_squash();
+   if (command == ADD_EXEC && argc == 2)
+   return !!sequencer_add_exec_commands(argv[1]);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 437815669f00..e3f5a0abf3c7 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -722,27 +722,6 @@ collapse_todo_ids() {
git rebase--helper --shorten-ids
 }
 
-# Add commands after a pick or after a squash/fixup series
-# in the todo list.
-add_exec_commands () {
-   {
-   first=t
-   while read -r insn rest
-   do
-   case $insn in
-   pick)
-   test -n "$first" ||
-   printf "%s" "$cmd"
-   ;;
-   esac
-   printf "%s %s\n" "$insn" "$rest"
-   first=
-   done
-   printf "%s" "$cmd"
-   } <"$1" >"$1.new" &&
-   mv "$1.new" "$1"
-}
-
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
@@ -982,7 +961,7 @@ fi
 
 test -s "$todo" || echo noop >> "$todo"
 test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
-test -n "$cmd" && add_exec_commands "$todo"
+test -n "$cmd" && git rebase--helper --add-exec-commands "$cmd"
 
 todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
 todocount=${todocount##* }
diff --git a/sequencer.c b/sequencer.c
index 8b0dd610c881..892d242f6966 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2494,6 +2494,45 @@ int sequencer_make_script(FILE *out, int argc, const 
char **argv,
return 0;
 }
 
+/*
+ * Add commands after pick and (series of) squash/fixup commands
+ * in the todo list.
+ */
+int sequencer_add_exec_commands(const char *commands)
+{
+   const char *todo_file = rebase_path_todo();
+   struct todo_list todo_list = TODO_LIST_INIT;
+   struct todo_item *item;
+   struct strbuf *buf = _list.buf;
+   size_t offset = 0, commands_len = strlen(commands);
+   int i, first;
+
+   if (strbuf_read_file(_list.buf, todo_file, 0) < 0)
+   return error(_("could not read '%s'."), todo_file);
+
+   if (parse_insn_buffer(todo_list.buf.buf, _list)) {
+   todo_list_release(_list);
+   return error(_("unusable todo list: '%s'"), todo_file);
+   }
+
+   first = 1;
+   /* insert  before every pick except the first one */
+   for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
+   if (item->command == TODO_PICK && !first) {
+   strbuf_insert(buf, item->offset_in_buf + offset,
+ commands, commands_len);
+   offset += commands_len;
+   }
+   first = 0;
+   }
+
+   /* append final  */
+   strbuf_add(buf, commands, 

[PATCH v3 4/9] rebase -i: refactor transform_todo_ids

2017-12-05 Thread Liam Beguin
The transform_todo_ids function is a little hard to read. Lets try
to make it easier by using more of the strbuf API. Also, since we'll
soon be adding command abbreviations, let's rename the function so
it's name reflects that change.

Signed-off-by: Liam Beguin 
---
 builtin/rebase--helper.c |  4 +--
 sequencer.c  | 69 
 sequencer.h  |  2 +-
 3 files changed, 31 insertions(+), 44 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index f8519363a393..8ad4779d1650 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -55,9 +55,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
if (command == MAKE_SCRIPT && argc > 1)
return !!sequencer_make_script(keep_empty, stdout, argc, argv);
if (command == SHORTEN_SHA1S && argc == 1)
-   return !!transform_todo_ids(1);
+   return !!transform_todos(1);
if (command == EXPAND_SHA1S && argc == 1)
-   return !!transform_todo_ids(0);
+   return !!transform_todos(0);
if (command == CHECK_TODO_LIST && argc == 1)
return !!check_todo_list();
if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
diff --git a/sequencer.c b/sequencer.c
index 5033b049d995..c9a661a8c4bd 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2494,60 +2494,47 @@ int sequencer_make_script(int keep_empty, FILE *out,
 }
 
 
-int transform_todo_ids(int shorten_ids)
+int transform_todos(int shorten_ids)
 {
const char *todo_file = rebase_path_todo();
struct todo_list todo_list = TODO_LIST_INIT;
-   int fd, res, i;
-   FILE *out;
+   struct strbuf buf = STRBUF_INIT;
+   struct todo_item *item;
+   int i;
 
-   strbuf_reset(_list.buf);
-   fd = open(todo_file, O_RDONLY);
-   if (fd < 0)
-   return error_errno(_("could not open '%s'"), todo_file);
-   if (strbuf_read(_list.buf, fd, 0) < 0) {
-   close(fd);
+   if (strbuf_read_file(_list.buf, todo_file, 0) < 0)
return error(_("could not read '%s'."), todo_file);
-   }
-   close(fd);
 
-   res = parse_insn_buffer(todo_list.buf.buf, _list);
-   if (res) {
+   if (parse_insn_buffer(todo_list.buf.buf, _list)) {
todo_list_release(_list);
return error(_("unusable todo list: '%s'"), todo_file);
}
 
-   out = fopen(todo_file, "w");
-   if (!out) {
-   todo_list_release(_list);
-   return error(_("unable to open '%s' for writing"), todo_file);
-   }
-   for (i = 0; i < todo_list.nr; i++) {
-   struct todo_item *item = todo_list.items + i;
-   int bol = item->offset_in_buf;
-   const char *p = todo_list.buf.buf + bol;
-   int eol = i + 1 < todo_list.nr ?
-   todo_list.items[i + 1].offset_in_buf :
-   todo_list.buf.len;
-
-   if (item->command >= TODO_EXEC && item->command != TODO_DROP)
-   fwrite(p, eol - bol, 1, out);
-   else {
-   const char *id = shorten_ids ?
-   short_commit_name(item->commit) :
-   oid_to_hex(>commit->object.oid);
-   int len;
-
-   p += strspn(p, " \t"); /* left-trim command */
-   len = strcspn(p, " \t"); /* length of command */
-
-   fprintf(out, "%.*s %s %.*s\n",
-   len, p, id, item->arg_len, item->arg);
+   for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) {
+   /* if the item is not a command write it and continue */
+   if (item->command >= TODO_COMMENT) {
+   strbuf_addf(, "%.*s\n", item->arg_len, item->arg);
+   continue;
+   }
+
+   /* add command to the buffer */
+   strbuf_addstr(, command_to_string(item->command));
+
+   /* add commit id */
+   if (item->commit) {
+   const char *oid = shorten_ids ?
+ short_commit_name(item->commit) :
+ oid_to_hex(>commit->object.oid);
+
+   strbuf_addf(, " %s", oid);
}
+   /* add all the rest */
+   strbuf_addf(, " %.*s\n", item->arg_len, item->arg);
}
-   fclose(out);
+
+   i = write_message(buf.buf, buf.len, todo_file, 0);
todo_list_release(_list);
-   return 0;
+   return i;
 }
 
 enum check_level {
diff --git a/sequencer.h b/sequencer.h
index 6f3d3df82c0a..4f7f2c93f83e 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -48,7 +48,7 @@ int sequencer_remove_state(struct replay_opts *opts);
 int 

[PATCH v3 2/9] Documentation: use preferred name for the 'todo list' script

2017-12-05 Thread Liam Beguin
Use "todo list" instead of "instruction list" or "todo-list" to
reduce further confusion regarding the name of this script.

Signed-off-by: Liam Beguin 
---
 Documentation/rebase-config.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt
index dba088d7c68f..30ae08cb5a4b 100644
--- a/Documentation/rebase-config.txt
+++ b/Documentation/rebase-config.txt
@@ -23,10 +23,10 @@ rebase.missingCommitsCheck::
--edit-todo' can then be used to correct the error. If set to
"ignore", no checking is done.
To drop a commit without warning or error, use the `drop`
-   command in the todo-list.
+   command in the todo list.
Defaults to "ignore".
 
 rebase.instructionFormat::
A format string, as specified in linkgit:git-log[1], to be used for the
-   instruction list during an interactive rebase.  The format will
+   todo list during an interactive rebase.  The format will
automatically have the long commit hash prepended to the format.
-- 
2.15.1.280.g10402c1f5b5c



[PATCH v3 8/9] rebase -i: learn to abbreviate command names

2017-12-05 Thread Liam Beguin
`git rebase -i` already know how to interpret single-letter command
names. Teach it to generate the todo list with these same abbreviated
names.

Based-on-patch-by: Johannes Schindelin 
Signed-off-by: Liam Beguin 
---
 Documentation/rebase-config.txt | 20 
 builtin/rebase--helper.c|  3 +++
 sequencer.c | 16 ++--
 sequencer.h |  1 +
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt
index 30ae08cb5a4b..42e1ba757564 100644
--- a/Documentation/rebase-config.txt
+++ b/Documentation/rebase-config.txt
@@ -30,3 +30,23 @@ rebase.instructionFormat::
A format string, as specified in linkgit:git-log[1], to be used for the
todo list during an interactive rebase.  The format will
automatically have the long commit hash prepended to the format.
+
+rebase.abbreviateCommands::
+   If set to true, `git rebase` will use abbreviated command names in the
+   todo list resulting in something like this:
++
+---
+   p deadbee The oneline of the commit
+   p fa1afe1 The oneline of the next commit
+   ...
+---
++
+instead of:
++
+---
+   pick deadbee The oneline of the commit
+   pick fa1afe1 The oneline of the next commit
+   ...
+---
++
+Defaults to false.
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 4229ea0dc122..7daee544b7b4 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,6 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
unsigned flags = 0, keep_empty = 0;
+   int abbreviate_commands = 0;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
@@ -43,6 +44,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
};
 
git_config(git_default_config, NULL);
+   git_config_get_bool("rebase.abbreviatecommands", _commands);
 
opts.action = REPLAY_INTERACTIVE_REBASE;
opts.allow_ff = 1;
@@ -52,6 +54,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0);
 
flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0;
+   flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0;
flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
 
if (command == CONTINUE && argc == 1)
diff --git a/sequencer.c b/sequencer.c
index 892d242f6966..115085d39ca8 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -795,6 +795,13 @@ static const char *command_to_string(const enum 
todo_command command)
die("Unknown command: %d", command);
 }
 
+static const char command_to_char(const enum todo_command command)
+{
+   if (command < TODO_COMMENT && todo_command_info[command].c)
+   return todo_command_info[command].c;
+   return comment_line_char;
+}
+
 static int is_noop(const enum todo_command command)
 {
return TODO_NOOP <= command;
@@ -2453,6 +2460,7 @@ int sequencer_make_script(FILE *out, int argc, const char 
**argv,
struct rev_info revs;
struct commit *commit;
int keep_empty = flags & TODO_LIST_KEEP_EMPTY;
+   const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick";
 
init_revisions(, NULL);
revs.verbose_header = 1;
@@ -2485,7 +2493,8 @@ int sequencer_make_script(FILE *out, int argc, const char 
**argv,
strbuf_reset();
if (!keep_empty && is_original_commit_empty(commit))
strbuf_addf(, "%c ", comment_line_char);
-   strbuf_addf(, "pick %s ", oid_to_hex(>object.oid));
+   strbuf_addf(, "%s %s ", insn,
+   oid_to_hex(>object.oid));
pretty_print_commit(, commit, );
strbuf_addch(, '\n');
fputs(buf.buf, out);
@@ -2558,7 +2567,10 @@ int transform_todos(unsigned flags)
}
 
/* add command to the buffer */
-   strbuf_addstr(, command_to_string(item->command));
+   if (flags & TODO_LIST_ABBREVIATE_CMDS)
+   strbuf_addch(, command_to_char(item->command));
+   else
+   strbuf_addstr(, command_to_string(item->command));
 
/* add commit id */
if (item->commit) {
diff --git a/sequencer.h b/sequencer.h
index 212426c44548..81f6d7d393fd 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -47,6 +47,7 @@ int sequencer_remove_state(struct replay_opts 

[PATCH v3 1/9] Documentation: move rebase.* configs to new file

2017-12-05 Thread Liam Beguin
Move all rebase.* configuration variables to a separate file in order to
remove duplicates, and include it in config.txt and git-rebase.txt.  The
new descriptions are mostly taken from config.txt as they are more
verbose.

Signed-off-by: Liam Beguin 
---
 Documentation/config.txt| 31 +--
 Documentation/git-rebase.txt| 19 +--
 Documentation/rebase-config.txt | 32 
 3 files changed, 34 insertions(+), 48 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 531649cb40ea..e424b7de90b5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2691,36 +2691,7 @@ push.recurseSubmodules::
is retained. You may override this configuration at time of push by
specifying '--recurse-submodules=check|on-demand|no'.
 
-rebase.stat::
-   Whether to show a diffstat of what changed upstream since the last
-   rebase. False by default.
-
-rebase.autoSquash::
-   If set to true enable `--autosquash` option by default.
-
-rebase.autoStash::
-   When set to true, automatically create a temporary stash entry
-   before the operation begins, and apply it after the operation
-   ends.  This means that you can run rebase on a dirty worktree.
-   However, use with care: the final stash application after a
-   successful rebase might result in non-trivial conflicts.
-   Defaults to false.
-
-rebase.missingCommitsCheck::
-   If set to "warn", git rebase -i will print a warning if some
-   commits are removed (e.g. a line was deleted), however the
-   rebase will still proceed. If set to "error", it will print
-   the previous warning and stop the rebase, 'git rebase
-   --edit-todo' can then be used to correct the error. If set to
-   "ignore", no checking is done.
-   To drop a commit without warning or error, use the `drop`
-   command in the todo-list.
-   Defaults to "ignore".
-
-rebase.instructionFormat::
-   A format string, as specified in linkgit:git-log[1], to be used for
-   the instruction list during an interactive rebase.  The format will 
automatically
-   have the long commit hash prepended to the format.
+include::rebase-config.txt[]
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 3cedfb0fd22b..8a861c1e0d69 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -203,24 +203,7 @@ Alternatively, you can undo the 'git rebase' with
 CONFIGURATION
 -
 
-rebase.stat::
-   Whether to show a diffstat of what changed upstream since the last
-   rebase. False by default.
-
-rebase.autoSquash::
-   If set to true enable `--autosquash` option by default.
-
-rebase.autoStash::
-   If set to true enable `--autostash` option by default.
-
-rebase.missingCommitsCheck::
-   If set to "warn", print warnings about removed commits in
-   interactive mode. If set to "error", print the warnings and
-   stop the rebase. If set to "ignore", no checking is
-   done. "ignore" by default.
-
-rebase.instructionFormat::
-   Custom commit list format to use during an `--interactive` rebase.
+include::rebase-config.txt[]
 
 OPTIONS
 ---
diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt
new file mode 100644
index ..dba088d7c68f
--- /dev/null
+++ b/Documentation/rebase-config.txt
@@ -0,0 +1,32 @@
+rebase.stat::
+   Whether to show a diffstat of what changed upstream since the last
+   rebase. False by default.
+
+rebase.autoSquash::
+   If set to true enable `--autosquash` option by default.
+
+rebase.autoStash::
+   When set to true, automatically create a temporary stash entry
+   before the operation begins, and apply it after the operation
+   ends.  This means that you can run rebase on a dirty worktree.
+   However, use with care: the final stash application after a
+   successful rebase might result in non-trivial conflicts.
+   This option can be overridden by the `--no-autostash` and
+   `--autostash` options of linkgit:git-rebase[1].
+   Defaults to false.
+
+rebase.missingCommitsCheck::
+   If set to "warn", git rebase -i will print a warning if some
+   commits are removed (e.g. a line was deleted), however the
+   rebase will still proceed. If set to "error", it will print
+   the previous warning and stop the rebase, 'git rebase
+   --edit-todo' can then be used to correct the error. If set to
+   "ignore", no checking is done.
+   To drop a commit without warning or error, use the `drop`
+   command in the todo-list.
+   Defaults to "ignore".
+
+rebase.instructionFormat::
+   A format string, as specified in linkgit:git-log[1], to be used for the
+   

[PATCH v3 3/9] rebase -i: set commit to null in exec commands

2017-12-05 Thread Liam Beguin
Make sure commit is set to NULL when parsing exec instructions
from the todo list. If not, we may try to access an uninitialized
address later while updating the todo list.

Signed-off-by: Liam Beguin 
---
 sequencer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sequencer.c b/sequencer.c
index fa94ed652d2c..5033b049d995 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1268,6 +1268,7 @@ static int parse_insn_line(struct todo_item *item, const 
char *bol, char *eol)
bol += padding;
 
if (item->command == TODO_EXEC) {
+   item->commit = NULL;
item->arg = bol;
item->arg_len = (int)(eol - bol);
return 0;
-- 
2.15.1.280.g10402c1f5b5c



Re: [PATCH] Documentation/git-clone: improve description for submodule recursing

2017-12-05 Thread Junio C Hamano
Stefan Beller  writes:

> There have been a few complaints on the mailing list that git-clone doesn't
> respect the `submodule.recurse` setting, which every other command (that
> potentially knows how to deal with submodules) respects.  In case of clone
> this is not beneficial to respect as the user may not want to obtain all
> submodules (assuming a pathspec of '.').
>
> Improve the documentation such that the pathspec is mentioned in the
> synopsis to alleviate the confusion around the submodule recursion flag
> in git-clone.
>
> While at it clarify that the option can be given multiple times for complex\
> pathspecs.

Well written (modulo the backslash there, which I can easily remove
while queuing).

>
> Signed-off-by: Stefan Beller 
> ---
>  Documentation/git-clone.txt | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
> index 83c8e9b394..42ca7b5095 100644
> --- a/Documentation/git-clone.txt
> +++ b/Documentation/git-clone.txt
> @@ -14,7 +14,7 @@ SYNOPSIS
> [-o ] [-b ] [-u ] [--reference ]
> [--dissociate] [--separate-git-dir ]
> [--depth ] [--[no-]single-branch] [--no-tags]
> -   [--recurse-submodules] [--[no-]shallow-submodules]
> +   [--recurse-submodules[=]] [--[no-]shallow-submodules]
> [--jobs ] [--]  []
>  
>  DESCRIPTION
> @@ -231,14 +231,17 @@ branch of some repository for search indexing.
>   After the clone is created, initialize and clone submodules
>   within based on the provided pathspec.  If no pathspec is
>   provided, all submodules are initialized and cloned.
> - Submodules are initialized and cloned using their default
> - settings.  The resulting clone has `submodule.active` set to
> + This option can be given multiple times for pathspecs consisting
> + of multiple entries.  The resulting clone has `submodule.active` set to
>   the provided pathspec, or "." (meaning all submodules) if no
> - pathspec is provided.  This is equivalent to running
> - `git submodule update --init --recursive` immediately after
> - the clone is finished. This option is ignored if the cloned
> - repository does not have a worktree/checkout (i.e. if any of
> - `--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
> + pathspec is provided.
> ++
> +Submodules are initialized and cloned using their default settings. This is
> +equivalent to running
> +`git submodule update --init --recursive ` immediately after
> +the clone is finished. This option is ignored if the cloned repository does
> +not have a worktree/checkout (i.e. if any of `--no-checkout`/`-n`, `--bare`,
> +or `--mirror` is given)
>  
>  --[no-]shallow-submodules::
>   All submodules which are cloned will be shallow with a depth of 1.


[PATCH v6 01/14] upload-pack: add object filtering for partial clone

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach upload-pack to negotiate object filtering over the protocol and
to send filter parameters to pack-objects.  This is intended for partial
clone and fetch.

The idea to make upload-pack configurable using uploadpack.allowFilter
comes from Jonathan Tan's work in [1].

[1] 
https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathanta...@google.com/

Signed-off-by: Jeff Hostetler 
---
 Documentation/config.txt  |  4 
 Documentation/technical/pack-protocol.txt |  8 +++
 Documentation/technical/protocol-capabilities.txt |  8 +++
 upload-pack.c | 26 ++-
 4 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1ac0ae6..e528210 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3268,6 +3268,10 @@ uploadpack.packObjectsHook::
was run. I.e., `upload-pack` will feed input intended for
`pack-objects` to the hook, and expects a completed packfile on
stdout.
+
+uploadpack.allowFilter::
+   If this option is set, `upload-pack` will advertise partial
+   clone and partial fetch object filtering.
 +
 Note that this configuration variable is ignored if it is seen in the
 repository-level config (this is a safety measure against fetching from
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index ed1eae8..a43a113 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -212,6 +212,7 @@ out of what the server said it could do with the first 
'want' line.
   upload-request=  want-list
   *shallow-line
   *1depth-request
+  [filter-request]
   flush-pkt
 
   want-list =  first-want
@@ -227,6 +228,8 @@ out of what the server said it could do with the first 
'want' line.
   additional-want   =  PKT-LINE("want" SP obj-id)
 
   depth =  1*DIGIT
+
+  filter-request=  PKT-LINE("filter" SP filter-spec)
 
 
 Clients MUST send all the obj-ids it wants from the reference
@@ -249,6 +252,11 @@ complete those commits. Commits whose parents are not 
received as a
 result are defined as shallow and marked as such in the server. This
 information is sent back to the client in the next step.
 
+The client can optionally request that pack-objects omit various
+objects from the packfile using one of several filtering techniques.
+These are intended for use with partial clone and partial fetch
+operations.  See `rev-list` for possible "filter-spec" values.
+
 Once all the 'want's and 'shallow's (and optional 'deepen') are
 transferred, clients MUST send a flush-pkt, to tell the server side
 that it is done sending the list.
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index 26dcc6f..332d209 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -309,3 +309,11 @@ to accept a signed push certificate, and asks the  
to be
 included in the push certificate.  A send-pack client MUST NOT
 send a push-cert packet unless the receive-pack server advertises
 this capability.
+
+filter
+--
+
+If the upload-pack server advertises the 'filter' capability,
+fetch-pack may send "filter" commands to request a partial clone
+or partial fetch and request that the server omit various objects
+from the packfile.
diff --git a/upload-pack.c b/upload-pack.c
index e25f725..e6d38b9 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -10,6 +10,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "run-command.h"
 #include "connect.h"
 #include "sigchain.h"
@@ -18,6 +20,7 @@
 #include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
+#include "quote.h"
 
 static const char * const upload_pack_usage[] = {
N_("git upload-pack [] "),
@@ -64,6 +67,10 @@ static int advertise_refs;
 static int stateless_rpc;
 static const char *pack_objects_hook;
 
+static int filter_capability_requested;
+static int filter_advertise;
+static struct list_objects_filter_options filter_options;
+
 static void reset_timeout(void)
 {
alarm(timeout);
@@ -131,6 +138,12 @@ static void create_pack_file(void)
argv_array_push(_objects.args, "--delta-base-offset");
if (use_include_tag)
argv_array_push(_objects.args, "--include-tag");
+   if (filter_options.filter_spec) {
+   struct strbuf buf = STRBUF_INIT;
+   sq_quote_buf(, filter_options.filter_spec);
+   argv_array_pushf(_objects.args, "--filter=%s", buf.buf);
+   

[PATCH v6 04/14] fetch-pack: test support excluding large blobs

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Created tests to verify fetch-pack and upload-pack support
for excluding large blobs using --filter=blobs:limit=
parameter.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 27 +++
 upload-pack.c | 13 +
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 80a1a32..c57916b 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -755,4 +755,31 @@ test_expect_success 'fetching deepen' '
)
 '
 
+test_expect_success 'filtering by size' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_config -C server uploadpack.allowfilter 1 &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD &&
+
+   # Ensure that object is not inadvertently fetched
+   test_must_fail git -C client cat-file -e $(git hash-object server/one.t)
+'
+
+test_expect_success 'filtering by size has no effect if support for it is not 
advertised' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD 2> err &&
+
+   # Ensure that object is fetched
+   git -C client cat-file -e $(git hash-object server/one.t) &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index e6d38b9..15b6605 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -139,10 +139,15 @@ static void create_pack_file(void)
if (use_include_tag)
argv_array_push(_objects.args, "--include-tag");
if (filter_options.filter_spec) {
-   struct strbuf buf = STRBUF_INIT;
-   sq_quote_buf(, filter_options.filter_spec);
-   argv_array_pushf(_objects.args, "--filter=%s", buf.buf);
-   strbuf_release();
+   if (pack_objects.use_shell) {
+   struct strbuf buf = STRBUF_INIT;
+   sq_quote_buf(, filter_options.filter_spec);
+   argv_array_pushf(_objects.args, "--filter=%s", 
buf.buf);
+   strbuf_release();
+   } else {
+   argv_array_pushf(_objects.args, "--filter=%s",
+filter_options.filter_spec);
+   }
}
 
pack_objects.in = -1;
-- 
2.9.3



[PATCH v6 08/14] fixup: fetch: update --blob-max-bytes to --fitler

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 65fc7bb..ec9ba9b 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,7 +782,7 @@ test_expect_success 'filtering by size has no effect if 
support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
-fetch_blob_max_bytes () {
+fetch_filter_blob_limit_zero () {
SERVER="$1"
URL="$2"
 
@@ -804,15 +804,15 @@ fetch_blob_max_bytes () {
test_must_fail git -C client cat-file -e $(git hash-object 
"$SERVER/two.t")
 }
 
-test_expect_success 'fetch with --blob-max-bytes' '
-   fetch_blob_max_bytes server server
+test_expect_success 'fetch with --filter=blob:limit=0' '
+   fetch_filter_blob_limit_zero server server
 '
 
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-test_expect_success 'fetch with --blob-max-bytes and HTTP' '
-   fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+test_expect_success 'fetch with --filter=blob:limit=0 and HTTP' '
+   fetch_filter_blob_limit_zero "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
 '
 
 stop_httpd
-- 
2.9.3



[PATCH v6 07/14] fixup: fetch: update error messages from --blob-max-bytes to --filter

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 builtin/fetch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 05d0b1a..14aab71 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1372,7 +1372,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
deepen = 1;
 
if (filter_options.choice && !repository_format_partial_clone)
-   die("--blob-max-bytes can only be used when 
extensions.partialClone is set");
+   die("--filter can only be used when extensions.partialClone is 
set");
 
if (all) {
if (argc == 1)
@@ -1406,11 +1406,11 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
if (remote) {
if (filter_options.choice &&
strcmp(remote->name, repository_format_partial_clone))
-   die(_("--blob-max-bytes can only be used with the 
remote configured in core.partialClone"));
+   die(_("--filter can only be used with the remote 
configured in core.partialClone"));
result = fetch_one(remote, argc, argv);
} else {
if (filter_options.choice)
-   die(_("--blob-max-bytes can only be used with the 
remote configured in core.partialClone"));
+   die(_("--filter can only be used with the remote 
configured in core.partialClone"));
result = fetch_multiple();
}
 
-- 
2.9.3



[PATCH v6 09/14] fixup: connected: conditionally pass --exclude-promisor-objects to rev-list

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach connected.c to only pass --exclude-promisor-objects to rev-list when
partial clone is enabled.

Signed-off-by: Jeff Hostetler 
---
 connected.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/connected.c b/connected.c
index a51c01d..3a5bd67 100644
--- a/connected.c
+++ b/connected.c
@@ -56,7 +56,8 @@ int check_connected(sha1_iterate_fn fn, void *cb_data,
argv_array_push(_list.args,"rev-list");
argv_array_push(_list.args, "--objects");
argv_array_push(_list.args, "--stdin");
-   argv_array_push(_list.args, "--exclude-promisor-objects");
+   if (repository_format_partial_clone)
+   argv_array_push(_list.args, "--exclude-promisor-objects");
argv_array_push(_list.args, "--not");
argv_array_push(_list.args, "--all");
argv_array_push(_list.args, "--quiet");
-- 
2.9.3



[PATCH v6 10/14] partial-clone: define partial clone settings in config

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Create get and set routines for "partial clone" config settings.
These will be used in a future commit by clone and fetch to
remember the promisor remote and the default filter-spec.

Signed-off-by: Jeff Hostetler 
---
 cache.h   |  1 +
 config.c  |  5 +++
 environment.c |  1 +
 list-objects-filter-options.c | 90 +++
 list-objects-filter-options.h |  6 +++
 5 files changed, 88 insertions(+), 15 deletions(-)

diff --git a/cache.h b/cache.h
index 6980072..bccc510 100644
--- a/cache.h
+++ b/cache.h
@@ -861,6 +861,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
 extern char *repository_format_partial_clone;
+extern const char *core_partial_clone_filter_default;
 
 struct repository_format {
int version;
diff --git a/config.c b/config.c
index adb7d7a..adeee04 100644
--- a/config.c
+++ b/config.c
@@ -1241,6 +1241,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.partialclonefilter")) {
+   return git_config_string(_partial_clone_filter_default,
+var, value);
+   }
+
/* Add other config variables here and to Documentation/config.txt. */
return 0;
 }
diff --git a/environment.c b/environment.c
index e52aab3..7537565 100644
--- a/environment.c
+++ b/environment.c
@@ -28,6 +28,7 @@ int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
 char *repository_format_partial_clone;
+const char *core_partial_clone_filter_default;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 4c5b34e..5c47e2b 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -21,29 +21,36 @@
  * subordinate commands when necessary.  We also "intern" the arg for
  * the convenience of the current command.
  */
-int parse_list_objects_filter(struct list_objects_filter_options 
*filter_options,
- const char *arg)
+static int gently_parse_list_objects_filter(
+   struct list_objects_filter_options *filter_options,
+   const char *arg,
+   struct strbuf *errbuf)
 {
const char *v0;
 
-   if (filter_options->choice)
-   die(_("multiple object filter types cannot be combined"));
+   if (filter_options->choice) {
+   if (errbuf) {
+   strbuf_init(errbuf, 0);
+   strbuf_addstr(
+   errbuf,
+   _("multiple filter-specs cannot be combined"));
+   }
+   return 1;
+   }
 
filter_options->filter_spec = strdup(arg);
 
if (!strcmp(arg, "blob:none")) {
filter_options->choice = LOFC_BLOB_NONE;
return 0;
-   }
 
-   if (skip_prefix(arg, "blob:limit=", )) {
-   if (!git_parse_ulong(v0, _options->blob_limit_value))
-   die(_("invalid filter-spec expression '%s'"), arg);
-   filter_options->choice = LOFC_BLOB_LIMIT;
-   return 0;
-   }
+   } else if (skip_prefix(arg, "blob:limit=", )) {
+   if (git_parse_ulong(v0, _options->blob_limit_value)) {
+   filter_options->choice = LOFC_BLOB_LIMIT;
+   return 0;
+   }
 
-   if (skip_prefix(arg, "sparse:oid=", )) {
+   } else if (skip_prefix(arg, "sparse:oid=", )) {
struct object_context oc;
struct object_id sparse_oid;
 
@@ -57,15 +64,27 @@ int parse_list_objects_filter(struct 
list_objects_filter_options *filter_options
filter_options->sparse_oid_value = oiddup(_oid);
filter_options->choice = LOFC_SPARSE_OID;
return 0;
-   }
 
-   if (skip_prefix(arg, "sparse:path=", )) {
+   } else if (skip_prefix(arg, "sparse:path=", )) {
filter_options->choice = LOFC_SPARSE_PATH;
filter_options->sparse_path_value = strdup(v0);
return 0;
}
 
-   die(_("invalid filter-spec expression '%s'"), arg);
+   if (errbuf) {
+   strbuf_init(errbuf, 0);
+   strbuf_addf(errbuf, "invalid filter-spec '%s'", arg);
+   }
+   memset(filter_options, 0, sizeof(*filter_options));
+   return 1;
+}
+
+int parse_list_objects_filter(struct list_objects_filter_options 
*filter_options,
+ const char *arg)
+{
+   struct strbuf buf = STRBUF_INIT;
+   if (gently_parse_list_objects_filter(filter_options, arg, ))
+   

[PATCH v6 05/14] fetch: refactor calculation of remote list

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Separate out the calculation of remotes to be fetched from and the
actual fetching. This will allow us to include an additional step before
the actual fetching in a subsequent commit.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 225c734..1b1f039 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 {
int i;
struct string_list list = STRING_LIST_INIT_DUP;
-   struct remote *remote;
+   struct remote *remote = NULL;
int result = 0;
struct argv_array argv_gc_auto = ARGV_ARRAY_INIT;
 
@@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, );
-   result = fetch_multiple();
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
-   result = fetch_one(remote, argc, argv);
} else if (multiple) {
/* All arguments are assumed to be remotes or groups */
for (i = 0; i < argc; i++)
if (!add_remote_or_group(argv[i], ))
die(_("No such remote or remote group: %s"), 
argv[i]);
-   result = fetch_multiple();
} else {
/* Single remote or group */
(void) add_remote_or_group(argv[0], );
@@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
/* More than one remote */
if (argc > 1)
die(_("Fetching a group and specifying refspecs 
does not make sense"));
-   result = fetch_multiple();
} else {
/* Zero or one remotes */
remote = remote_get(argv[0]);
-   result = fetch_one(remote, argc-1, argv+1);
+   argc--;
+   argv++;
}
}
 
+   if (remote)
+   result = fetch_one(remote, argc, argv);
+   else
+   result = fetch_multiple();
+
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
 
-- 
2.9.3



[PATCH v6 06/14] fetch: support filters

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach fetch to support filters. This is only allowed for the remote
configured in extensions.partialcloneremote.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c   | 23 +--
 connected.c   |  1 +
 remote-curl.c |  6 ++
 t/t5500-fetch-pack.sh | 36 
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 1b1f039..05d0b1a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -18,6 +18,7 @@
 #include "argv-array.h"
 #include "utf8.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 static const char * const builtin_fetch_usage[] = {
N_("git fetch [] [ [...]]"),
@@ -55,6 +56,7 @@ static int recurse_submodules_default = 
RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
+static struct list_objects_filter_options filter_options;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -160,6 +162,7 @@ static struct option builtin_fetch_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
 
@@ -1044,6 +1047,11 @@ static struct transport *prepare_transport(struct remote 
*remote, int deepen)
set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes");
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
+   if (filter_options.choice) {
+   set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+  filter_options.filter_spec);
+   set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
return transport;
 }
 
@@ -1328,6 +1336,8 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 
packet_trace_identity("fetch");
 
+   fetch_if_missing = 0;
+
/* Record the command line for the reflog */
strbuf_addstr(_rla, "fetch");
for (i = 1; i < argc; i++)
@@ -1361,6 +1371,9 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
if (depth || deepen_since || deepen_not.nr)
deepen = 1;
 
+   if (filter_options.choice && !repository_format_partial_clone)
+   die("--blob-max-bytes can only be used when 
extensions.partialClone is set");
+
if (all) {
if (argc == 1)
die(_("fetch --all does not take a repository 
argument"));
@@ -1390,10 +1403,16 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
}
}
 
-   if (remote)
+   if (remote) {
+   if (filter_options.choice &&
+   strcmp(remote->name, repository_format_partial_clone))
+   die(_("--blob-max-bytes can only be used with the 
remote configured in core.partialClone"));
result = fetch_one(remote, argc, argv);
-   else
+   } else {
+   if (filter_options.choice)
+   die(_("--blob-max-bytes can only be used with the 
remote configured in core.partialClone"));
result = fetch_multiple();
+   }
 
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
diff --git a/connected.c b/connected.c
index f416b05..a51c01d 100644
--- a/connected.c
+++ b/connected.c
@@ -56,6 +56,7 @@ int check_connected(sha1_iterate_fn fn, void *cb_data,
argv_array_push(_list.args,"rev-list");
argv_array_push(_list.args, "--objects");
argv_array_push(_list.args, "--stdin");
+   argv_array_push(_list.args, "--exclude-promisor-objects");
argv_array_push(_list.args, "--not");
argv_array_push(_list.args, "--all");
argv_array_push(_list.args, "--quiet");
diff --git a/remote-curl.c b/remote-curl.c
index 4318391..6ec5352 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -24,6 +24,7 @@ struct options {
char *deepen_since;
struct string_list deepen_not;
struct string_list push_options;
+   char *filter;
unsigned progress : 1,
check_self_contained_and_connected : 1,
cloning : 1,
@@ -165,6 +166,9 @@ static int set_option(const char *name, const char *value)
} else if (!strcmp(name, "no-dependents")) {
options.no_dependents = 1;
return 0;
+   } else if (!strcmp(name, "filter")) {
+   options.filter = xstrdup(value);;
+   return 0;
} else {
return 1 /* unsupported */;
}
@@ -834,6 +838,8 @@ static int fetch_git(struct discovery *heads,
argv_array_push(, "--from-promisor");
  

[PATCH v6 03/14] fetch-pack: add --no-filter

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Fixup fetch-pack to accept --no-filter to be consistent with
rev-list and pack-objects.

Signed-off-by: Jeff Hostetler 
---
 builtin/fetch-pack.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 7957807..cbf5035 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -157,6 +157,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
parse_list_objects_filter(_options, arg);
continue;
}
+   if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) {
+   list_objects_filter_release(_options);
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
-- 
2.9.3



[PATCH v6 02/14] fetch-pack, index-pack, transport: partial clone

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 builtin/fetch-pack.c |  4 
 fetch-pack.c | 13 +
 fetch-pack.h |  2 ++
 transport-helper.c   |  5 +
 transport.c  |  4 
 transport.h  |  5 +
 6 files changed, 33 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 15eeed7..7957807 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.no_dependents = 1;
continue;
}
+   if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), )) {
+   parse_list_objects_filter(_options, arg);
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/fetch-pack.c b/fetch-pack.c
index 0798e0b..3c5f064 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -29,6 +29,7 @@ static int deepen_not_ok;
 static int fetch_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
 static int agent_supported;
+static int server_supports_filtering;
 static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
 
@@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args,
if (deepen_not_ok)  strbuf_addstr(, " 
deepen-not");
if (agent_supported)strbuf_addf(, " agent=%s",

git_user_agent_sanitized());
+   if (args->filter_options.choice)
+   strbuf_addstr(, " filter");
packet_buf_write(_buf, "want %s%s\n", remote_hex, 
c.buf);
strbuf_release();
} else
@@ -407,6 +410,9 @@ static int find_common(struct fetch_pack_args *args,
packet_buf_write(_buf, "deepen-not %s", s->string);
}
}
+   if (server_supports_filtering && args->filter_options.choice)
+   packet_buf_write(_buf, "filter %s",
+args->filter_options.filter_spec);
packet_buf_flush(_buf);
state_len = req_buf.len;
 
@@ -969,6 +975,13 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
else
prefer_ofs_delta = 0;
 
+   if (server_supports("filter")) {
+   server_supports_filtering = 1;
+   print_verbose(args, _("Server supports filter"));
+   } else if (args->filter_options.choice) {
+   warning("filtering not recognized by server, ignoring");
+   }
+
if ((agent_feature = server_feature_value("agent", _len))) {
agent_supported = 1;
if (agent_len)
diff --git a/fetch-pack.h b/fetch-pack.h
index aeac152..3e224a1 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -3,6 +3,7 @@
 
 #include "string-list.h"
 #include "run-command.h"
+#include "list-objects-filter-options.h"
 
 struct oid_array;
 
@@ -12,6 +13,7 @@ struct fetch_pack_args {
int depth;
const char *deepen_since;
const struct string_list *deepen_not;
+   struct list_objects_filter_options filter_options;
unsigned deepen_relative:1;
unsigned quiet:1;
unsigned keep_pack:1;
diff --git a/transport-helper.c b/transport-helper.c
index c948d52..0650ca0 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -671,6 +671,11 @@ static int fetch(struct transport *transport,
if (data->transport_options.update_shallow)
set_helper_option(transport, "update-shallow", "true");
 
+   if (data->transport_options.filter_options.choice)
+   set_helper_option(
+   transport, "filter",
+   data->transport_options.filter_options.filter_spec);
+
if (data->fetch)
return fetch_with_fetch(transport, nr_heads, to_fetch);
 
diff --git a/transport.c b/transport.c
index f2fbc6f..cce50f5 100644
--- a/transport.c
+++ b/transport.c
@@ -166,6 +166,9 @@ static int set_git_option(struct git_transport_options 
*opts,
} else if (!strcmp(name, TRANS_OPT_NO_DEPENDENTS)) {
opts->no_dependents = !!value;
return 0;
+   } else if (!strcmp(name, TRANS_OPT_LIST_OBJECTS_FILTER)) {
+   parse_list_objects_filter(>filter_options, value);
+   return 0;
}
return 1;
 }
@@ -236,6 +239,7 @@ static int fetch_refs_via_pack(struct transport *transport,
args.update_shallow = data->options.update_shallow;
args.from_promisor = data->options.from_promisor;
args.no_dependents = data->options.no_dependents;
+   args.filter_options = data->options.filter_options;
 
if (!data->got_remote_heads) {
connect_setup(transport, 0);

[PATCH v6 13/14] fetch-pack: restore save_commit_buffer after use

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

In fetch-pack, the global variable save_commit_buffer is set to 0, but
not restored to its original value after use.

In particular, if show_log() (in log-tree.c) is invoked after
fetch_pack() in the same process, show_log() will return before printing
out the commit message (because the invocation to
get_cached_commit_buffer() returns NULL, because the commit buffer was
not saved). I discovered this when attempting to run "git log -S" in a
partial clone, triggering the case where revision walking lazily loads
missing objects.

Therefore, restore save_commit_buffer to its original value after use.

An alternative to solve the problem I had is to replace
get_cached_commit_buffer() with get_commit_buffer(). That invocation was
introduced in commit a97934d ("use get_cached_commit_buffer where
appropriate", 2014-06-13) to replace "commit->buffer" introduced in
commit 3131b71 ("Add "--show-all" revision walker flag for debugging",
2008-02-13). In the latter commit, the commit author seems to be
deciding between not showing an unparsed commit at all and showing an
unparsed commit without the message (which is what the commit does), and
did not mention parsing the unparsed commit, so I prefer to preserve the
existing behavior.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-pack.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 3c5f064..773453c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args,
 {
struct ref *ref;
int retval;
+   int old_save_commit_buffer = save_commit_buffer;
timestamp_t cutoff = 0;
 
save_commit_buffer = 0;
@@ -786,6 +787,9 @@ static int everything_local(struct fetch_pack_args *args,
print_verbose(args, _("already have %s (%s)"), 
oid_to_hex(remote),
  ref->name);
}
+
+   save_commit_buffer = old_save_commit_buffer;
+
return retval;
 }
 
-- 
2.9.3



[PATCH v6 14/14] t5616: end-to-end tests for partial clone

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Additional end-to-end tests for partial clone.

Signed-off-by: Jeff Hostetler 
---
 t/t5616-partial-clone.sh | 125 +++
 1 file changed, 125 insertions(+)
 create mode 100755 t/t5616-partial-clone.sh

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
new file mode 100755
index 000..fa573f8
--- /dev/null
+++ b/t/t5616-partial-clone.sh
@@ -0,0 +1,125 @@
+#!/bin/sh
+
+test_description='git partial clone'
+
+. ./test-lib.sh
+
+# create a normal "src" repo where we can later create new commits.
+# expect_1.oids will contain a list of the OIDs of all blobs.
+test_expect_success 'setup normal src repo' '
+   echo "{print \$1}" >print_1.awk &&
+   echo "{print \$2}" >print_2.awk &&
+
+   git init src &&
+   for n in 1 2 3 4
+   do
+   echo "This is file: $n" > src/file.$n.txt
+   git -C src add file.$n.txt
+   git -C src commit -m "file $n"
+   git -C src ls-files -s file.$n.txt >>temp
+   done &&
+   awk -f print_2.awk expect_1.oids &&
+   test "$(wc -l observed.oids &&
+   test_cmp expect_1.oids observed.oids &&
+   test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" 
&&
+   test "$(git -C pc1 config --local extensions.partialclone)" = "origin" 
&&
+   test "$(git -C pc1 config --local core.partialclonefilter)" = 
"blob:none"
+'
+
+# checkout master to force dynamic object fetch of blobs at HEAD.
+# confirm we now have the expected blobs in a new packfile.
+test_expect_success 'verify checkout with dynamic object fetch' '
+   git -C pc1 checkout master &&
+   (   cd pc1/.git/objects/pack;
+   git verify-pack -v *.pack
+   ) >temp &&
+   grep blob observed.oids &&
+   test_cmp expect_1.oids observed.oids
+'
+
+# create new commits in "src" repo and push to "srv.bare".
+# repack srv.bare just to make it easy to count the blobs.
+# expect_2.oids will contain a list of the OIDs of all blobs.
+test_expect_success 'push new commits to server' '
+   git -C src remote add srv "file://$(pwd)/srv.bare" &&
+   for x in a b c d
+   do
+   echo "Mod $x" >>src/file.1.txt
+   git -C src add file.1.txt
+   git -C src commit -m "mod $x"
+   done &&
+   git -C src push -u srv master &&
+   git -C srv.bare repack &&
+   (   cd srv.bare/objects/pack;
+   git verify-pack -v *.pack
+   ) >temp &&
+   grep blob expect_2.oids &&
+   test "$(wc -l expect.blame
+'
+
+# fetch in the partial clone repo from the server (the promisor remote).
+# verify that fetch was a "partial fetch".
+# [] that it inherited the filter settings
+# [] that is DOES NOT have the new blobs.
+test_expect_success 'partial fetch inherits filter settings' '
+   git -C pc1 fetch origin &&
+   (   cd pc1/.git/objects/pack;
+   git verify-pack -v *.pack
+   ) >temp &&
+   grep blob observed.oids &&
+   test_cmp expect_1.oids observed.oids
+'
+
+# force dynamic object fetch using diff.
+# we should only get 1 new blob (for the file in origin/master).
+# it should be in a new packfile (since the promisor boundary is
+# currently a packfile, it should not get unpacked upon receipt.)
+test_expect_success 'verify diff causes dynamic object fetch' '
+   test "$(wc -l temp &&
+   grep blob observed.oids &&
+   cat observed.oids &&
+   test "$(wc -l 

[PATCH v6 00/14] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

This is V6 of part 3 of partial clone.  It assumes that V6
of parts 1 and 2 are already present.

This version is refactors and reorders commits to better combine
new functionality and tests.  It contains 3 fixup commits that
should be squashed after some discussion.

Jeff Hostetler (9):
  upload-pack: add object filtering for partial clone
  fetch-pack, index-pack, transport: partial clone
  fetch-pack: add --no-filter
  fetch: support filters
  fixup: fetch: update error messages from --blob-max-bytes to --filter
  fixup: fetch: update --blob-max-bytes to --fitler
  fixup: connected: conditionally pass --exclude-promisor-objects to
rev-list
  partial-clone: define partial clone settings in config
  t5616: end-to-end tests for partial clone

Jonathan Tan (5):
  fetch-pack: test support excluding large blobs
  fetch: refactor calculation of remote list
  clone: partial clone
  unpack-trees: batch fetching of missing blobs
  fetch-pack: restore save_commit_buffer after use

 Documentation/config.txt  |   4 +
 Documentation/technical/pack-protocol.txt |   8 ++
 Documentation/technical/protocol-capabilities.txt |   8 ++
 builtin/clone.c   |  22 +++-
 builtin/fetch-pack.c  |   8 ++
 builtin/fetch.c   |  33 --
 cache.h   |   1 +
 config.c  |   5 +
 connected.c   |   2 +
 environment.c |   1 +
 fetch-object.c|  26 -
 fetch-object.h|   5 +
 fetch-pack.c  |  17 +++
 fetch-pack.h  |   2 +
 list-objects-filter-options.c |  90 +---
 list-objects-filter-options.h |   6 ++
 remote-curl.c |   6 ++
 t/t5500-fetch-pack.sh |  63 +++
 t/t5601-clone.sh  | 101 +
 t/t5616-partial-clone.sh  | 125 ++
 transport-helper.c|   5 +
 transport.c   |   4 +
 transport.h   |   5 +
 unpack-trees.c|  22 
 upload-pack.c |  31 +-
 25 files changed, 572 insertions(+), 28 deletions(-)
 create mode 100755 t/t5616-partial-clone.sh

-- 
2.9.3



[PATCH v6 12/14] unpack-trees: batch fetching of missing blobs

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

When running checkout, first prefetch all blobs that are to be updated
but are missing. This means that only one pack is downloaded during such
operations, instead of one per missing blob.

This operates only on the blob level - if a repository has a missing
tree, they are still fetched one at a time.

This does not use the delayed checkout mechanism introduced in commit
2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) due to significant conceptual differences - in particular,
for partial clones, we already know what needs to be fetched based on
the contents of the local repo alone, whereas for status=delayed, it is
the filter process that tells us what needs to be checked in the end.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-object.c   | 26 ++
 fetch-object.h   |  5 +
 t/t5601-clone.sh | 52 
 unpack-trees.c   | 22 ++
 4 files changed, 101 insertions(+), 4 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 258fcfa..853624f 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -5,11 +5,10 @@
 #include "transport.h"
 #include "fetch-object.h"
 
-void fetch_object(const char *remote_name, const unsigned char *sha1)
+static void fetch_refs(const char *remote_name, struct ref *ref)
 {
struct remote *remote;
struct transport *transport;
-   struct ref *ref;
int original_fetch_if_missing = fetch_if_missing;
 
fetch_if_missing = 0;
@@ -18,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned 
char *sha1)
die(_("Remote with no URL"));
transport = transport_get(remote, remote->url[0]);
 
-   ref = alloc_ref(sha1_to_hex(sha1));
-   hashcpy(ref->old_oid.hash, sha1);
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
transport_fetch_refs(transport, ref);
fetch_if_missing = original_fetch_if_missing;
 }
+
+void fetch_object(const char *remote_name, const unsigned char *sha1)
+{
+   struct ref *ref = alloc_ref(sha1_to_hex(sha1));
+   hashcpy(ref->old_oid.hash, sha1);
+   fetch_refs(remote_name, ref);
+}
+
+void fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
+{
+   struct ref *ref = NULL;
+   int i;
+
+   for (i = 0; i < to_fetch->nr; i++) {
+   struct ref *new_ref = alloc_ref(oid_to_hex(_fetch->oid[i]));
+   oidcpy(_ref->old_oid, _fetch->oid[i]);
+   new_ref->next = ref;
+   ref = new_ref;
+   }
+   fetch_refs(remote_name, ref);
+}
diff --git a/fetch-object.h b/fetch-object.h
index f371300..4b269d0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -1,6 +1,11 @@
 #ifndef FETCH_OBJECT_H
 #define FETCH_OBJECT_H
 
+#include "sha1-array.h"
+
 extern void fetch_object(const char *remote_name, const unsigned char *sha1);
 
+extern void fetch_objects(const char *remote_name,
+ const struct oid_array *to_fetch);
+
 #endif
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 6d37c6d..13610b7 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does 
not support object filte
test_i18ngrep "filtering not recognized by server" err
 '
 
+test_expect_success 'batch missing blob request during checkout' '
+   rm -rf server client &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+
+   git -C server commit -m x &&
+   echo aa >server/a &&
+   echo bb >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   test_config -C server uploadpack.allowfilter 1 &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+   # Ensure that there is only one negotiation by checking that there is
+   # only "done" line sent. ("done" marks the end of negotiation.)
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
+   grep "git> done" trace >done_lines &&
+   test_line_count = 1 done_lines
+'
+
+test_expect_success 'batch missing blob request does not inadvertently try to 
fetch gitlinks' '
+   rm -rf server client &&
+
+   test_create_repo repo_for_submodule &&
+   test_commit -C repo_for_submodule x &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   echo aa >server/a &&
+   echo bb >server/b &&
+   # Also add a gitlink pointing to an arbitrary repository
+   git -C server submodule add 

[PATCH v6 11/14] clone: partial clone

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  | 22 --
 t/t5601-clone.sh | 49 +
 2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index dbddd98..f519bd4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -26,6 +26,7 @@
 #include "run-command.h"
 #include "connected.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 /*
  * Overall FIXMEs:
@@ -60,6 +61,7 @@ static struct string_list option_optional_reference = 
STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
+static struct list_objects_filter_options filter_options;
 
 static int recurse_submodules_cb(const struct option *opt,
 const char *arg, int unset)
@@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
 
@@ -886,6 +889,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1073,6 +1078,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
warning(_("--shallow-since is ignored in local clones; 
use file:// instead."));
if (option_not.nr)
warning(_("--shallow-exclude is ignored in local 
clones; use file:// instead."));
+   if (filter_options.choice)
+   warning(_("--filter is ignored in local clones; use 
file:// instead."));
if (!access(mkpath("%s/shallow", path), F_OK)) {
if (option_local > 0)
warning(_("source repository is shallow, 
ignoring --local"));
@@ -1104,7 +,13 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
-   if (transport->smart_options && !deepen)
+   if (filter_options.choice) {
+   transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+filter_options.filter_spec);
+   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
+
+   if (transport->smart_options && !deepen && !filter_options.choice)
transport->smart_options->check_self_contained_and_connected = 
1;
 
refs = transport_get_remote_refs(transport);
@@ -1164,13 +1177,17 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
write_refspec_config(src_ref_prefix, our_head_points_at,
remote_head_points_at, _top);
 
+   if (filter_options.choice)
+   partial_clone_register("origin", _options);
+
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-  branch_top.buf, reflog_msg.buf, transport, 
!is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local && !filter_options.choice);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
@@ -1191,6 +1208,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
junk_mode = JUNK_LEAVE_REPO;
+   fetch_if_missing = 1;
err = checkout(submodule_progress);
 
strbuf_release(_msg);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f77..6d37c6d 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable 
pack' '
git -C replay.git index-pack -v --stdin  err &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'partial clone using HTTP' '
+   partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
 test_done
-- 
2.9.3



[PATCH v6 10/12] fixup: sha1_file: add TODO

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 sha1_file.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index ce67f27..dd956e2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1196,6 +1196,10 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
/* Check if it is a missing object */
if (fetch_if_missing && repository_format_partial_clone &&
!already_retried) {
+   /*
+* TODO Investigate haveing fetch_object() return
+* TODO error/success and stopping the music here.
+*/
fetch_object(repository_format_partial_clone, real);
already_retried = 1;
continue;
-- 
2.9.3



[PATCH v6 06/12] index-pack: refactor writing of .keep files

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

In a subsequent commit, index-pack will be taught to write ".promisor"
files which are similar to the ".keep" files it knows how to write.
Refactor the writing of ".keep" files, so that the implementation of
writing ".promisor" files becomes easier.

Signed-off-by: Jonathan Tan 
---
 builtin/index-pack.c | 99 
 1 file changed, 53 insertions(+), 46 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8ec459f..4f305a7 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1389,15 +1389,58 @@ static void fix_unresolved_deltas(struct sha1file *f)
free(sorted_by_pos);
 }
 
+static const char *derive_filename(const char *pack_name, const char *suffix,
+  struct strbuf *buf)
+{
+   size_t len;
+   if (!strip_suffix(pack_name, ".pack", ))
+   die(_("packfile name '%s' does not end with '.pack'"),
+   pack_name);
+   strbuf_add(buf, pack_name, len);
+   strbuf_addch(buf, '.');
+   strbuf_addstr(buf, suffix);
+   return buf->buf;
+}
+
+static void write_special_file(const char *suffix, const char *msg,
+  const char *pack_name, const unsigned char *sha1,
+  const char **report)
+{
+   struct strbuf name_buf = STRBUF_INIT;
+   const char *filename;
+   int fd;
+   int msg_len = strlen(msg);
+
+   if (pack_name)
+   filename = derive_filename(pack_name, suffix, _buf);
+   else
+   filename = odb_pack_name(_buf, sha1, suffix);
+
+   fd = odb_pack_keep(filename);
+   if (fd < 0) {
+   if (errno != EEXIST)
+   die_errno(_("cannot write %s file '%s'"),
+ suffix, filename);
+   } else {
+   if (msg_len > 0) {
+   write_or_die(fd, msg, msg_len);
+   write_or_die(fd, "\n", 1);
+   }
+   if (close(fd) != 0)
+   die_errno(_("cannot close written %s file '%s'"),
+ suffix, filename);
+   *report = suffix;
+   }
+   strbuf_release(_buf);
+}
+
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_name, const char *keep_msg,
- unsigned char *sha1)
+ const char *keep_msg, unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
struct strbuf index_name = STRBUF_INIT;
-   struct strbuf keep_name_buf = STRBUF_INIT;
int err;
 
if (!from_stdin) {
@@ -1409,28 +1452,9 @@ static void final(const char *final_pack_name, const 
char *curr_pack_name,
die_errno(_("error while closing pack file"));
}
 
-   if (keep_msg) {
-   int keep_fd, keep_msg_len = strlen(keep_msg);
-
-   if (!keep_name)
-   keep_name = odb_pack_name(_name_buf, sha1, "keep");
-
-   keep_fd = odb_pack_keep(keep_name);
-   if (keep_fd < 0) {
-   if (errno != EEXIST)
-   die_errno(_("cannot write keep file '%s'"),
- keep_name);
-   } else {
-   if (keep_msg_len > 0) {
-   write_or_die(keep_fd, keep_msg, keep_msg_len);
-   write_or_die(keep_fd, "\n", 1);
-   }
-   if (close(keep_fd) != 0)
-   die_errno(_("cannot close written keep file 
'%s'"),
- keep_name);
-   report = "keep";
-   }
-   }
+   if (keep_msg)
+   write_special_file("keep", keep_msg, final_pack_name, sha1,
+  );
 
if (final_pack_name != curr_pack_name) {
if (!final_pack_name)
@@ -1472,7 +1496,6 @@ static void final(const char *final_pack_name, const char 
*curr_pack_name,
 
strbuf_release(_name);
strbuf_release(_name);
-   strbuf_release(_name_buf);
 }
 
 static int git_index_pack_config(const char *k, const char *v, void *cb)
@@ -1615,26 +1638,13 @@ static void show_pack_info(int stat_only)
}
 }
 
-static const char *derive_filename(const char *pack_name, const char *suffix,
-  struct strbuf *buf)
-{
-   size_t len;
-   if (!strip_suffix(pack_name, ".pack", ))
-   die(_("packfile name '%s' does not end with '.pack'"),
-   pack_name);
-   strbuf_add(buf, pack_name, len);
-   strbuf_addstr(buf, suffix);
-   

[PATCH v6 07/12] introduce fetch-object: fetch one promisor object

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.

This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-dependents" to indicate that only the objects
themselves need to be fetched (but fetching additional objects is
nevertheless safe).

Whenever "no-dependents" is used, fetch-pack will refrain from using any
object flags, because it is most likely invoked as part of a dynamic
object fetch by another Git command (which may itself use object flags).
An alternative to this is to leave fetch-pack alone, and instead update
the allocation of flags so that fetch-pack's flags never overlap with
any others, but this will end up shrinking the number of flags available
to nearly every other Git command (that is, every Git command that
accesses objects), so the approach in this commit was used instead.

This will be tested in a subsequent commit.

Signed-off-by: Jonathan Tan 
---
 Documentation/gitremote-helpers.txt |  7 ++
 Makefile|  1 +
 builtin/fetch-pack.c|  8 +++
 builtin/index-pack.c| 16 ++---
 fetch-object.c  | 24 +++
 fetch-object.h  |  6 +
 fetch-pack.c| 48 +
 fetch-pack.h|  8 +++
 remote-curl.c   | 14 ++-
 transport.c |  8 +++
 transport.h | 11 +
 11 files changed, 126 insertions(+), 25 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h

diff --git a/Documentation/gitremote-helpers.txt 
b/Documentation/gitremote-helpers.txt
index 4a584f3..4b8c93e 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -466,6 +466,13 @@ set by Git if the remote helper has the 'option' 
capability.
Transmit  as a push option. As the push option
must not contain LF or NUL characters, the string is not encoded.
 
+'option from-promisor' {'true'|'false'}::
+   Indicate that these objects are being fetched from a promisor.
+
+'option no-dependents' {'true'|'false'}::
+   Indicate that only the objects wanted need to be fetched, not
+   their dependents.
+
 SEE ALSO
 
 linkgit:git-remote[1]
diff --git a/Makefile b/Makefile
index ca378a4..795e0c7 100644
--- a/Makefile
+++ b/Makefile
@@ -792,6 +792,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d1..02abe72 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -143,6 +143,14 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.update_shallow = 1;
continue;
}
+   if (!strcmp("--from-promisor", arg)) {
+   args.from_promisor = 1;
+   continue;
+   }
+   if (!strcmp("--no-dependents", arg)) {
+   args.no_dependents = 1;
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4f305a7..24c2f05 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1429,14 +1429,16 @@ static void write_special_file(const char *suffix, 
const char *msg,
if (close(fd) != 0)
die_errno(_("cannot close written %s file '%s'"),
  suffix, filename);
-   *report = suffix;
+   if (report)
+   *report = suffix;
}
strbuf_release(_buf);
 }
 
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_msg, unsigned char *sha1)
+ const char *keep_msg, const char *promisor_msg,
+ unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
@@ -1455,6 +1457,9 @@ static void final(const char *final_pack_name, const char 
*curr_pack_name,
if (keep_msg)
write_special_file("keep", keep_msg, final_pack_name, sha1,
   );
+   if (promisor_msg)
+   write_special_file("promisor", promisor_msg, final_pack_name,
+  sha1, NULL);
 

[PATCH v6 12/12] gc: do not repack promisor packfiles

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach gc to stop traversal at promisor objects, and to leave promisor
packfiles alone. This has the effect of only repacking non-promisor
packfiles, and preserves the distinction between promisor packfiles and
non-promisor packfiles.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt | 11 
 builtin/gc.c   |  3 +++
 builtin/pack-objects.c | 37 --
 builtin/prune.c|  7 +
 builtin/repack.c   |  8 --
 t/t0410-partial-clone.sh   | 54 --
 6 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index aa403d0..81bc490 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -255,6 +255,17 @@ a missing object is encountered.  This is the default 
action.
 The form '--missing=allow-any' will allow object traversal to continue
 if a missing object is encountered.  Missing objects will silently be
 omitted from the results.
++
+The form '--missing=allow-promisor' is like 'allow-any', but will only
+allow object traversal to continue for EXPECTED promisor missing objects.
+Unexpected missing object will raise an error.
+
+--exclude-promisor-objects::
+   Omit objects that are known to be in the promisor remote.  (This
+   option has the purpose of operating only on locally created objects,
+   so that when we repack, we still maintain a distinction between
+   locally created objects [without .promisor] and objects from the
+   promisor remote [with .promisor].)  This is used with partial clone.
 
 SEE ALSO
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 3c5eae0..77fa720 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -458,6 +458,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
argv_array_push(, prune_expire);
if (quiet)
argv_array_push(, "--no-progress");
+   if (repository_format_partial_clone)
+   argv_array_push(,
+   "--exclude-promisor-objects");
if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
return error(FAILED_RUN, prune.argv[0]);
}
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 45ad35d..f5fc401 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -75,6 +75,8 @@ static int use_bitmap_index = -1;
 static int write_bitmap_index;
 static uint16_t write_bitmap_options;
 
+static int exclude_promisor_objects;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
 static unsigned long cache_max_small_delta_size = 1000;
@@ -84,8 +86,9 @@ static unsigned long window_memory_limit = 0;
 static struct list_objects_filter_options filter_options;
 
 enum missing_action {
-   MA_ERROR = 0,/* fail if any missing objects are encountered */
-   MA_ALLOW_ANY,/* silently allow ALL missing objects */
+   MA_ERROR = 0,  /* fail if any missing objects are encountered */
+   MA_ALLOW_ANY,  /* silently allow ALL missing objects */
+   MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
 };
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
@@ -2577,6 +2580,20 @@ static void show_object__ma_allow_any(struct object 
*obj, const char *name, void
show_object(obj, name, data);
 }
 
+static void show_object__ma_allow_promisor(struct object *obj, const char 
*name, void *data)
+{
+   assert(arg_missing_action == MA_ALLOW_PROMISOR);
+
+   /*
+* Quietly ignore EXPECTED missing objects.  This avoids problems with
+* staging them now and getting an odd error later.
+*/
+   if (!has_object_file(>oid) && is_promisor_object(>oid))
+   return;
+
+   show_object(obj, name, data);
+}
+
 static int option_parse_missing_action(const struct option *opt,
   const char *arg, int unset)
 {
@@ -2591,10 +2608,18 @@ static int option_parse_missing_action(const struct 
option *opt,
 
if (!strcmp(arg, "allow-any")) {
arg_missing_action = MA_ALLOW_ANY;
+   fetch_if_missing = 0;
fn_show_object = show_object__ma_allow_any;
return 0;
}
 
+   if (!strcmp(arg, "allow-promisor")) {
+   arg_missing_action = MA_ALLOW_PROMISOR;
+   fetch_if_missing = 0;
+   fn_show_object = show_object__ma_allow_promisor;
+   return 0;
+   }
+
die(_("invalid value for --missing"));
   

[PATCH v6 11/12] rev-list: support termination at promisor objects

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach rev-list to support termination of an object traversal at any
object from a promisor remote (whether one that the local repo also has,
or one that the local repo knows about because it has another promisor
object that references it).

This will be used subsequently in gc and in the connectivity check used
by fetch.

For efficiency, if an object is referenced by a promisor object, and is
in the local repo only as a non-promisor object, object traversal will
not stop there. This is to avoid building the list of promisor object
references.

(In list-objects.c, the case where obj is NULL in process_blob() and
process_tree() do not need to be changed because those happen only when
there is a conflict between the expected type and the existing object.
If the object doesn't exist, an object will be synthesized, which is
fine.)

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/rev-list-options.txt |  11 
 builtin/rev-list.c |  71 +++---
 list-objects.c |  29 ++-
 object.c   |   2 +-
 revision.c |  33 +++-
 revision.h |   5 +-
 t/t0410-partial-clone.sh   | 101 +
 7 files changed, 240 insertions(+), 12 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 8d8b7f4..0ce8ccd 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -745,10 +745,21 @@ The form '--missing=allow-any' will allow object 
traversal to continue
 if a missing object is encountered.  Missing objects will silently be
 omitted from the results.
 +
+The form '--missing=allow-promisor' is like 'allow-any', but will only
+allow object traversal to continue for EXPECTED promisor missing objects.
+Unexpected missing objects will raise an error.
++
 The form '--missing=print' is like 'allow-any', but will also print a
 list of the missing objects.  Object IDs are prefixed with a ``?'' character.
 endif::git-rev-list[]
 
+--exclude-promisor-objects::
+   (For internal use only.)  Prefilter object traversal at
+   promisor boundary.  This is used with partial clone.  This is
+   stronger than `--missing=allow-promisor` because it limits the
+   traversal, rather than just silencing errors about missing
+   objects.
+
 --no-walk[=(sorted|unsorted)]::
Only show the given commits, but do not traverse their ancestors.
This has no effect if a range is specified. If the argument
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 547a3e0..48f922d 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -15,6 +15,7 @@
 #include "progress.h"
 #include "reflog-walk.h"
 #include "oidset.h"
+#include "packfile.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] ... [ -- paths... ]\n"
@@ -67,6 +68,7 @@ enum missing_action {
MA_ERROR = 0,/* fail if any missing objects are encountered */
MA_ALLOW_ANY,/* silently allow ALL missing objects */
MA_PRINT,/* print ALL missing objects in special section */
+   MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
 };
 static enum missing_action arg_missing_action;
 
@@ -197,6 +199,12 @@ static void finish_commit(struct commit *commit, void 
*data)
 
 static inline void finish_object__ma(struct object *obj)
 {
+   /*
+* Whether or not we try to dynamically fetch missing objects
+* from the server, we currently DO NOT have the object.  We
+* can either print, allow (ignore), or conditionally allow
+* (ignore) them.
+*/
switch (arg_missing_action) {
case MA_ERROR:
die("missing blob object '%s'", oid_to_hex(>oid));
@@ -209,25 +217,36 @@ static inline void finish_object__ma(struct object *obj)
oidset_insert(_objects, >oid);
return;
 
+   case MA_ALLOW_PROMISOR:
+   if (is_promisor_object(>oid))
+   return;
+   die("unexpected missing blob object '%s'",
+   oid_to_hex(>oid));
+   return;
+
default:
BUG("unhandled missing_action");
return;
}
 }
 
-static void finish_object(struct object *obj, const char *name, void *cb_data)
+static int finish_object(struct object *obj, const char *name, void *cb_data)
 {
struct rev_list_info *info = cb_data;
-   if (obj->type == OBJ_BLOB && !has_object_file(>oid))
+   if (obj->type == OBJ_BLOB && !has_object_file(>oid)) {
finish_object__ma(obj);
+   return 1;
+   }
if (info->revs->verify_objects && !obj->parsed && obj->type != 
OBJ_COMMIT)
parse_object(>oid);
+   return 0;
 

[PATCH v6 08/12] sha1_file: support lazily fetching missing objects

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach sha1_file to fetch objects from the remote configured in
extensions.partialclone whenever an object is requested but missing.

The fetching of objects can be suppressed through a global variable.
This is used by fsck and index-pack.

However, by default, such fetching is not suppressed. This is meant as a
temporary measure to ensure that all Git commands work in such a
situation. Future patches will update some commands to either tolerate
missing objects (without fetching them) or be more efficient in fetching
them.

In order to determine the code changes in sha1_file.c necessary, I
investigated the following:
 (1) functions in sha1_file.c that take in a hash, without the user
 regarding how the object is stored (loose or packed)
 (2) functions in packfile.c (because I need to check callers that know
 about the loose/packed distinction and operate on both differently,
 and ensure that they can handle the concept of objects that are
 neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended().

For (2), I looked at for_each_packed_object and others.  For
for_each_packed_object, the callers either already work or are fixed in
this patch:
 - reachable - only to find recent objects
 - builtin/fsck - already knows about missing objects
 - builtin/cat-file - warning message added in this commit

Callers of the other functions do not need to be changed:
 - parse_pack_index
   - http - indirectly from http_get_info_packs
   - find_pack_entry_one
 - this searches a single pack that is provided as an argument; the
   caller already knows (through other means) that the sought object
   is in a specific pack
 - find_sha1_pack
   - fast-import - appears to be an optimization to not store a file if
 it is already in a pack
   - http-walker - to search through a struct alt_base
   - http-push - to search through remote packs
 - has_sha1_pack
   - builtin/fsck - already knows about promisor objects
   - builtin/count-objects - informational purposes only (check if loose
 object is also packed)
   - builtin/prune-packed - check if object to be pruned is packed (if
 not, don't prune it)
   - revision - used to exclude packed objects if requested by user
   - diff - just for optimization

Signed-off-by: Jonathan Tan 
---
 builtin/cat-file.c   |  2 ++
 builtin/fetch-pack.c |  2 ++
 builtin/fsck.c   |  3 +++
 builtin/index-pack.c |  6 ++
 cache.h  |  8 
 fetch-object.c   |  3 +++
 sha1_file.c  | 38 
 t/t0410-partial-clone.sh | 51 
 8 files changed, 100 insertions(+), 13 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index f5fa4fd..cf9ea5c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -475,6 +475,8 @@ static int batch_objects(struct batch_options *opt)
 
for_each_loose_object(batch_loose_object, , 0);
for_each_packed_object(batch_packed_object, , 0);
+   if (repository_format_partial_clone)
+   warning("This repository has extensions.partialClone 
set. Some objects may not be loaded.");
 
cb.opt = opt;
cb.expand = 
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 02abe72..15eeed7 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -53,6 +53,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("fetch-pack");
 
memset(, 0, sizeof(args));
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 578a7c8..3b76c0e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -678,6 +678,9 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
int i;
struct alternate_object_database *alt;
 
+   /* fsck knows how to handle missing promisor objects */
+   fetch_if_missing = 0;
+
errors_found = 0;
check_replace_refs = 0;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 24c2f05..a0a35e6 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1657,6 +1657,12 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
unsigned foreign_nr = 1;/* zero is a "good" value, assume bad */
int report_end_of_input = 0;
 
+   /*
+* index-pack never needs to fetch missing objects, since it only
+* accesses the repo to do hash collision checks
+*/
+   fetch_if_missing = 0;
+
if (argc == 2 && !strcmp(argv[1], "-h"))
usage(index_pack_usage);
 
diff --git a/cache.h b/cache.h
index c76f2e9..6980072 100644
--- a/cache.h
+++ b/cache.h
@@ -1727,6 +1727,14 

[PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 sha1_file.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index fc7718a..ce67f27 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1180,30 +1180,30 @@ int sha1_object_info_extended(const unsigned char 
*sha1, struct object_info *oi,
}
}
 
-retry:
-   if (find_pack_entry(real, ))
-   goto found_packed;
+   while (1) {
+   if (find_pack_entry(real, ))
+   break;
 
-   /* Most likely it's a loose object. */
-   if (!sha1_loose_object_info(real, oi, flags))
-   return 0;
+   /* Most likely it's a loose object. */
+   if (!sha1_loose_object_info(real, oi, flags))
+   return 0;
 
-   /* Not a loose object; someone else may have just packed it. */
-   reprepare_packed_git();
-   if (find_pack_entry(real, ))
-   goto found_packed;
-
-   /* Check if it is a missing object */
-   if (fetch_if_missing && repository_format_partial_clone &&
-   !already_retried) {
-   fetch_object(repository_format_partial_clone, real);
-   already_retried = 1;
-   goto retry;
-   }
+   /* Not a loose object; someone else may have just packed it. */
+   reprepare_packed_git();
+   if (find_pack_entry(real, ))
+   break;
 
-   return -1;
+   /* Check if it is a missing object */
+   if (fetch_if_missing && repository_format_partial_clone &&
+   !already_retried) {
+   fetch_object(repository_format_partial_clone, real);
+   already_retried = 1;
+   continue;
+   }
+
+   return -1;
+   }
 
-found_packed:
if (oi == _oi)
/*
 * We know that the caller doesn't actually need the
-- 
2.9.3



[PATCH v6 03/12] fsck: support refs pointing to promisor objects

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat refs referring to missing promisor objects as an
error when extensions.partialclone is set.

For the purposes of warning about no default refs, such refs are still
treated as legitimate refs.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  8 
 t/t0410-partial-clone.sh | 24 
 2 files changed, 32 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2934299..ee937bb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -434,6 +434,14 @@ static int fsck_handle_ref(const char *refname, const 
struct object_id *oid,
 
obj = parse_object(oid);
if (!obj) {
+   if (is_promisor_object(oid)) {
+   /*
+* Increment default_refs anyway, because this is a
+* valid ref.
+*/
+default_refs++;
+return 0;
+   }
error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
/* We'll continue with the rest despite the error.. */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 3ddb3b9..bf75162 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -13,6 +13,14 @@ pack_as_from_promisor () {
>repo/.git/objects/pack/pack-$HASH.promisor
 }
 
+promise_and_delete () {
+   HASH=$(git -C repo rev-parse "$1") &&
+   git -C repo tag -a -m message my_annotated_tag "$HASH" &&
+   git -C repo rev-parse my_annotated_tag | pack_as_from_promisor &&
+   git -C repo tag -d my_annotated_tag &&
+   delete_object repo "$HASH"
+}
+
 test_expect_success 'missing reflog object, but promised by a commit, passes 
fsck' '
test_create_repo repo &&
test_commit -C repo my_commit &&
@@ -78,4 +86,20 @@ test_expect_success 'missing reflog object alone fails fsck, 
even with extension
test_must_fail git -C repo fsck
 '
 
+test_expect_success 'missing ref object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+
+   # Reference $A only from ref
+   git -C repo branch my_branch "$A" &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH v6 05/12] fsck: support promisor objects as CLI argument

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects provided on the CLI as
an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  2 ++
 t/t0410-partial-clone.sh | 13 +
 2 files changed, 15 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 4c2a56d..578a7c8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -750,6 +750,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
struct object *obj = lookup_object(oid.hash);
 
if (!obj || !(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object())
+   continue;
error("%s: object missing", oid_to_hex());
errors_found |= ERROR_OBJECT;
continue;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 4f9931f..e96f436 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -125,4 +125,17 @@ test_expect_success 'missing object, but promised, passes 
fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing CLI object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck "$A"
+'
+
 test_done
-- 
2.9.3



[PATCH v6 04/12] fsck: support referenced promisor objects

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects indirectly pointed to
by refs as an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   | 11 +++
 t/t0410-partial-clone.sh | 23 +++
 2 files changed, 34 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index ee937bb..4c2a56d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -149,6 +149,15 @@ static int mark_object(struct object *obj, int type, void 
*data, struct fsck_opt
if (obj->flags & REACHABLE)
return 0;
obj->flags |= REACHABLE;
+
+   if (is_promisor_object(>oid))
+   /*
+* Further recursion does not need to be performed on this
+* object since it is a promisor object (so it does not need to
+* be added to "pending").
+*/
+   return 0;
+
if (!(obj->flags & HAS_OBJ)) {
if (parent && !has_object_file(>oid)) {
printf("broken link from %7s %s\n",
@@ -208,6 +217,8 @@ static void check_reachable_object(struct object *obj)
 * do a full fsck
 */
if (!(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object(>oid))
+   return;
if (has_sha1_pack(obj->oid.hash))
return; /* it is in pack - forget about it */
printf("missing %s %s\n", printable_type(obj),
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bf75162..4f9931f 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -102,4 +102,27 @@ test_expect_success 'missing ref object, but promised, 
passes fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo 1 &&
+   test_commit -C repo 2 &&
+   test_commit -C repo 3 &&
+   git -C repo tag -a annotated_tag -m "annotated tag" &&
+
+   C=$(git -C repo rev-parse 1) &&
+   T=$(git -C repo rev-parse 2^{tree}) &&
+   B=$(git hash-object repo/3.t) &&
+   AT=$(git -C repo rev-parse annotated_tag) &&
+
+   promise_and_delete "$C" &&
+   promise_and_delete "$T" &&
+   promise_and_delete "$B" &&
+   promise_and_delete "$AT" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH v6 00/12] Partial clone part 2: fsck and promisors

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

This is V6 of part 2 of partial clone.  This assumes V6 of part 1
is already present.  This version fixes a problem in fetch-pack
observed in V5.  It also contains 2 "fixup" commits that are
WIP responses to comments on V5.

Part 2 is concerned with fsck, gc, initial support for dynamic
object fetching, and tracking promisor objects.  Jonathan Tan
originally developed this code.  I have moved it on top of
part 1 and updated it slightly.


Jeff Hostetler (2):
  fixup: sha1_file: convert gotos to break/continue
  fixup: sha1_file: add TODO

Jonathan Tan (10):
  extension.partialclone: introduce partial clone extension
  fsck: introduce partialclone extension
  fsck: support refs pointing to promisor objects
  fsck: support referenced promisor objects
  fsck: support promisor objects as CLI argument
  index-pack: refactor writing of .keep files
  introduce fetch-object: fetch one promisor object
  sha1_file: support lazily fetching missing objects
  rev-list: support termination at promisor objects
  gc: do not repack promisor packfiles

 Documentation/git-pack-objects.txt |  11 +
 Documentation/gitremote-helpers.txt|   7 +
 Documentation/rev-list-options.txt |  11 +
 Documentation/technical/repository-version.txt |  12 +
 Makefile   |   1 +
 builtin/cat-file.c |   2 +
 builtin/fetch-pack.c   |  10 +
 builtin/fsck.c |  26 +-
 builtin/gc.c   |   3 +
 builtin/index-pack.c   | 113 
 builtin/pack-objects.c |  37 ++-
 builtin/prune.c|   7 +
 builtin/repack.c   |   8 +-
 builtin/rev-list.c |  71 -
 cache.h|  13 +-
 environment.c  |   1 +
 fetch-object.c |  27 ++
 fetch-object.h |   6 +
 fetch-pack.c   |  48 ++--
 fetch-pack.h   |   8 +
 list-objects.c |  29 ++-
 object.c   |   2 +-
 packfile.c |  77 +-
 packfile.h |  13 +
 remote-curl.c  |  14 +-
 revision.c |  33 ++-
 revision.h |   5 +-
 setup.c|   7 +-
 sha1_file.c|  32 ++-
 t/t0410-partial-clone.sh   | 343 +
 transport.c|   8 +
 transport.h|  11 +
 32 files changed, 899 insertions(+), 97 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h
 create mode 100755 t/t0410-partial-clone.sh

-- 
2.9.3



[PATCH v6 02/12] fsck: introduce partialclone extension

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Currently, Git does not support repos with very large numbers of objects
or repos that wish to minimize manipulation of certain blobs (for
example, because they are very large) very well, even if the user
operates mostly on part of the repo, because Git is designed on the
assumption that every referenced object is available somewhere in the
repo storage. In such an arrangement, the full set of objects is usually
available in remote storage, ready to be lazily downloaded.

Teach fsck about the new state of affairs. In this commit, teach fsck
that missing promisor objects referenced from the reflog are not an
error case; in future commits, fsck will be taught about other cases.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  2 +-
 cache.h  |  3 +-
 packfile.c   | 77 +++--
 packfile.h   | 13 
 t/t0410-partial-clone.sh | 81 
 5 files changed, 171 insertions(+), 5 deletions(-)
 create mode 100755 t/t0410-partial-clone.sh

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 56afe40..2934299 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -398,7 +398,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->flags |= USED;
mark_object_reachable(obj);
-   } else {
+   } else if (!is_promisor_object(oid)) {
error("%s: invalid reflog entry %s", refname, 
oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
}
diff --git a/cache.h b/cache.h
index 35e3f5e..c76f2e9 100644
--- a/cache.h
+++ b/cache.h
@@ -1587,7 +1587,8 @@ extern struct packed_git {
unsigned pack_local:1,
 pack_keep:1,
 freshened:1,
-do_not_close:1;
+do_not_close:1,
+pack_promisor:1;
unsigned char sha1[20];
struct revindex_entry *revindex;
/* something like ".git/objects/pack/x.pack" */
diff --git a/packfile.c b/packfile.c
index 4a5fe7a..234797c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -8,6 +8,11 @@
 #include "list.h"
 #include "streaming.h"
 #include "sha1-lookup.h"
+#include "commit.h"
+#include "object.h"
+#include "tag.h"
+#include "tree-walk.h"
+#include "tree.h"
 
 char *odb_pack_name(struct strbuf *buf,
const unsigned char *sha1,
@@ -643,10 +648,10 @@ struct packed_git *add_packed_git(const char *path, 
size_t path_len, int local)
return NULL;
 
/*
-* ".pack" is long enough to hold any suffix we're adding (and
+* ".promisor" is long enough to hold any suffix we're adding (and
 * the use xsnprintf double-checks that)
 */
-   alloc = st_add3(path_len, strlen(".pack"), 1);
+   alloc = st_add3(path_len, strlen(".promisor"), 1);
p = alloc_packed_git(alloc);
memcpy(p->pack_name, path, path_len);
 
@@ -654,6 +659,10 @@ struct packed_git *add_packed_git(const char *path, size_t 
path_len, int local)
if (!access(p->pack_name, F_OK))
p->pack_keep = 1;
 
+   xsnprintf(p->pack_name + path_len, alloc - path_len, ".promisor");
+   if (!access(p->pack_name, F_OK))
+   p->pack_promisor = 1;
+
xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) {
free(p);
@@ -781,7 +790,8 @@ static void prepare_packed_git_one(char *objdir, int local)
if (ends_with(de->d_name, ".idx") ||
ends_with(de->d_name, ".pack") ||
ends_with(de->d_name, ".bitmap") ||
-   ends_with(de->d_name, ".keep"))
+   ends_with(de->d_name, ".keep") ||
+   ends_with(de->d_name, ".promisor"))
string_list_append(, path.buf);
else
report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
@@ -1889,6 +1899,9 @@ int for_each_packed_object(each_packed_object_fn cb, void 
*data, unsigned flags)
for (p = packed_git; p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
+   if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
+   !p->pack_promisor)
+   continue;
if (open_pack_index(p)) {
pack_errors = 1;
continue;
@@ -1899,3 +1912,61 @@ int for_each_packed_object(each_packed_object_fn cb, 
void *data, unsigned flags)
}
return r ? r : pack_errors;
 }
+
+static int add_promisor_object(const struct object_id *oid,
+   

[PATCH v6 01/12] extension.partialclone: introduce partial clone extension

2017-12-05 Thread Jeff Hostetler
From: Jonathan Tan 

Introduce new repository extension option:
`extensions.partialclone`

See the update to Documentation/technical/repository-version.txt
in this patch for more information.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/repository-version.txt | 12 
 cache.h|  2 ++
 environment.c  |  1 +
 setup.c|  7 ++-
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
index 00ad379..e03eacc 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,15 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`partialclone`
+~~
+
+When the config key `extensions.partialclone` is set, it indicates
+that the repo was created with a partial clone (or later performed
+a partial fetch) and that the remote may have omitted sending
+certain unwanted objects.  Such a remote is called a "promisor remote"
+and it promises that all such omitted objects can be fetched from it
+in the future.
+
+The value of this key is the name of the promisor remote.
diff --git a/cache.h b/cache.h
index 6440e2b..35e3f5e 100644
--- a/cache.h
+++ b/cache.h
@@ -860,10 +860,12 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
+extern char *repository_format_partial_clone;
 
 struct repository_format {
int version;
int precious_objects;
+   char *partial_clone; /* value of extensions.partialclone */
int is_bare;
char *work_tree;
struct string_list unknown_extensions;
diff --git a/environment.c b/environment.c
index 8289c25..e52aab3 100644
--- a/environment.c
+++ b/environment.c
@@ -27,6 +27,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
+char *repository_format_partial_clone;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/setup.c b/setup.c
index 03f51e0..58536bd 100644
--- a/setup.c
+++ b/setup.c
@@ -420,7 +420,11 @@ static int check_repo_format(const char *var, const char 
*value, void *vdata)
;
else if (!strcmp(ext, "preciousobjects"))
data->precious_objects = git_config_bool(var, value);
-   else
+   else if (!strcmp(ext, "partialclone")) {
+   if (!value)
+   return config_error_nonbool(var);
+   data->partial_clone = xstrdup(value);
+   } else
string_list_append(>unknown_extensions, ext);
} else if (strcmp(var, "core.bare") == 0) {
data->is_bare = git_config_bool(var, value);
@@ -463,6 +467,7 @@ static int check_repository_format_gently(const char 
*gitdir, int *nongit_ok)
}
 
repository_format_precious_objects = candidate.precious_objects;
+   repository_format_partial_clone = candidate.partial_clone;
string_list_clear(_extensions, 0);
if (!has_common) {
if (candidate.is_bare != -1) {
-- 
2.9.3



[PATCH v6 1/3] list-objects-filter-options: fix 'keword' typo in comment

2017-12-05 Thread Jeff Hostetler
From: Christian Couder 

Signed-off-by: Christian Couder 
Signed-off-by: Jeff Hostetler 
---
 list-objects-filter-options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 9b28322..52bdec7 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -8,7 +8,7 @@
 #include "list-objects-filter-options.h"
 
 /*
- * Parse value of the argument to the "filter" keword.
+ * Parse value of the argument to the "filter" keyword.
  * On the command line this looks like:
  *   --filter=
  * and in the pack protocol as:
-- 
2.9.3



[PATCH v6 2/3] list-objects-filter-options: support --no-filter

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach opt_parse_list_objects_filter() to take --no-filter
option and to free the contents of struct filter_options.
This command line argument will be automatically inherited
by commands using OPT_PARSE_LIST_OBJECTS_FILTER(); this
includes pack-objects.

Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt |  3 +++
 list-objects-filter-options.c  | 15 +--
 list-objects-filter-options.h  |  5 -
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index b924c6c..aa403d0 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -242,6 +242,9 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it 
creates a bundle.
the resulting packfile.  See linkgit:git-rev-list[1] for valid
`` forms.
 
+--no-filter::
+   Turns off any previous `--filter=` argument.
+
 --missing=::
A debug option to help with future "partial clone" development.
This option specifies how missing objects are handled.
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 52bdec7..4c5b34e 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -74,8 +74,19 @@ int opt_parse_list_objects_filter(const struct option *opt,
 {
struct list_objects_filter_options *filter_options = opt->value;
 
-   assert(arg);
-   assert(!unset);
+   if (unset || !arg) {
+   list_objects_filter_release(filter_options);
+   return 0;
+   }
 
return parse_list_objects_filter(filter_options, arg);
 }
+
+void list_objects_filter_release(
+   struct list_objects_filter_options *filter_options)
+{
+   free(filter_options->filter_spec);
+   free(filter_options->sparse_oid_value);
+   free(filter_options->sparse_path_value);
+   memset(filter_options, 0, sizeof(*filter_options));
+}
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index dd7e5db..eea44a1 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -52,7 +52,10 @@ int opt_parse_list_objects_filter(const struct option *opt,
 
 #define OPT_PARSE_LIST_OBJECTS_FILTER(fo) \
{ OPTION_CALLBACK, 0, CL_ARG__FILTER, fo, N_("args"), \
- N_("object filtering"), PARSE_OPT_NONEG, \
+ N_("object filtering"), 0, \
  opt_parse_list_objects_filter }
 
+void list_objects_filter_release(
+   struct list_objects_filter_options *filter_options);
+
 #endif /* LIST_OBJECTS_FILTER_OPTIONS_H */
-- 
2.9.3



[PATCH v6 3/3] rev-list: support --no-filter argument

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach rev-list to support --no-filter to override a
previous --filter= argument.  This is
to be consistent with commands that use OPT_PARSE
macros.

Signed-off-by: Jeff Hostetler 
---
 Documentation/rev-list-options.txt | 15 ++-
 builtin/rev-list.c |  4 
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 11bb87f..8d8b7f4 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -715,16 +715,21 @@ ifdef::git-rev-list[]
 The form '--filter=blob:none' omits all blobs.
 +
 The form '--filter=blob:limit=[kmg]' omits blobs larger than n bytes
-or units.  The value may be zero.
+or units.  n may be zero.  The suffixes k, m, and g can be used to name
+units in KiB, MiB, or GiB.  For example, 'blob:limit=1k' is the same
+as 'blob:limit=1024'.
 +
-The form '--filter=sparse:oid=' uses a sparse-checkout
-specification contained in the object (or the object that the expression
-evaluates to) to omit blobs that would not be not required for a
-sparse checkout on the requested refs.
+The form '--filter=sparse:oid=' uses a sparse-checkout
+specification contained in the blob (or blob-expression) ''
+to omit blobs that would not be not required for a sparse checkout on
+the requested refs.
 +
 The form '--filter=sparse:path=' similarly uses a sparse-checkout
 specification contained in .
 
+--no-filter::
+   Turn off any previous `--filter=` argument.
+
 --filter-print-omitted::
Only useful with `--filter=`; prints a list of the objects omitted
by the filter.  Object IDs are prefixed with a ``~'' character.
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 4700473..547a3e0 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -403,6 +403,10 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
filter_options.filter_spec);
continue;
}
+   if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) {
+   list_objects_filter_release(_options);
+   continue;
+   }
if (!strcmp(arg, "--filter-print-omitted")) {
arg_print_omitted = 1;
continue;
-- 
2.9.3



[PATCH v6 0/3] Partial clone part 1: object filtering

2017-12-05 Thread Jeff Hostetler
From: Jeff Hostetler 

Here is V6 of the list-object filtering, rev-list, and pack-objects.

This is an incremental patch series to be applied on top of V5
which is already in 'next'.

This version fixes a typo, add the --no-filter option, eliminates
a couple of asserts(), and updates the documentation.

Christian Couder (1):
  list-objects-filter-options: fix 'keword' typo in comment

Jeff Hostetler (2):
  list-objects-filter-options: support --no-filter
  rev-list: support --no-filter argument

 Documentation/git-pack-objects.txt |  3 +++
 Documentation/rev-list-options.txt | 15 ++-
 builtin/rev-list.c |  4 
 list-objects-filter-options.c  | 17 ++---
 list-objects-filter-options.h  |  5 -
 5 files changed, 35 insertions(+), 9 deletions(-)

-- 
2.9.3



Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default

2017-12-05 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Change the build process so that instead of needing to supply
> DC_SHA1_SUBMODULE=YesPlease to use the sha1collisiondetection
> submodule instead of the copy of the same code shipped in the sha1dc
> directory, it uses the submodule by default unless
> NO_DC_SHA1_SUBMODULE=UnfortunatelyYes is supplied.
> ...
> This change removes the "auto" logic Junio added in
> cac87dc01d ("sha1collisiondetection: automatically enable when
> submodule is populated", 2017-07-01), I feel that automatically
> falling back to using sha1dc would defeat the point, which is to smoke
> out any remaining users of git.git who have issues cloning the
> submodule for whatever reason.

I think it makes sense to drop 'auto' if we were to go this route.
I do not think the right value for NO_DC_SHA1_SUBMODULE should be
"unfortunately yes"; it should be spelled NoThanks or something.
It's not like an external reason "unfortunately" prevents you from
using the code from the submodule---the person sets it deliberately
and by choice.

> Makefile:1031: *** The sha1collisiondetection submodule is not
> checked out. Please make it available, either by cloning with
> --recurse-submodules, or by running "git submodule update
> --init". If you can't use it for whatever reason you can define
> NO_DC_SHA1_SUBMODULE=UnfortunatelyYes.  Stop.

Likewise here.


Re: [PATCH v5 07/10] introduce fetch-object: fetch one promisor object

2017-12-05 Thread Jeff Hostetler



On 12/2/2017 3:33 PM, Christian Couder wrote:

On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:

From: Jonathan Tan 



+void fetch_object(const char *remote_name, const unsigned char *sha1)
+{
+   struct remote *remote;
+   struct transport *transport;
+   struct ref *ref;
+
+   remote = remote_get(remote_name);
+   if (!remote->url[0])
+   die(_("Remote with no URL"));
+   transport = transport_get(remote, remote->url[0]);
+
+   ref = alloc_ref(sha1_to_hex(sha1));
+   hashcpy(ref->old_oid.hash, sha1);
+   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
+   transport_fetch_refs(transport, ref);
+}


I think it would be interesting to return what transport_fetch_refs()
returns, so that a caller could know if an error happened.



That might help the retry/found_packed loop mentioned
in your response on the next patch in this series.

I'll make a TODO item for this to investigate.

Thanks
Jeff


RE: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread Kerry, Richard
Indeed so.
In which case it is short for "selling short", or possibly "short selling".

Of course a little searching shows that "shorted" could mean some other things, 
including possibly the meaning originally suggested.
Nevertheless it seems to me that "shortened" is the most appropriate word in 
modern English.

R.


Richard Kerry
BNCS Engineer, SI SOL Telco & Media Vertical Practice

T: +44 (0)20 3618 2669
M: +44 (0)7812 325518
Lync: +44 (0) 20 3618 0778
Room G300, Stadium House, Wood Lane, London, W12 7TA
richard.ke...@atos.net




> -Original Message-
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Tuesday, December 05, 2017 4:06 PM
> To: Kerry, Richard 
> Cc: git@vger.kernel.org; Johannes Schindelin
> ; p...@peff.net; liam Beguin
> 
> Subject: Re: [PATCH v2 6/9] rebase -i: update functions to use a flags
> parameter
>
> "Kerry, Richard"  writes:
>
> > "Shorted" is what happens when you put a piece of wire across the
> terminals of a battery ... (bang, smoke, etc).
> > It's short for "short-circuited".
>
> Or it is what you do to something that you sell and that you yet do not own,
> expecting that you can later buy it cheaper, allowing you to pocket the
> difference ;-).
>

Atos, Atos Consulting, Worldline and Canopy The Open Cloud Company are trading 
names used by the Atos group. The following trading entities are registered in 
England and Wales: Atos IT Services UK Limited (registered number 01245534), 
Atos Consulting Limited (registered number 04312380), Atos Worldline UK Limited 
(registered number 08514184) and Canopy The Open Cloud Company Limited 
(registration number 08011902). The registered office for each is at 4 Triton 
Square, Regent’s Place, London, NW1 3HG.The VAT No. for each is: GB232327983.

This e-mail and the documents attached are confidential and intended solely for 
the addressee, and may contain confidential or privileged information. If you 
receive this e-mail in error, you are not authorised to copy, disclose, use or 
retain it. Please notify the sender immediately and delete this email from your 
systems. As emails may be intercepted, amended or lost, they are not secure. 
Atos therefore can accept no liability for any errors or their content. 
Although Atos endeavours to maintain a virus-free network, we do not warrant 
that this transmission is virus-free and can accept no liability for any 
damages resulting from any virus transmitted. The risks are deemed to be 
accepted by everyone who communicates with Atos by email.


Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread Junio C Hamano
"Kerry, Richard"  writes:

> "Shorted" is what happens when you put a piece of wire across the terminals 
> of a battery ... (bang, smoke, etc).
> It's short for "short-circuited".

Or it is what you do to something that you sell and that you yet do
not own, expecting that you can later buy it cheaper, allowing you
to pocket the difference ;-).




Re: [PATCH v6 4/7] checkout: describe_detached_head: remove ellipsis after committish

2017-12-05 Thread Junio C Hamano
Ann T Ropea  writes:

> v6: polish to take Junio's comments from 
>  into account

>  t/t2020-checkout-detach.sh| 114 
> ++
> ...

Thanks; with this one replaced, I'd expect that poisoned gettext
test to pass now.

I saw some style issues, so I'll queue a tentative fix-up on top.

> + # The first detach operation is more chatty than the following ones.
> + cat 1>1st_detach <<'EOF' &&
> +Note: checking out 'HEAD^'.
> +
> +You are in 'detached HEAD' state. You can look around, make experimental
> +changes and commit them, and you can discard any commits you make in this
> +state without impacting any branches by performing another checkout.
> +
> +If you want to create a new branch to retain commits you create, you may
> +do so (now or later) by using -b with the checkout command again. Example:
> +
> +  git checkout -b 
> +
> +HEAD is now at 7c7cd714e262 three
> +EOF

It looks somewhat strange to explicitly say 1> when redirecting the
standard output.  Also we prefer to indent our here-doc to the same
depth as commands by using "<<-", i.e.

cat >1st_detach <<-'EOF' &&
Note: checking out 'HEAD^'.
...
EOF

> + reset && check_not_detached && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&

It also was a bit irritating to see multiple commands form an
overlong single line, like this one.

> + test_i18ncmp 1st_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
> +
> + GIT_PRINT_SHA1_ELLIPSIS="no" git -c 'core.abbrev=12' checkout HEAD^ 
> 1>actual 2>&1 &&
> + check_detached &&
> + test_i18ncmp 2nd_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&

This part uses "set and export only for a single invocation of git",
and because the variable is unset at the end of the previous step
after 1st_detach and actual are compared, this unset at the end
feels redundant.

> + GIT_PRINT_SHA1_ELLIPSIS='nope' && export GIT_PRINT_SHA1_ELLIPSIS && git 
> -c 'core.abbrev=12' checkout HEAD^ 1>actual 2>&1 &&

But now this part sets and exports the variable for the remainder of
the script (until it is unset).

Is the use of these two styles, i.e.

VAR=value && export VAR && git -c core.abbrev=12 subcmd
VAR=value git -c core.abbrev=12 subcmd

intended?  If so for what purpose?  It's not like we are testing if
the shell implements environment variables correctly, so I'd expect
that the result would be easier to follow if it stuck to a single
style and used it consistently.



Re: [PATCH v5 08/10] sha1_file: support lazily fetching missing objects

2017-12-05 Thread Jeff Hostetler



On 12/2/2017 1:29 PM, Christian Couder wrote:

On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:

From: Jonathan Tan 

[...]

  int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
*oi, unsigned flags)
  {

[...]


-   if (!find_pack_entry(real, )) {
-   /* Most likely it's a loose object. */
-   if (!sha1_loose_object_info(real, oi, flags))
-   return 0;
+retry:
+   if (find_pack_entry(real, ))
+   goto found_packed;

-   /* Not a loose object; someone else may have just packed it. */
-   if (flags & OBJECT_INFO_QUICK) {
-   return -1;
-   } else {
-   reprepare_packed_git();
-   if (!find_pack_entry(real, ))
-   return -1;
-   }
+   /* Most likely it's a loose object. */
+   if (!sha1_loose_object_info(real, oi, flags))
+   return 0;
+
+   /* Not a loose object; someone else may have just packed it. */
+   reprepare_packed_git();
+   if (find_pack_entry(real, ))
+   goto found_packed;
+
+   /* Check if it is a missing object */
+   if (fetch_if_missing && repository_format_partial_clone &&
+   !already_retried) {
+   fetch_object(repository_format_partial_clone, real);
+   already_retried = 1;
+   goto retry;
 }

+   return -1;
+
+found_packed:
 if (oi == _oi)

[...]


The above is adding 2 different gotos into this function while there
are quite simple ways to avoid them. See
https://public-inbox.org/git/cap8ufd2thrj7+rxmismutuopxqv4wm7azsejpd_fheoycp+...@mail.gmail.com/
and the follow up email in the thread.


Personally, I'm OK with limited goto's like these.  Yes, they are ugly,
but not that complicated.  Alternatively, we could put everything between
the 2 labels in a while (1) {...} and replace the goto's with break's and
continue's.  I think it would be better to revisit this later, as I'd
rather not start refactoring sha1_file.c in the middle of what is already
a 30+ commit patch series (for the 3 parts of partial clone).

Thoughts?
Jeff



Re: [PATCH v5 10/10] gc: do not repack promisor packfiles

2017-12-05 Thread Jeff Hostetler



On 12/2/2017 12:39 PM, Christian Couder wrote:

On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:

From: Jonathan Tan 



  pack_as_from_promisor () {
 HASH=$(git -C repo pack-objects .git/objects/pack/pack) &&
 >repo/.git/objects/pack/pack-$HASH.promisor
+   echo $HASH
  }


Now the exit code of the above function will always be the exit code
of "echo $HASH".
It seems to me that it would be better to add "&&" at the end of the
line above the "echo $HASH".



Fixed. Thanks.
Jeff


Re: gitattributes not read for diff-tree anymore in 2.15?

2017-12-05 Thread Ben Boeckel
On Mon, Dec 04, 2017 at 15:03:55 -0800, Brandon Williams wrote:
> Reading the attributes files should be done regardless if the gitmodules
> file is read.  The gitmodules file should only come into play if you are
> dealing with submodules.

Yeah, it doesn't seem to make sense why this commit breaks us, but there
it is *shrug*.

> Do you mind providing a reproduction recipe with expected outcome vs
> actual outcome and I can take a closer look.

I'll work on one. It isn't as simple as I thought it was :) . The setup
we do before running the checks is apparently involved as running it
from the command line is not exhibiting the difference.

--Ben


Re: [PATCH] imap-send: URI encode server folder

2017-12-05 Thread Junio C Hamano
Eric Sunshine  writes:

> ... Can you
> expand the commit message a bit to make it more self-contained? At
> minimum, perhaps show the error message you were experiencing, and
> cite (as Daniel pointed out) RFC 3986 and the bit about a "legal" URL
> not containing brackets.

Thanks for a good suggestion.


>
> Also, a natural question which pops into the head of someone reading
> this patch is whether other parts of the URL (host, user, etc.) also
> need to be handled similarly. It's possible that you audited the code
> and determined that they are handled fine already, but the reader of
> the commit message is unable to infer that. Consequently, it might be
> nice to have a sentence about that, as well ("other parts of the URL
> are already encoded, thus are fine" or "other parts of the URL are not
> subject to this problem because ...").
>
> The patch itself looks okay (from a cursory read).
>
> Thanks.
>
>> Reported-by: Doron Behar 
>> Signed-off-by: Nicolas Morey-Chaisemartin 
>> ---
>>  imap-send.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/imap-send.c b/imap-send.c
>> index 54e6a80fd..36c7c1b4f 100644
>> --- a/imap-send.c
>> +++ b/imap-send.c
>> @@ -1412,6 +1412,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc, 
>> struct credential *cred)
>>  {
>> CURL *curl;
>> struct strbuf path = STRBUF_INIT;
>> +   char *uri_encoded_folder;
>>
>> if (curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK)
>> die("curl_global_init failed");
>> @@ -1429,7 +1430,12 @@ static CURL *setup_curl(struct imap_server_conf 
>> *srvc, struct credential *cred)
>> strbuf_addstr(, server.host);
>> if (!path.len || path.buf[path.len - 1] != '/')
>> strbuf_addch(, '/');
>> -   strbuf_addstr(, server.folder);
>> +
>> +   uri_encoded_folder = curl_easy_escape(curl, server.folder, 0);
>> +   if (!uri_encoded_folder)
>> +   die("failed to encode server folder");
>> +   strbuf_addstr(, uri_encoded_folder);
>> +   curl_free(uri_encoded_folder);
>>
>> curl_easy_setopt(curl, CURLOPT_URL, path.buf);
>> strbuf_release();
>> --
>> 2.15.1.272.g8e603414b


Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread liam Beguin
Hi,

On 05/12/17 07:41 AM, Kerry, Richard wrote:
> 
> "Shorted" is what happens when you put a piece of wire across the terminals 
> of a battery ... (bang, smoke, etc).
> It's short for "short-circuited".
> Yes, I think you mean "shortened" in this case.
> 

Thanks for the explanation.
Sorry, my eyes stopped at the lowercase 's' in Johannes message.
Will fix.

> Regards,
> Richard.
> 
> 
> 
> Richard Kerry
> BNCS Engineer, SI SOL Telco & Media Vertical Practice
> 
> T: +44 (0)20 3618 2669
> M: +44 (0)7812 325518
> Lync: +44 (0) 20 3618 0778
> Room G300, Stadium House, Wood Lane, London, W12 7TA
> richard.ke...@atos.net
> 
> 

[...]

Thanks,
Liam


[PATCH v3] git-gui: Prevent double UTF-8 conversion

2017-12-05 Thread Łukasz Stelmach
Convert author's name and e-mail address from the UTF-8 (or any other)
encoding in load_last_commit function the same way commit message is
converted.

Amending commits in git-gui without such conversion breaks UTF-8
strings. For example, "\305\201ukasz" (as written by git cat-file) becomes
"\303\205\302\201ukasz" in an amended commit.

Signed-off-by: Łukasz Stelmach 
---
 git-gui/lib/commit.tcl | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/git-gui/lib/commit.tcl b/git-gui/lib/commit.tcl
index 83620b7cb..75ea965da 100644
--- a/git-gui/lib/commit.tcl
+++ b/git-gui/lib/commit.tcl
@@ -25,6 +25,8 @@ You are currently in the middle of a merge that has not been 
fully completed.  Y
set msg {}
set parents [list]
if {[catch {
+   set name ""
+   set email ""
set fd [git_read cat-file commit $curHEAD]
fconfigure $fd -encoding binary -translation lf
# By default commits are assumed to be in utf-8
@@ -34,9 +36,7 @@ You are currently in the middle of a merge that has not been 
fully completed.  Y
lappend parents [string range $line 7 
end]
} elseif {[string match {encoding *} $line]} {
set enc [string tolower [string range 
$line 9 end]]
-   } elseif {[regexp "author 
(.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} {
-   set commit_author [list name $name 
email $email date $time]
-   }
+   } elseif {[regexp "author 
(.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} { }
}
set msg [read $fd]
close $fd
@@ -44,7 +44,13 @@ You are currently in the middle of a merge that has not been 
fully completed.  Y
set enc [tcl_encoding $enc]
if {$enc ne {}} {
set msg [encoding convertfrom $enc $msg]
+   set name [encoding convertfrom $enc $name]
+   set email [encoding convertfrom $enc $email]
}
+   if {$name ne {} && $email ne {}} {
+   set commit_author [list name $name email $email 
date $time]
+   }
+
set msg [string trim $msg]
} err]} {
error_popup [strcat [mc "Error loading commit data for amend:"] 
"\n\n$err"]
-- 
2.11.0



Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default

2017-12-05 Thread Ævar Arnfjörð Bjarmason

On Tue, Dec 05 2017, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason  writes:
>
>>> I'm not sure how I feel about this. I see your point that there's no
>>> real value in maintaining two systems indefinitely.  At the same time, I
>>> wonder how much value the submodule strategy is actually bringing us.
>>>
>>> IOW, are we agreed that the path forward is to get everybody using the
>>> submodule?
>> ...
>> In no particular order:
>>
>>  * I don't feel strongly about 2-4/4 in this series. I just hacked this
>>up because it occurred to me that I'd left this sha1dc stuff in some
>>in-between state and we'd talked about eventually moving forward with
>>this.
>
> Good.
>
>>We've had two releases with the submodule being purely optional, if
>>we're going to keep it it seems logical to start at least using it by
>>default.
>
> With a need for a patch like 1/4, I suspect two release cycles is
> way too short for making a move like 2-4/4, though.

You're conflating two unrelated things, which to be fair I'm confusingly
doing by submitting all this together.

1) Since 2.14 we've had the "auto" rule and
   DC_SHA1_SUBMODULE=[YesPlease|auto], so we'll prefer the submodule if
   it's there. So we've been testing if the mere presence of a
   .gitmodules breaks something for someone, seems like it doesn't.

2) Then in the 2.15 release Takashi Iwai submitted a feature to link to
   an external SHA1DC. This is used in the SuSE 2.15 package here:
   http://download.opensuse.org/tumbleweed/repo/src-oss/suse/src/

   However, as you'll see if you extract that package they don't run
   into that bug, because they're building it from a tarball which has
   an empty sha1collisiondetection/ directory as noted in my
   87bmjdscdr@evledraar.booking.com.

   Takashi *would* run into an error with my 1/4 if he was building from
   git.git, or if "make dist" included sha1collisiondetection/, but I
   don't see a reason to hold anything back back on that account. The
   only users of DC_SHA1_EXTERNAL=YesPlease are going to be packagers
   who know what they're doing, and if we start erroring out for them on
   this obscure option that's going to be trivially solved.

I don't see why this obscure edge case with #2 should keep us from
deciding whatever we'd decide with #1. They're really unrelated, #2
practically speaking only impacts tarball consumers, #1 impacts git.git
users.

It seems logical to me if we're going to move forward with #1 at all by
first making the submodule the default & then depending on how that
turns out making it a hard dependency, we'd do it now.

We'll learn nothing new by shipping a 2.16 with DC_SHA1_SUBMODULE=auto
that we haven't already learned in 2.14 & 2.15.


Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default

2017-12-05 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> I'm not sure how I feel about this. I see your point that there's no
>> real value in maintaining two systems indefinitely.  At the same time, I
>> wonder how much value the submodule strategy is actually bringing us.
>>
>> IOW, are we agreed that the path forward is to get everybody using the
>> submodule?
> ...
> In no particular order:
>
>  * I don't feel strongly about 2-4/4 in this series. I just hacked this
>up because it occurred to me that I'd left this sha1dc stuff in some
>in-between state and we'd talked about eventually moving forward with
>this.

Good.

>We've had two releases with the submodule being purely optional, if
>we're going to keep it it seems logical to start at least using it by
>default.

With a need for a patch like 1/4, I suspect two release cycles is
way too short for making a move like 2-4/4, though.


Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-05 Thread Junio C Hamano
Robert Abel  writes:

> On 05 Dec 2017 01:27, Junio C Hamano wrote:
>> I know all of the above, but I think you misunderstood the point I
>> wanted to raise, so let me try again.  The thing is, none of what
>> you just wrote changes the fact that lack of callers that want to do
>> "multi-line" is IRRELEVANT.
>
> I disagree. The commit comment is meant to give context to the
> introduced changes. One change is the  additional comment for
> __git_eread, which now clearly states that only a single line is read.

I still do not understand why you think the 'next' person would care
about the (lack of )multi-line aspect of the helper.

Let's see how well the proposed log message gives the "context to
the introduced changes" (from your v3).

__git_eread is used to read a single line of a given file (if it
exists) into a single variable without the EOL. All six current
users of __git_eread use it that way and don't expect multi-line
content.

That does not include anything incorrect; but.

The helper is about (1) reading the first line and (2) reading it as
a whole into a single variable.  Both are already covered by the
first sentence, and there is no need to say 'and don't expect ...",
unless you want to stress something.

And it places a stress on the former, which is a less relevant
thing, WITHOUT giving the same treatment to the latter, which is a
more relevant thing.  After all, this patch is not about replacing
an earlier implementation that did

$2=$(cat "$1")

with

read $2 <"$1"

If that were the case, _then_ the fact that the purpose of the
helper is to read from a single-liner file (i.e. we do not expect
the input file to have more than one line) is VERY relevant.

But this is not such a patch.  And after readers read the above,
they find this:

Therefore, this patch removes the unused capability to split
file conents into tokens by passing multiple variable names.

And because the previous paragraph placed an emphasis on a wrong
aspect of the context of the calls to the helper function, this
"Therefore" does not quite "click" in the readers' minds.  The
reason why it is OK to remove the multi-variable feature is because
the callers of the helper want to always read the result into a
single variable, but the "no need for multi-variable" that they read
in the first sentence of the previous paragraph is less strong in
their mind by now, because they read an irrelevant (for the purpose
of this "Therefore") mention of "no need for multi-line" aspect of
the helper.

Perhaps

__git_eread is used to read the contents of a single-liner file
into a single variable while dropping EOL.  It is misleading to
use the "read" built-in with "$@", as if some callers would want
the contents read from the file to be split into multiple
variables.

Explicitly use a single variable, and also document that the
helper only reads the first line (simply because the input files
are designed to be single-liner files).

would say it the same thing, but with emphasis on the right aspect
of the facts.

I would also rephase the new in-code comment

# Helper function to read the first line of a file into a variable.

to un-stress "the first line of a file" and place more stress on the
fact that it is designed to read from a single-liner file (there is
a subtle but important distinction between the two).

# read the contents of a single-liner file into a variable,
# while dropping the end-of-line from it.

or something like that, perhaps.




RE: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread Kerry, Richard

"Shorted" is what happens when you put a piece of wire across the terminals of 
a battery ... (bang, smoke, etc).
It's short for "short-circuited".
Yes, I think you mean "shortened" in this case.

Regards,
Richard.



Richard Kerry
BNCS Engineer, SI SOL Telco & Media Vertical Practice

T: +44 (0)20 3618 2669
M: +44 (0)7812 325518
Lync: +44 (0) 20 3618 0778
Room G300, Stadium House, Wood Lane, London, W12 7TA
richard.ke...@atos.net




> -Original Message-
> From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On
> Behalf Of Junio C Hamano
> Sent: Tuesday, December 05, 2017 12:37 PM
> To: liam Beguin 
> Cc: Johannes Schindelin ;
> git@vger.kernel.org; p...@peff.net
> Subject: Re: [PATCH v2 6/9] rebase -i: update functions to use a flags
> parameter
>
> liam Beguin  writes:
>
> > I'll change it to TODO_LIST_SHORTED_IDs. TODO_LIST_SHORTED_INSNS
> would
> > suggest the flag changes both parts of the todo.
>
> I am not a native speaker, but SHORTED does not sound like a right phrase.
> When you make something shorter, that thing is "shortened", not "shorted".

Atos, Atos Consulting, Worldline and Canopy The Open Cloud Company are trading 
names used by the Atos group. The following trading entities are registered in 
England and Wales: Atos IT Services UK Limited (registered number 01245534), 
Atos Consulting Limited (registered number 04312380), Atos Worldline UK Limited 
(registered number 08514184) and Canopy The Open Cloud Company Limited 
(registration number 08011902). The registered office for each is at 4 Triton 
Square, Regent’s Place, London, NW1 3HG.The VAT No. for each is: GB232327983.

This e-mail and the documents attached are confidential and intended solely for 
the addressee, and may contain confidential or privileged information. If you 
receive this e-mail in error, you are not authorised to copy, disclose, use or 
retain it. Please notify the sender immediately and delete this email from your 
systems. As emails may be intercepted, amended or lost, they are not secure. 
Atos therefore can accept no liability for any errors or their content. 
Although Atos endeavours to maintain a virus-free network, we do not warrant 
that this transmission is virus-free and can accept no liability for any 
damages resulting from any virus transmitted. The risks are deemed to be 
accepted by everyone who communicates with Atos by email.


Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-05 Thread Junio C Hamano
liam Beguin  writes:

> I'll change it to TODO_LIST_SHORTED_IDs. TODO_LIST_SHORTED_INSNS would
> suggest the flag changes both parts of the todo.

I am not a native speaker, but SHORTED does not sound like a right
phrase.  When you make something shorter, that thing is "shortened",
not "shorted".



Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids

2017-12-05 Thread Junio C Hamano
liam Beguin  writes:

> Good suggestion. Would transform_todos() work too?

If the function is about munging multiple of them, then todo"s"
would work well; I wasn't focusing on singular vs plural, as I
thought the choice between them needs much less thought to make
correctly.



Re: [PATCH 2/2] progress: drop delay-threshold code

2017-12-05 Thread Lars Schneider

> On 05 Dec 2017, at 12:10, Ævar Arnfjörð Bjarmason  wrote:
> 
> On Tue, Dec 5, 2017 at 11:37 AM, Lars Schneider
>  wrote:
>> 
>>> On 04 Dec 2017, at 23:07, Jeff King  wrote:
>>> 
>>> From: Lars Schneider 
>>> 
>>> Since 180a9f2268 (provide a facility for "delayed" progress
>>> reporting, 2007-04-20), the progress code has allowed
>>> callers to skip showing progress if they have reached a
>>> percentage-threshold of the total work before the delay
>>> period passes.
>>> 
>>> But since 8aade107dd (progress: simplify "delayed" progress
>>> API, 2017-08-19), that parameter is not available to outside
>>> callers (we always passed zero after that commit, though
>>> that was corrected in the previous commit to "100%").
>>> 
>>> Let's drop the threshold code, which never triggers in
>>> any meaningful way.
>>> 
>>> Signed-off-by: Jeff King 
>>> ---
>>> I tweaked your patch slightly to clean up the now-simplified
>>> conditional.
>> 
>> Your first patch ("progress: set default delay threshold to 100%, not 0%")
>> as well as the modifications to this one look good to me. Feel free
>> to add my "Signed-off-by: Lars Schneider ".
>> 
>> Thanks,
>> Lars
>> 
>> 
>> PS: How do you generate the commit references "hash (first line, date)"?
>> Git log pretty print?
> 
> $ git grep -A5 'Copy commit summary' Documentation/SubmittingPatches
> Documentation/SubmittingPatches:151:The "Copy commit summary" command
> of gitk can be used to obtain this
> Documentation/SubmittingPatches-152-format, or this invocation of `git show`:
> Documentation/SubmittingPatches-153-
> Documentation/SubmittingPatches-154-
> Documentation/SubmittingPatches-155-git show -s --date=short
> --pretty='format:%h ("%s", %ad)' 

Thanks :-)

Re: [PATCH v4 9/9] t3512/t3513: remove KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1

2017-12-05 Thread Phillip Wood
On 04/12/17 19:24, Stefan Beller wrote:
> On Fri, Nov 24, 2017 at 3:07 AM, Phillip Wood  
> wrote:
>> From: Phillip Wood 
>>
>> Now that the sequencer creates commits without forking 'git commit' it
>> does not see an empty commit in these tests which fixes the known
>> breakage. Note that logic for handling
>> KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1 is not removed from
>> lib-submodule-update.sh as it is still used by other tests.
>>
>> Signed-off-by: Phillip Wood 
>> ---
>>
>> Notes:
>> The output of the tests with after the previous patch looks like the
>> output of the merge tests (see below), so I'm hopeful that this is a
>> genuine fix, but someone who knows about submodules should check that
>> things are in fact working properly now.
> 
> Looking at the patch only (in combination with the history of the
> submodule tests,
> 283f56a40b (cherry-pick: add t3512 for submodule updates, 2014-06-15))
> this patch
> looks good to me. I wonder though if this needs to be squashed in another 
> commit
> to keep the test suite working for each patch and have no intermittent
> failure in the
> series.

Hi Stefan

Thanks for looking at this, it's good to know you think it's OK. I'll
leave it to Junio's discretion whether to squash this into the previous
patch

Thanks again

Phillip

> Thanks,
> Stefan
> 



Re: [PATCH v4 7/9] sequencer: load commit related config

2017-12-05 Thread Phillip Wood
On 05/12/17 11:21, Phillip Wood wrote:
> On 04/12/17 18:30, Junio C Hamano wrote:
>> Phillip Wood  writes:
>>
>>> --- a/builtin/rebase--helper.c
>>> +++ b/builtin/rebase--helper.c
>>> @@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = {
>>> NULL
>>>  };
>>>  
>>> +static int git_rebase_helper_config(const char *k, const char *v, void *cb)
>>> +{
>>> +   int status;
>>> +
>>> +   status = git_sequencer_config(k, v, NULL);
>>> +   if (status)
>>> +   return status;
>>> +
>>> +   return git_default_config(k, v, NULL);
>>> +}
>>> +
>>
>> Sorry for spotting the problem this late, but this code is
>> unfortunate and we will need to revisit it later; we may want to do
>> so sooner rather than later.
> 
> If it needs fixing then doing it before it hits next makes sense.
> 
>> When k,v is a valid configuration that is handled by
>> sequencer_config() successfully, this code still needs to call into
>> default_config() with the same k,v, only to get it ignored.
> 
> I'm a bit confused by this sentence. Do you mean that when k,v is a
> valid configuration that is successfully handled by sequencer_config()
> this code unnecessarily calls default_config() as there is no need to
> call default_config() if k,v have already been handled?
> 
>> The problem lies in the (mis)design of git_sequencer_config().  The
>> function should either allow the caller to notice that (k,v) has
>> already been handled, or be the last one in the callback by making a
>> call to default_config() itself.
> 
> The problem is that git_gpg_config() provides no indication if it has
> handled k,v so there's no way to avoid the call to default_config() in
> that case. builtin/am.c and builtin/commit.c both do something like
> 
> static int git_am_config(const char *k, const char *v, void *cb)
> {
>   int status;
> 
>   status = git_gpg_config(k, v, NULL);
>   if (status)
>   return status;
> 
>   return git_default_config(k, v, NULL);
> }
> 

Looking through the callers of gpg_config() it should be fairly easy to
convert it to take an extra parameter in the same way as you suggested
for sequencer_config()

> Looking more generally at sequencer_config() I wonder if we should be
> providing a warning or an error if the config contains an invalid
> cleanup mode. Also should it be calling git_diff_ui_config() to set
> things up for print_commit_summary()? (I'm not sure if anything in that
> function is affected by diff config settings)
> 
> Let me know what you think. I should have time to update this patch set
> later in the week.
> 
> Best Wishes
> 
> Phillip
> 
>> For the former, because this helper does not have to be called
>> directly as a git_config() callback, but instead it is always meant
>> to be called indirectly from another git_config() callback (like
>> git_rebase_helper_config() here, and common_config() in
>> builtin/revert.c like we see below), it does *not* have to be
>> constrained by the function signature required for it to be a config
>> callback.  It could take a pointer to an int that stores if 'k' was
>> handled inside the function,
>>
>> int git_sequencer_config_helper(char *k, char *v, int *handled);
>>
>> for example.
>>
> 



Re: [PATCH v4 7/9] sequencer: load commit related config

2017-12-05 Thread Phillip Wood
On 04/12/17 18:30, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> --- a/builtin/rebase--helper.c
>> +++ b/builtin/rebase--helper.c
>> @@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = {
>>  NULL
>>  };
>>  
>> +static int git_rebase_helper_config(const char *k, const char *v, void *cb)
>> +{
>> +int status;
>> +
>> +status = git_sequencer_config(k, v, NULL);
>> +if (status)
>> +return status;
>> +
>> +return git_default_config(k, v, NULL);
>> +}
>> +
> 
> Sorry for spotting the problem this late, but this code is
> unfortunate and we will need to revisit it later; we may want to do
> so sooner rather than later.

If it needs fixing then doing it before it hits next makes sense.

> When k,v is a valid configuration that is handled by
> sequencer_config() successfully, this code still needs to call into
> default_config() with the same k,v, only to get it ignored.

I'm a bit confused by this sentence. Do you mean that when k,v is a
valid configuration that is successfully handled by sequencer_config()
this code unnecessarily calls default_config() as there is no need to
call default_config() if k,v have already been handled?

> The problem lies in the (mis)design of git_sequencer_config().  The
> function should either allow the caller to notice that (k,v) has
> already been handled, or be the last one in the callback by making a
> call to default_config() itself.

The problem is that git_gpg_config() provides no indication if it has
handled k,v so there's no way to avoid the call to default_config() in
that case. builtin/am.c and builtin/commit.c both do something like

static int git_am_config(const char *k, const char *v, void *cb)
{
int status;

status = git_gpg_config(k, v, NULL);
if (status)
return status;

return git_default_config(k, v, NULL);
}


Looking more generally at sequencer_config() I wonder if we should be
providing a warning or an error if the config contains an invalid
cleanup mode. Also should it be calling git_diff_ui_config() to set
things up for print_commit_summary()? (I'm not sure if anything in that
function is affected by diff config settings)

Let me know what you think. I should have time to update this patch set
later in the week.

Best Wishes

Phillip

> For the former, because this helper does not have to be called
> directly as a git_config() callback, but instead it is always meant
> to be called indirectly from another git_config() callback (like
> git_rebase_helper_config() here, and common_config() in
> builtin/revert.c like we see below), it does *not* have to be
> constrained by the function signature required for it to be a config
> callback.  It could take a pointer to an int that stores if 'k' was
> handled inside the function,
> 
> int git_sequencer_config_helper(char *k, char *v, int *handled);
> 
> for example.
> 



Re: [PATCH 2/2] progress: drop delay-threshold code

2017-12-05 Thread Ævar Arnfjörð Bjarmason
On Tue, Dec 5, 2017 at 11:37 AM, Lars Schneider
 wrote:
>
>> On 04 Dec 2017, at 23:07, Jeff King  wrote:
>>
>> From: Lars Schneider 
>>
>> Since 180a9f2268 (provide a facility for "delayed" progress
>> reporting, 2007-04-20), the progress code has allowed
>> callers to skip showing progress if they have reached a
>> percentage-threshold of the total work before the delay
>> period passes.
>>
>> But since 8aade107dd (progress: simplify "delayed" progress
>> API, 2017-08-19), that parameter is not available to outside
>> callers (we always passed zero after that commit, though
>> that was corrected in the previous commit to "100%").
>>
>> Let's drop the threshold code, which never triggers in
>> any meaningful way.
>>
>> Signed-off-by: Jeff King 
>> ---
>> I tweaked your patch slightly to clean up the now-simplified
>> conditional.
>
> Your first patch ("progress: set default delay threshold to 100%, not 0%")
> as well as the modifications to this one look good to me. Feel free
> to add my "Signed-off-by: Lars Schneider ".
>
> Thanks,
> Lars
>
>
> PS: How do you generate the commit references "hash (first line, date)"?
> Git log pretty print?

$ git grep -A5 'Copy commit summary' Documentation/SubmittingPatches
Documentation/SubmittingPatches:151:The "Copy commit summary" command
of gitk can be used to obtain this
Documentation/SubmittingPatches-152-format, or this invocation of `git show`:
Documentation/SubmittingPatches-153-
Documentation/SubmittingPatches-154-
Documentation/SubmittingPatches-155-git show -s --date=short
--pretty='format:%h ("%s", %ad)' 


Re: [PATCH 2/2] progress: drop delay-threshold code

2017-12-05 Thread Lars Schneider

> On 04 Dec 2017, at 23:07, Jeff King  wrote:
> 
> From: Lars Schneider 
> 
> Since 180a9f2268 (provide a facility for "delayed" progress
> reporting, 2007-04-20), the progress code has allowed
> callers to skip showing progress if they have reached a
> percentage-threshold of the total work before the delay
> period passes.
> 
> But since 8aade107dd (progress: simplify "delayed" progress
> API, 2017-08-19), that parameter is not available to outside
> callers (we always passed zero after that commit, though
> that was corrected in the previous commit to "100%").
> 
> Let's drop the threshold code, which never triggers in
> any meaningful way.
> 
> Signed-off-by: Jeff King 
> ---
> I tweaked your patch slightly to clean up the now-simplified
> conditional.

Your first patch ("progress: set default delay threshold to 100%, not 0%")
as well as the modifications to this one look good to me. Feel free
to add my "Signed-off-by: Lars Schneider ".

Thanks,
Lars


PS: How do you generate the commit references "hash (first line, date)"?
Git log pretty print?


  1   2   >