Documentation Breakage at 2.5.6
Hi All, I'm trying to upgrade the NonStop port from 2.3.7 upward eventually to 2.15.1 and hit a snag on documentation. The xmlto component is a bit new to me and I hit the following error: XMLTO git-remote-testgit.1 xmlto: /home/git/git/Documentation/git-remote-testgit.xml does not validate (status 3) xmlto: Fix document syntax or use --skip-validation option I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd /home/git/git/Documentation/git-remote-testgit.xml:2: warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"; D DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"; ^ I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"; validity error : Could not load the external subset http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd The -skip-validation option just takes me to a different problem validating via sourceforge URL that appears not to exist anymore, although I had to modify ./git/Documention/Makefile, which is vexing. XMLTO git-remote-testgit.1 I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"; compilation error: file /tmp/xmlto-xsl.ie6J8p line 4 element import xsl:import : unable to load http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl Makefile:328: recipe for target 'git-remote-testgit.1' failed Any advice on getting past this? It would be nice to get git help to working again. An answer of "you need to get past 2.5.6" is ok too as long as I know where I'm going. Thanks, Randall -- Brief whoami: NonStop&UNIX developer since approximately UNIX(421664400)/NonStop(2112884442) -- In my real life, I talk too much.
Re: [PATCH v2 0/9] rebase -i: add config to abbreviate command names
On 05/12/17 05:21 PM, Junio C Hamano wrote: > Liam Beguin writes: > >> This series will add the 'rebase.abbreviateCommands' configuration >> option to allow `git rebase -i` to default to the single-letter command >> names when generating the todo list. >> >> Using single-letter command names can present two benefits. First, it >> makes it easier to change the action since you only need to replace a >> single character (i.e.: in vim "r" instead of >> "ciw"). Second, using this with a large enough value of >> 'core.abbrev' enables the lines of the todo list to remain aligned >> making the files easier to read. >> >> Changes in V2: >> - Refactor and rename 'transform_todo_ids' >> - Replace SHA-1 by OID in rebase--helper.c >> - Update todo list related functions to take a generic 'flags' parameter >> - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands' >> - Rename 'add-exec' option to 'add-exec-commands' >> - Use 'strbur_read_file' instead of rewriting it >> - Make 'command_to_char' return 'comment_char_line' if no single-letter >> command name is defined >> - Combine both tests into a single test case >> - Update commit messages >> >> Changes in V2: >> - Rename 'transform_todo_insn' to 'transform_todos' >> - Fix flag name TODO_LIST_SHORTE{D,N}_IDS > > I've replaced this series and pushed out the result. Great! Thanks again, > > Thanks. > Liam
git send-email could check for no To: line
I keep coming up with new and innovative ways to send stupid-looking emails with git send-email. Please save me from myself. My latest SNAFU is to spend so much time setting up the 'cc' list in the git-format-patch step that I completely forgot to put anybody on the 'to' line, and even being prompted by git send-email that I might want to add somebody to the 'To' line wasn't enough. Could there be an extra confirmation, perhaps in send_message() that if ($cc ne '' && $to eq '') are_you_sure() (you don't want me writing perl. you really don't.)
Re: [PATCH v7 4/7] checkout: describe_detached_head: remove ellipsis after committish
We do not want an ellipsis displayed following an (abbreviated) SHA-1 value. The days when this was necessary to indicate the truncation to lower-level Git commands and/or the user are bygone. However, to ease the transition, the ellipsis will still be printed if the user sets the environment variable GIT_PRINT_SHA1_ELLIPSIS to "yes". Correct documentation with respect to what describe_detached_head prints when GIT_PRINT_SHA1_ELLIPSIS is not set as indicated above. Add tests for the old and new behaviour. Signed-off-by: Ann T Ropea --- v2: rename patch series & focus on removal of ellipses v3: env var instead of config option, use one-line comments where appropriate, preserve indent level v4: improve env var handling (rename, helper func to query, docu) v5: rewrite series to take Junio's comments in aboard v6: polish to take Junio's comments from into account v7: fix style issues (redirection, here-dox, long lines, setting/exporting/unsetting of env var): cf. Documentation/user-manual.txt | 2 +- builtin/checkout.c| 10 +++- t/t2020-checkout-detach.sh| 123 ++ 3 files changed, 132 insertions(+), 3 deletions(-) diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index 497e82e88dd0..eff78902742a 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -319,7 +319,7 @@ do so (now or later) by using -b with the checkout command again. Example: git checkout -b new_branch_name -HEAD is now at 427abfa... Linux v2.6.17 +HEAD is now at 427abfa Linux v2.6.17 The HEAD then refers to the SHA-1 of the commit instead of to a branch, diff --git a/builtin/checkout.c b/builtin/checkout.c index 3faae382de4f..b0499542158f 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -400,10 +400,16 @@ static void show_local_changes(struct object *head, static void describe_detached_head(const char *msg, struct commit *commit) { struct strbuf sb = STRBUF_INIT; + if (!parse_commit(commit)) pp_commit_easy(CMIT_FMT_ONELINE, commit, &sb); - fprintf(stderr, "%s %s... %s\n", msg, - find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV), sb.buf); + if (print_sha1_ellipsis()) { + fprintf(stderr, "%s %s... %s\n", msg, + find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV), sb.buf); + } else { + fprintf(stderr, "%s %s %s\n", msg, + find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV), sb.buf); + } strbuf_release(&sb); } diff --git a/t/t2020-checkout-detach.sh b/t/t2020-checkout-detach.sh index fbb4ee9bb42d..e8e284cf9c04 100755 --- a/t/t2020-checkout-detach.sh +++ b/t/t2020-checkout-detach.sh @@ -186,4 +186,127 @@ test_expect_success 'no advice given for explicit detached head state' ' test_cmp expect.no-advice actual ' +# Detached HEAD tests for GIT_PRINT_SHA1_ELLIPSIS (new format) +test_expect_success 'describe_detached_head prints no SHA-1 ellipsis when not asked to' " + + # The first detach operation is more chatty than the following ones. + cat >1st_detach <<-'EOF' && + Note: checking out 'HEAD^'. + + You are in 'detached HEAD' state. You can look around, make experimental + changes and commit them, and you can discard any commits you make in this + state without impacting any branches by performing another checkout. + + If you want to create a new branch to retain commits you create, you may + do so (now or later) by using -b with the checkout command again. Example: + + git checkout -b + + HEAD is now at 7c7cd714e262 three + EOF + + # The remaining ones just show info about previous and current HEADs. + cat >2nd_detach <<-'EOF' && + Previous HEAD position was 7c7cd714e262 three + HEAD is now at 139b20d8e6c5 two + EOF + + cat >3rd_detach <<-'EOF' && + Previous HEAD position was 139b20d8e6c5 two + HEAD is now at d79ce1670bdc one + EOF + + reset && + check_not_detached && + + # Various ways of *not* asking for ellipses + + sane_unset GIT_PRINT_SHA1_ELLIPSIS && + git -c 'core.abbrev=12' checkout HEAD^ >actual 2>&1 && + check_detached && + test_i18ncmp 1st_detach actual && + + GIT_PRINT_SHA1_ELLIPSIS="no" git -c 'core.abbrev=12' checkout HEAD^ >actual 2>&1 && + check_detached && + test_i18ncmp 2nd_detach actual && + + GIT_PRINT_SHA1_ELLIPSIS= git -c 'core.abbrev=12' checkout HEAD^ >actual 2>&1 && + check_detached && + test_i18ncmp 3rd_detach actual && + + sane_unset GIT_PRINT_SHA1_ELLIPSIS && + + # We only have four commits, but we can re-use them + reset && + check_not_detached && + + # Make no mention of the env var at all + git -c
Re: How hard would it be to implement sparse fetching/pulling?
From: "Jeff Hostetler" Sent: Monday, December 04, 2017 3:36 PM On 12/2/2017 11:30 AM, Philip Oakley wrote: From: "Jeff Hostetler" Sent: Friday, December 01, 2017 2:30 PM On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote: I think it would be great if we high level agree on desired user experience, so let me put a few possible use cases here. 1. Init and fetch into a new repo with a sparse list. Preconditions: origin blah exists and has a lot of folders inside of src including "bar". Actions: git init foo && cd foo git config core.sparseAll true # New flag to activate all sparse operations by default so you don't need to pass options to each command. echo "src/bar" > .git/info/sparse-checkout git remote add origin blah git pull origin master Expected results: foo contains src/bar folder and nothing else, objects that are unrelated to this tree are not fetched. Notes: This should work same when fetch/merge/checkout operations are used in the right order. With the current patches (parts 1,2,3) we can pass a blob-ish to the server during a clone that refers to a sparse-checkout specification. I hadn't appreciated this capability. I see it as important, and should be available both ways, so that a .gitNarrow spec can be imposed from the server side, as well as by the requester. It could also be used to assist in the 'precious/secret' blob problem, so that AWS keys are never pushed, nor available for fetching! To be honest, I've always considered partial clone/fetch as a client-side request as a performance feature to minimize download times and disk space requirements on the client. Mine was a two way view where one side or other specified an extent for the narrow clone to achieve either the speed/space improvement or partitioning capability. I've not thought of it from the "server has secrets" point of view. My potential for "secrets" was a little softer that some of the 'hard' security that is often discussed. I'm for the layered risk approach (swiss cheese model) We can talk about it, but I'd like to keep it outside the scope of the current effort. Agreed. My concerns are that that is not the appropriate mechanism to enforce MAC/DAC like security mechanisms. For example: [a] The client will still receive the containing trees that refer to the sensitive blobs, so the user can tell when the secret blobs change -- they wouldn't have either blob, but can tell when they are changed. This event by itself may or may not leak sensitive information depending on the terms of the security policy in place. [b] The existence of such missing blobs would tell the client which blobs are significant and secret and allow them to focus their attack. It would be better if those assets were completely hidden and not in the tree at all. [c] The client could push a fake secret blob to replace the valid one on the server. You would have to audit the server to ensure that it never accepts a push containing a change to any secret blob. And the server would need an infrastructure to know about all secrets in the tree. [d] When a secret blob does change, any local merges by the user lack information to complete the merge -- they can't merge the secrets and they can't be trusted to correctly pick-ours or pick-theirs -- so their workflows are broken. I'm not trying to blindly spread FUD here, but it is arguments like these that make me suggest that the partial clone mechanism is not the right vehicle for such "secret" blobs. I'm on the 'a little security is better than no security' side, but all the points are valid. There's a bit of a chicken-n-egg problem getting things set up. So if we assume your team would create a series of "known enlistments" under version control, then you could s/enlistments/entitlements/ I presume? Within my org we speak of "enlistments" as subset of the tree that you plan to work on. For example, you might enlist in the "file system" portion of the tree or in the "device drivers" portion. If the Makefiles have good partitioning, you should only need one of the above portions to do productive work within a feature area. Ah, so it's the things that have been requested by the client (I'd like to the enlist..) I'm not sure what you mean by "entitlements". It is like having the title deeds to a house - a list things you have, or can have. (e.g. a father saying: you can have the car on Saturday 6pm -11pm) At the end of the day the particular lists would be the same, they guide what is sent. just reference one by : during your clone. The server can lookup that blob and just use it. git clone --filter=sparse:oid=master:templates/bar URL And then the server will filter-out the unwanted blobs during the clone. (The current version only filters blobs; you still get full commits and trees. That will be revisited later.) I'm for the idea that only the in-heirachy trees should be sent. It shou
[PATCH v4 2/2] git-prompt: fix reading files with windows line endings
If any of the files read by __git_eread have \r\n line endings, read will only strip \n, leaving \r. This results in an ugly prompt, where instead of user@pc MINGW64 /path/to/repo (BARE:master) the last parenthesis is printed over the beginning of the prompt like )ser@pc MINGW64 /path/to/repo (BARE:master This patch fixes the issue by changing the internal field separator variable IFS to $'\r\n' before using the read builtin command. Note that ANSI-C Quoting/POSIX Quoting ($'...') is supported by bash as well as zsh, which are the current targets of git-prompt, cf. contrib/completion/git-prompt.sh. Signed-off-by: Robert Abel --- contrib/completion/git-prompt.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index 41a471957a..983e419d2b 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -283,7 +283,7 @@ __git_ps1_colorize_gitstring () # variable, in that order. __git_eread () { - test -r "$1" && read "$2" <"$1" + test -r "$1" && IFS=$'\r\n' read "$2" <"$1" } # __git_ps1 accepts 0 or 1 arguments (i.e., format string) -- 2.13.0.windows.1
[PATCH v4 1/2] git-prompt: make __git_eread intended use explicit
__git_eread is used to read a single line of a given file (if it exists) into a single variable stripping the EOL. This patch removes the unused capability to split file contents into tokens by passing multiple variable names. Add a comment and explicitly use $2 instead of misleading $@ as argument to the read builtin command. Signed-off-by: Robert Abel --- contrib/completion/git-prompt.sh | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index c6cbef38c2..41a471957a 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -278,11 +278,12 @@ __git_ps1_colorize_gitstring () r="$c_clear$r" } +# Helper function to read the first line of a file into a variable. +# __git_eread requires 2 arguments, the file path and the name of the +# variable, in that order. __git_eread () { - local f="$1" - shift - test -r "$f" && read "$@" <"$f" + test -r "$1" && read "$2" <"$1" } # __git_ps1 accepts 0 or 1 arguments (i.e., format string) -- 2.13.0.windows.1
Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit
Dear Junio, I'm amazed at how much time and energy you spend on correcting these essentially non-issues in my git commit messages for a quadruple-liner code change. I'll resend both patches one last time addressing the grave issue of the informative mention of multi-line files. Regards, Robert
Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories
On Tue, Dec 5, 2017 at 5:13 PM, Brandon Williams wrote: > A regression was introduced in 557a5998d (submodule: remove > gitmodules_config, 2017-08-03) to how attribute processing was handled > in bare repositories when running the diff-tree command. > > By default the attribute system will first try to read ".gitattribute" > files from the working tree and then falls back to reading them from the > index if there isn't a copy checked out in the worktree. Prior to > 557a5998d the index was read as a side effect of the call to > 'gitmodules_config()' which ensured that the index was already populated > before entering the attribute subsystem. > > Since the call to 'gitmodules_config()' was removed the index is no > longer being read so when the attribute system tries to read from the > in-memory index it doesn't find any ".gitattribute" entries effectively > ignoring any configured attributes. > > Fix this by explicitly reading the index during the setup of diff-tree. This commit message does a good job of explaining the issue, so someone who hasn't followed the thread (or has not followed it closely, like me) can understand the problem and solution. Thanks. A few comments below... > Reported-by: Ben Boeckel > Signed-off-by: Brandon Williams > --- > diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c > @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char > *prefix) > > git_config(git_diff_basic_config, NULL); /* no "diff" UI options */ > init_revisions(opt, prefix); > + read_cache(); 557a5998d (submodule: remove gitmodules_config, 2017-08-03) touched a fair number of built-in commands. It's not clear from the current patch's commit message if diff-tree is the only command which regressed. Is it? Or are other commands also likely to have regressed? Perhaps the commit message could say something about this. For instance: "All other commands touched by 557a5998d have been audited and were found to be regression-free" or "Other commands may regress in the same way, but we will take a wait-and-see attitude and fix them as needed because ". > diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh > @@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in > indent (diff-tree)' ' > +test_expect_success 'check with ignored trailing whitespace attr > (diff-tree)' ' > + test_when_finished "git reset --hard HEAD^" && A few style nits... > + # Create a whitespace error that should be ignored. Comments in nearby tests are not capitalized and do not end with period. > + echo "* -whitespace" > ".gitattributes" && Please drop unnecessary quotes around the filename, as the extra noise makes it a bit harder to read. Also, lose space after redirection operator: echo "* -whitespace" >.gitattributes && > + git add ".gitattributes" && > + echo "trailing space -> " > "trailing-space" && All the nearby tests use some variation of: echo "foo (); " >x && which differs from the "trailing space ->" and filename 'trailing-space' used in this test. Lack of consistency makes this new test stick out like a sore thumb. > + git add "trailing-space" && > + git commit -m "trailing space" && > + > + # With a worktree diff-tree ignores the whitespace error > + git diff-tree --root --check HEAD && > + > + # Without a worktree diff-tree still ignores the whitespace error > + git -C .git diff-tree --root --check HEAD > +'
Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories
On Tue, Dec 5, 2017 at 2:13 PM, Brandon Williams wrote: > A regression was introduced in 557a5998d (submodule: remove > gitmodules_config, 2017-08-03) to how attribute processing was handled > in bare repositories when running the diff-tree command. > > By default the attribute system will first try to read ".gitattribute" > files from the working tree and then falls back to reading them from the > index if there isn't a copy checked out in the worktree. Prior to > 557a5998d the index was read as a side effect of the call to > 'gitmodules_config()' which ensured that the index was already populated > before entering the attribute subsystem. > > Since the call to 'gitmodules_config()' was removed the index is no > longer being read so when the attribute system tries to read from the > in-memory index it doesn't find any ".gitattribute" entries effectively > ignoring any configured attributes. > > Fix this by explicitly reading the index during the setup of diff-tree. > > Reported-by: Ben Boeckel > Signed-off-by: Brandon Williams > --- > > This patch should fix the regression. Let me know if it doesn't solve the > issue and I'll investigate some more. > Thanks for fixing this bug! The commit message is helpful to understand how this bug could slip in! > diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c > index d66499909..cfe7d0281 100644 > --- a/builtin/diff-tree.c > +++ b/builtin/diff-tree.c > @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char > *prefix) > > git_config(git_diff_basic_config, NULL); /* no "diff" UI options */ > init_revisions(opt, prefix); > + read_cache(); Although we do have very few unchecked calls to read_cache, I'd suggest to avoid spreading them. Most of the read_cache calls are guarded via: if (read_cache() < 0) die(_("index file corrupt")); I wonder if this hints at a bad API, and we'd rather have read_cache die() on errors, and the few callers that try to get out of trouble might need to use read_cache_gently() instead. (While this potentially large refactoring may be deferred, I'd ask for an if at least) Thanks, Stefan
Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories
Brandon Williams writes: > A regression was introduced in 557a5998d (submodule: remove > gitmodules_config, 2017-08-03) to how attribute processing was handled > in bare repositories when running the diff-tree command. > > By default the attribute system will first try to read ".gitattribute" > files from the working tree and then falls back to reading them from the > index if there isn't a copy checked out in the worktree. Prior to > 557a5998d the index was read as a side effect of the call to > 'gitmodules_config()' which ensured that the index was already populated > before entering the attribute subsystem. > > Since the call to 'gitmodules_config()' was removed the index is no > longer being read so when the attribute system tries to read from the > in-memory index it doesn't find any ".gitattribute" entries effectively > ignoring any configured attributes. > > Fix this by explicitly reading the index during the setup of diff-tree. Thanks, both. Will queue. > Reported-by: Ben Boeckel > Signed-off-by: Brandon Williams > --- > > This patch should fix the regression. Let me know if it doesn't solve the > issue and I'll investigate some more. > > builtin/diff-tree.c| 1 + > t/t4015-diff-whitespace.sh | 17 + > 2 files changed, 18 insertions(+) > > diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c > index d66499909..cfe7d0281 100644 > --- a/builtin/diff-tree.c > +++ b/builtin/diff-tree.c > @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char > *prefix) > > git_config(git_diff_basic_config, NULL); /* no "diff" UI options */ > init_revisions(opt, prefix); > + read_cache(); > opt->abbrev = 0; > opt->diff = 1; > opt->disable_stdin = 1; > diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh > index 559a7541a..6e061a002 100755 > --- a/t/t4015-diff-whitespace.sh > +++ b/t/t4015-diff-whitespace.sh > @@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in > indent (diff-tree)' ' > test_must_fail git diff-tree --check HEAD^ HEAD > ' > > +test_expect_success 'check with ignored trailing whitespace attr > (diff-tree)' ' > + test_when_finished "git reset --hard HEAD^" && > + > + # Create a whitespace error that should be ignored. > + echo "* -whitespace" > ".gitattributes" && > + git add ".gitattributes" && > + echo "trailing space -> " > "trailing-space" && > + git add "trailing-space" && > + git commit -m "trailing space" && > + > + # With a worktree diff-tree ignores the whitespace error > + git diff-tree --root --check HEAD && > + > + # Without a worktree diff-tree still ignores the whitespace error > + git -C .git diff-tree --root --check HEAD > +' > + > test_expect_success 'check trailing whitespace (trailing-space: off)' ' > git config core.whitespace "-trailing-space" && > echo "foo (); " >x &&
Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories
On 12/05, Ben Boeckel wrote: > On Tue, Dec 05, 2017 at 14:13:37 -0800, Brandon Williams wrote: > > This patch should fix the regression. Let me know if it doesn't solve the > > issue and I'll investigate some more. > > Our test suite passes again. Thanks! Of course! Glad I could help :) > > Acked-by: Ben Boeckel > > --Ben -- Brandon Williams
Re: [PATCH] diff-tree: read the index so attribute checks work in bare repositories
On Tue, Dec 05, 2017 at 14:13:37 -0800, Brandon Williams wrote: > This patch should fix the regression. Let me know if it doesn't solve the > issue and I'll investigate some more. Our test suite passes again. Thanks! Acked-by: Ben Boeckel --Ben
Re: [WIP 2/2] submodule: read-only super-backed ref backend
On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan wrote: > Note that a few major parts are still missing: > - special handling of the current branch of the superproject > - writing (whether "refs/..." to the superproject as an index change or >a commit, or non-"refs/..." directly to the subproject like usual) > > Signed-off-by: Jonathan Tan > --- a/refs.c > +++ b/refs.c > @@ -1575,12 +1575,17 @@ static struct ref_store *lookup_ref_store_map(struct > hashmap *map, > static struct ref_store *ref_store_init(const char *gitdir, > unsigned int flags) > { > - const char *be_name = "files"; > - struct ref_storage_be *be = find_ref_storage_backend(be_name); > + struct ref_storage_be *be; > struct ref_store *refs; > > + if (getenv("USE_SP")) { > + be = &refs_be_sp; > + } else { > + be = &refs_be_files; > + } > + > if (!be) > - die("BUG: reference backend %s is unknown", be_name); > + die("BUG: reference backend %s is unknown", "files"); If we pursue this approach more than just RFC-ish we would probably not use an environment variable but have some detection logic for the different ref backends. For example Shawns refstable[1] proposal uses the configuration `extensions.refStorage == reftable`, which this proposal could reuse and set to 'indexed_superproject' or 'commit_in_superproject', depending on the write semantics that we'd need to disucss (or do we want to offer 2 different superproject backends having these different semantics?) More to the code here, we could still use `find_ref_storage_backend()` depending on the env: const char *be_name = "files"; ... +if (getenv("USE_SP")) +be_name = "sp-backend"; ... and then we'd have to change the 'next ' pointer in refs_be_files (in refs/files-backend.c line ~3100) to point at the superproject backend instead of NULL. [1] https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md#Repository-format > +static int sp_read_raw_ref(struct ref_store *ref_store, > + const char *refname, struct object_id *oid, > + struct strbuf *referent, unsigned int *type) > +{ > + struct sp_ref_store *refs; > + > + refs = sp_downcast(ref_store, REF_STORE_READ, "read_raw_ref"); > + > + if (!starts_with(refname, "refs/")) { > + return refs_read_raw_ref(refs->files, refname, oid, referent, > type); > + } > + > + /* read from the superproject instead */ > + return get_superproject_gitlink_oid(refname, oid); > +} This function would need to get more sophisticated logic I would assume; the current RFC is a read-only thing, such that it may be sufficient to differentiate between refs/* and specials such as [FETCH_]HEAD, but as seen in the sp_ref_store struct definition we're already keeping a local reflog. I would assume that remotes are also local to the submodule instead of reusing the superprojects remote, for the sake of pulling in upstream changes of the submodule instead of being fully integrated. > diff --git a/submodule.c b/submodule.c > index ce511180e..1ffaeec82 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -471,6 +471,7 @@ static void prepare_submodule_repo_env_no_git_dir(struct > argv_array *out) > void prepare_submodule_repo_env(struct argv_array *out) > { > prepare_submodule_repo_env_no_git_dir(out); > + argv_array_pushf(out, "USE_SP"); > argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT, > DEFAULT_GIT_DIR_ENVIRONMENT); > } This is an easy way to enforce all submodules use the superproject refs mode, which makes it easy to make recursive commands (e.g. `git checkout --recursive` would not need to pass down the exact gitlink sha1, but could pass down the branch name or even commit-ish "HEAD^^" and the submodule stays in perfect sync (according to superproject commit-ish). Once we take this further I would imagine there is an option in the submodule that specifies what mode it is using. Not sure if we'd want to keep the environment variable to force that mode from the superproject, though. > @@ -1986,7 +1987,7 @@ void absorb_git_dir_into_superproject(const char > *prefix, > * Return 0 if successful, 1 if not (for example, if the parent > * directory is not a repo or an unrelated one). > */ > -int get_superproject_entry(const char **full_name) > +static int get_superproject_entry(const char *ref, const char **full_name, > struct object_id *oid) Making it static could be part of the prior patch? When the ref is given we use `ls-tree -r` instead of `ls-files --stage` which give similar outputs to have the parsing logic overlap below, the difference is whether the index is looked up or the given ref. > @@ -2016,9 +2017,11 @@ int get_superproject_entry(const char **full_name) > prepare_
Re: [PATCH v2 0/9] rebase -i: add config to abbreviate command names
Liam Beguin writes: > This series will add the 'rebase.abbreviateCommands' configuration > option to allow `git rebase -i` to default to the single-letter command > names when generating the todo list. > > Using single-letter command names can present two benefits. First, it > makes it easier to change the action since you only need to replace a > single character (i.e.: in vim "r" instead of > "ciw"). Second, using this with a large enough value of > 'core.abbrev' enables the lines of the todo list to remain aligned > making the files easier to read. > > Changes in V2: > - Refactor and rename 'transform_todo_ids' > - Replace SHA-1 by OID in rebase--helper.c > - Update todo list related functions to take a generic 'flags' parameter > - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands' > - Rename 'add-exec' option to 'add-exec-commands' > - Use 'strbur_read_file' instead of rewriting it > - Make 'command_to_char' return 'comment_char_line' if no single-letter > command name is defined > - Combine both tests into a single test case > - Update commit messages > > Changes in V2: > - Rename 'transform_todo_insn' to 'transform_todos' > - Fix flag name TODO_LIST_SHORTE{D,N}_IDS I've replaced this series and pushed out the result. Thanks.
Re: [RFE] Inverted sparseness (amended)
From: "Randall S. Becker" On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom: From: "Randall S. Becker" [...] If using the empty tree part doesn't pass muster (i.e. showing nothing isn't sufficient), then the narrow clone could come into play to limit what parts of the trees are widely visible, but mainly its using the grafts to cover the regulatory gap, and (for the moment) using fast-export to transfer the singleton commit / tags Oh Just remembered, there is the newish capability to fetch random blobs, so that may help. I think you hit the nail on the head pretty well. We're currently at 2.3.7, with a push to 2.15.1 this week, so I'm looking forward to trying this. My two worries are whether the empty tree is acceptable (it should be to the client, and might be to the vendor), and doing this reliably (semi-automated) so the user base does not have to worry about the gory details of doing this. The unit tests for it are undoubtedly going to give me headaches. Thanks for the advice. Islands of shallowness are a really descriptive image for what this is. So identifying that there are shoals (to extend the metaphor somewhat), will be crucial to this adventure. These islands of shallowness, however, are also concerns as described in the [Re: How hard would it be to implement sparse fetching/pulling?] thread. The matter of the security audit is important here also: I'm just thinking that even if we get a *perfectly working* partial clone/fetch/push/etc. that it would not pass a security audit. Philip says: I'd totally disagree in the sense that if we had a submodule anywhere_ in the repo that would be an independent island of code, and we are quite happy with that - we use the web of trust with the auditors for them to go check, separately, the oid of the independent portion, which may be at another site or another vendor/client. That's OK, so what's the problem here... We do the same for pinning the tips and tails of the lines of development that make for the shallowness and narrowness that create these shoals, and oxbows of development. Managing them is normal human activity, with the technical support that the Git chain provides - so much better than previous 'versioning systems' that we see regularly in engineering, with backdoor tweaks etc. The key is to ensure that there is a proper hand holding across the air gaps, such that the oids exist both sides of the gaps, and a properly built on, such that the hash chain is unbroken. It's a similar negotiation to those used for establishing web security between IP clients, so it is doable. But you are right to have concerns and suspisions to ensure that it is all tested and verified -- Philip (sorry about the poor quoting of the reply) Not having the capability would similarly cause a failure of a security audit. Cheers, Randall -- Brief whoami: NonStop&UNIX developer since approximately UNIX(421664400)/NonStop(2112884442) -- In my real life, I talk too much.
[PATCH] diff-tree: read the index so attribute checks work in bare repositories
A regression was introduced in 557a5998d (submodule: remove gitmodules_config, 2017-08-03) to how attribute processing was handled in bare repositories when running the diff-tree command. By default the attribute system will first try to read ".gitattribute" files from the working tree and then falls back to reading them from the index if there isn't a copy checked out in the worktree. Prior to 557a5998d the index was read as a side effect of the call to 'gitmodules_config()' which ensured that the index was already populated before entering the attribute subsystem. Since the call to 'gitmodules_config()' was removed the index is no longer being read so when the attribute system tries to read from the in-memory index it doesn't find any ".gitattribute" entries effectively ignoring any configured attributes. Fix this by explicitly reading the index during the setup of diff-tree. Reported-by: Ben Boeckel Signed-off-by: Brandon Williams --- This patch should fix the regression. Let me know if it doesn't solve the issue and I'll investigate some more. builtin/diff-tree.c| 1 + t/t4015-diff-whitespace.sh | 17 + 2 files changed, 18 insertions(+) diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index d66499909..cfe7d0281 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -110,6 +110,7 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix) git_config(git_diff_basic_config, NULL); /* no "diff" UI options */ init_revisions(opt, prefix); + read_cache(); opt->abbrev = 0; opt->diff = 1; opt->disable_stdin = 1; diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 559a7541a..6e061a002 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -636,6 +636,23 @@ test_expect_success 'check with space before tab in indent (diff-tree)' ' test_must_fail git diff-tree --check HEAD^ HEAD ' +test_expect_success 'check with ignored trailing whitespace attr (diff-tree)' ' + test_when_finished "git reset --hard HEAD^" && + + # Create a whitespace error that should be ignored. + echo "* -whitespace" > ".gitattributes" && + git add ".gitattributes" && + echo "trailing space -> " > "trailing-space" && + git add "trailing-space" && + git commit -m "trailing space" && + + # With a worktree diff-tree ignores the whitespace error + git diff-tree --root --check HEAD && + + # Without a worktree diff-tree still ignores the whitespace error + git -C .git diff-tree --root --check HEAD +' + test_expect_success 'check trailing whitespace (trailing-space: off)' ' git config core.whitespace "-trailing-space" && echo "foo (); " >x && -- 2.15.1.424.g9478a66081-goog
Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue
On 12/5/2017 3:54 PM, Junio C Hamano wrote: Jeff Hostetler writes: From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- sha1_file.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) The second (i.e. this) part and the third part are not yet in 'next', so it will perfectly be fine to squash these into the commits that introduces the issues that are corrected in this "fixup". The same comment applies to all the other "fixup" patches. OK thanks. I left this one as a separate commit so that we could debate the merits of the while(1) vs the gotos and keep/reject independent of the other changes. I'll squash it in the next version. Thanks Jeff
Re: [PATCH v4 1/4] Makefile: generate Perl header from template file
Dan Jacques writes: > Thanks for checking! The patch that you quoted above looks like it's from > this "v4" thread; however, the patch that you are diffing against in your > latest reply seems like it is from an earlier version. > > I believe that the $(pathsep) changes in your proposed patch are already > present in v4,... You're of course right. The patches I had in my tree are outdated. Will replace, even though I won't be merging them to 'pu' while we wait for Ævar's perl build procedure update to stabilize. Thanks.
Re: [PATCH v4 1/4] Makefile: generate Perl header from template file
Johannes Sixt wrote: > I don't know what I tested last week; most likely not the version of the > patch I quoted above. > > Today's version, with the tip at 5d7f59c391ce, is definitely bogus > with its quoting. It needs the patch below, otherwise an unquoted > semicolon may be expanded from $(pathsep). This would terminate the sed > command, of course. Thanks for checking! The patch that you quoted above looks like it's from this "v4" thread; however, the patch that you are diffing against in your latest reply seems like it is from an earlier version. I believe that the $(pathsep) changes in your proposed patch are already present in v4, having been caught by Johannes, so perhaps you were using an older version of the patch as the diffbase? That change only appeared in v4, so any older version would have had the previous incorrect quoting. I think the reason that is necessary to remove the single quotes is not because the ";" is terminating the `sed` expression, but because it's terminating the entire shell stanza, causing the shell to execute two commands: 1) {"sed" "-e" "s=@@PATHSEP@@="} 2) {"=g" "-e" "s=@@INSTLIBDIR@@=..." ...} By including the ";" within the single-tick-protected `sed` argument, the shell receives the entire block, "s=@@PATHSEP@@=;=", as a single string. Also see Johannes' explanation here: https://github.com/danjacques/git/pull/1/commits/57dc265478692ea2a395fc61fa122cb73ce58dc3 Could you apply this v4 patch series and confirm that it is working for you? Or, if I am mistaken and your patch is correctly against this "v4" patch series, let me know and I'll try and figure out what I'm missing. Thanks again! -Dan
Re: [PATCH v4 1/4] Makefile: generate Perl header from template file
Johannes Sixt writes: > Today's version, with the tip at 5d7f59c391ce, is definitely bogus > with its quoting. It needs the patch below, otherwise an unquoted > semicolon may be expanded from $(pathsep). This would terminate the sed > command, of course. Of course ;-) Somehow I was lucky that the topic as been de-queued from 'pu' for the past few days. > diff --git a/Makefile b/Makefile > index 484dc44ade..a658c8169a 100644 > --- a/Makefile > +++ b/Makefile > @@ -2071,10 +2071,10 @@ GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) > GIT-PERL-DEFINES perl/perl.mak > INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory > instlibdir` && \ > INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \ > INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \ > - sed -e 's=@@PATHSEP@@='$(pathsep)'=g' \ > - -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \ > - -e 's=@@GITEXECDIR@@='$(gitexecdir_relative_SQ)'=g' \ > - -e 's=@@PERLLIBDIR@@='$(perllibdir_relative_SQ)'=g' \ > + sed -e 's=@@PATHSEP@@=$(pathsep)=g' \ > + -e 's=@@INSTLIBDIR@@='"$$INSTLIBDIR"'=g' \ > + -e 's=@@GITEXECDIR@@=$(gitexecdir_relative_SQ)=g' \ > + -e 's=@@PERLLIBDIR@@=$(perllibdir_relative_SQ)=g' \ > $< >$@ > > .PHONY: gitweb
Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue
Jeff Hostetler writes: > + while (1) { > + if (find_pack_entry(real, &e)) > + break; > > + /* Most likely it's a loose object. */ > + if (!sha1_loose_object_info(real, oi, flags)) > + return 0; > > + /* Not a loose object; someone else may have just packed it. */ > + reprepare_packed_git(); > + if (find_pack_entry(real, &e)) > + break; > > + /* Check if it is a missing object */ > + if (fetch_if_missing && repository_format_partial_clone && > + !already_retried) { > + fetch_object(repository_format_partial_clone, real); > + already_retried = 1; > + continue; > + } > + > + return -1; > + } OK. This "infinite" loop actually never runs more than twice, thanks to "already-retried" thing. It probably makes sense to structure the code this way. I suspect that this would also have worked better with the older incarnation of the jk/fewer-pack-rescan topic where an optimization used to be after that the initial queries to packs fail to find the object, but the topic has since been updated to do its optimization check much earlier in the function, so its interaction with this topic would not be much of an issue anymore with or without this rewrite. In any case, I think this is a good change.
Re: [PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue
Jeff Hostetler writes: > From: Jeff Hostetler > > Signed-off-by: Jeff Hostetler > --- > sha1_file.c | 40 > 1 file changed, 20 insertions(+), 20 deletions(-) The second (i.e. this) part and the third part are not yet in 'next', so it will perfectly be fine to squash these into the commits that introduces the issues that are corrected in this "fixup". The same comment applies to all the other "fixup" patches. > > diff --git a/sha1_file.c b/sha1_file.c > index fc7718a..ce67f27 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1180,30 +1180,30 @@ int sha1_object_info_extended(const unsigned char > *sha1, struct object_info *oi, > } > } > > -retry: > - if (find_pack_entry(real, &e)) > - goto found_packed; > + while (1) { > + if (find_pack_entry(real, &e)) > + break; > > - /* Most likely it's a loose object. */ > - if (!sha1_loose_object_info(real, oi, flags)) > - return 0; > + /* Most likely it's a loose object. */ > + if (!sha1_loose_object_info(real, oi, flags)) > + return 0; > > - /* Not a loose object; someone else may have just packed it. */ > - reprepare_packed_git(); > - if (find_pack_entry(real, &e)) > - goto found_packed; > - > - /* Check if it is a missing object */ > - if (fetch_if_missing && repository_format_partial_clone && > - !already_retried) { > - fetch_object(repository_format_partial_clone, real); > - already_retried = 1; > - goto retry; > - } > + /* Not a loose object; someone else may have just packed it. */ > + reprepare_packed_git(); > + if (find_pack_entry(real, &e)) > + break; > > - return -1; > + /* Check if it is a missing object */ > + if (fetch_if_missing && repository_format_partial_clone && > + !already_retried) { > + fetch_object(repository_format_partial_clone, real); > + already_retried = 1; > + continue; > + } > + > + return -1; > + } > > -found_packed: > if (oi == &blank_oi) > /* >* We know that the caller doesn't actually need the
Re: [PATCH v4 1/4] Makefile: generate Perl header from template file
Am 01.12.2017 um 18:25 schrieb Johannes Sixt: > Am 01.12.2017 um 18:13 schrieb Johannes Schindelin: >> Hi Hannes, >> >> On Fri, 1 Dec 2017, Johannes Sixt wrote: >> >>> Am 29.11.2017 um 16:56 schrieb Dan Jacques: @@ -1989,6 +1986,15 @@ GIT-PERL-DEFINES: FORCE echo "$$FLAGS" >$@; \ fi +GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) GIT-PERL-DEFINES perl/perl.mak Makefile + $(QUIET_GEN)$(RM) $@ && \ + INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory instlibdir` && \ + INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \ + INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \ + sed -e 's=@@PATHSEP@@=$(pathsep)=g' \ >>> >>> This doesn't work, unfortunately. When $(pathsep) is ';', we get an >>> incomplete >>> sed expression because ';' is also a command separator in the sed >>> language. >> >> Funny, I tried this also with ';' as pathsep, and it worked in the Git >> for >> Windows SDK... > > My ancient sed vs. your modern sed, perhaps? I can check this on Monday. I don't know what I tested last week; most likely not the version of the patch I quoted above. Today's version, with the tip at 5d7f59c391ce, is definitely bogus with its quoting. It needs the patch below, otherwise an unquoted semicolon may be expanded from $(pathsep). This would terminate the sed command, of course. diff --git a/Makefile b/Makefile index 484dc44ade..a658c8169a 100644 --- a/Makefile +++ b/Makefile @@ -2071,10 +2071,10 @@ GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) GIT-PERL-DEFINES perl/perl.mak INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C perl -s --no-print-directory instlibdir` && \ INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \ INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \ - sed -e 's=@@PATHSEP@@='$(pathsep)'=g' \ - -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \ - -e 's=@@GITEXECDIR@@='$(gitexecdir_relative_SQ)'=g' \ - -e 's=@@PERLLIBDIR@@='$(perllibdir_relative_SQ)'=g' \ + sed -e 's=@@PATHSEP@@=$(pathsep)=g' \ + -e 's=@@INSTLIBDIR@@='"$$INSTLIBDIR"'=g' \ + -e 's=@@GITEXECDIR@@=$(gitexecdir_relative_SQ)=g' \ + -e 's=@@PERLLIBDIR@@=$(perllibdir_relative_SQ)=g' \ $< >$@ .PHONY: gitweb -- 2.14.2.808.g3bc32f2729
Re: How hard would it be to implement sparse fetching/pulling?
Hi, Jeff Hostetler wrote: > On 12/2/2017 1:24 PM, Philip Oakley wrote: >> From: "Jeff Hostetler" >> Sent: Friday, December 01, 2017 5:23 PM >>> Discussing this feature in the context of the defense industry >>> makes me a little nervous. (I used to be in that area.) >> >> I'm viewing the desire for codebase partitioning from a soft layering >> of risk view (perhaps a more UK than USA approach ;-) > > I'm not sure I know what this means or how the UK defense > security models/policy/procedures are different from the US, > so I can't say much here. I'm just thinking that even if we > get a *perfectly working* partial clone/fetch/push/etc. that > it would not pass a security audit. I might be wrong here > (and I'm no expert on the subject), but I think they would > push you towards a different solution architecture. I'm pretty ignorant about the defense industry, but a few more comments: - gitolite implements some features on top of git's server code that I consider to be important for security. So much so that I've been considering what it would take to remove the git-shell command from git.git and move it to the gitolite project where people would be better equipped to use it in an appropriate context - in particular, git's reachability checking code could use some hardening/improvement. In particular, think of edge cases like where someone pushes a pack with deltas referring to objects they should not be able to reach. - Anyone willing to audit git code's security wins my approval. Please, please, audit git code and report the issues you find. :) [...] > Also omitting certain trees means you now (obviously) have both missing > trees and blobs. And both need to be dynamically or batch fetched as > needed. And certain operations will need multiple round trips to fully > resolve -- fault in a tree and then fault in blobs referenced by it. For omitting trees, we will need to modify the index format, since the index has entries for all paths today. That's on the roadmap but has not been implemented yet. Thanks, Jonathan
Re: [PATCH v6 00/12] Partial clone part 2: fsck and promisors
On Tue, 5 Dec 2017 16:58:42 + Jeff Hostetler wrote: > From: Jeff Hostetler > > This is V6 of part 2 of partial clone. This assumes V6 of part 1 > is already present. This version fixes a problem in fetch-pack > observed in V5. It also contains 2 "fixup" commits that are > WIP responses to comments on V5. A note on the fix of a problem in fetch-pack observed in V5: to do this, I renamed the "no_haves" setting to "no_dependents", since this setting now has a broader effect than merely suppressing "have" lines. This setting is described in patch 7 ("introduce fetch-object: fetch one promisor object"). > Part 2 is concerned with fsck, gc, initial support for dynamic > object fetching, and tracking promisor objects. Jonathan Tan > originally developed this code. I have moved it on top of > part 1 and updated it slightly. Thanks. I checked the diff between this and V5 and it looks as I expected. ("git am -3" didn't work on the patches as e-mailed to the list, though - I had to use the one hosted at GitHub [1].) [1] https://github.com/jeffhostetler/git/tree/core/pc6_p2
Re: [WIP 1/2] submodule: refactor acquisition of superproject info
On Fri, Dec 1, 2017 at 2:50 PM, Jonathan Tan wrote: > Signed-off-by: Jonathan Tan > --- > submodule.c | 76 > + > submodule.h | 3 +++ This patch reads very similar to [1], which was a preparation part of the series "[RFC PATCH 0/4] git-status reports relation to superproject"; there we also need the same information about the superproject. I'll take a closer look and compare these two patches if it turns out we want to go this way long term. [1] https://public-inbox.org/git/20171108195509.7839-3-sbel...@google.com/
Re: [PATCH v1 1/2] t/README: remove mention of adding copyright notices
Hi, Thomas Gummerer wrote: > We generally no longer include copyright notices in new test scripts. > However t/README still mentions it as something to include at the top of > every new script. Where can I read more about this change? Was it a deliberate change or something that simply happened? Thanks, Jonathan
Re: gitattributes not read for diff-tree anymore in 2.15?
On Tue, Dec 05, 2017 at 10:16:45 -0800, Brandon Williams wrote: > Perfect, thanks! OK, attached is a shell script which recreates the issue. I haven't been able to get it to happen without the `GIT_WORK_TREE` and `GIT_INDEX_FILE` setup involved, so that seems to be important. I reran the bisect using this script and came up with this commit: commit be4ca290570f9173db64ea1f925b5b3831c6efed Author: David Turner Date: Thu Apr 20 16:41:18 2017 -0400 Increase core.packedGitLimit which seems even less relevant… Thanks, --Ben diff-tree-break.sh Description: Bourne shell script
Re: [PATCH v1 2/2] t/README: document test_cmp_rev
Hi, Thomas Gummerer wrote: > test_cmp_rev is a useful function that's used in quite a few test > scripts. It is however not documented in t/README. Document it. > > Signed-off-by: Thomas Gummerer > --- > t/README | 5 + > 1 file changed, 5 insertions(+) Reviewed-by: Jonathan Nieder I admit I usually go straight to t/test-lib-functions.sh when I want to find an appropriate helper. I think this kind of introductory documentation in t/README is still useful, though. Thanks, Jonathan
RE: [RFE] Inverted sparseness (amended)
On December 3, 2017 6:14 PM, Philip Oakley wrote a nugget of wisdom: >From: "Randall S. Becker" >Sent: Friday, December 01, 2017 6:31 PM >> On December 1, 2017 1:19 PM, Jeff Hostetler wrote: >>>On 12/1/2017 12:21 PM, Randall S. Becker wrote: I recently encountered a really strange use-case relating to sparse clone/fetch that is really backwards from the discussion that has been going on, and well, I'm a bit embarrassed to bring it up, but I have no good solution including building a separate data store that will end up inconsistent with repositories (a bad solution). The use-case is as follows: Given a backbone of multiple git repositories spread across an organization with a server farm and upstream vendors. The vendor delivers code by having the client perform git pull into a specific branch. The customer may take the code as is or merge in customizations. The vendor wants to know exactly what commit of theirs is installed on each server, in near real time. The customer is willing to push the commit-ish to the vendor's upstream repo but does not want, by default, to share the actual commit contents for security reasons. Realistically, the vendor needs to know that their own commit id was put somewhere (process exists to track this, so not part of the use-case) and whether there is a subsequent commit contributed >by the customer, but the content is not relevant initially. After some time, the vendor may request the commit contents from the customer in order to satisfy support requirements - a.k.a. a defect was found but has to be resolved. The customer would then perform a deeper push that looks a lot like a "slightly" symmetrical operation of a deep fetch following a prior sparse fetch to supply the vendor with the specific commit(s). >> >>>Perhaps I'm not understanding the subtleties of what you're >>>describing, but could you do this with stock git functionality. >> >>>Let the vendor publish a "well known branch" for the client. >>>Let the client pull that and build. >>>Let the client create a branch set to the same commit that they fetched. >>>Let the client push that branch as a client-specific branch to the >>>vendor to indicate that that is the official release they are based on. >> >>>Then the vendor would know the official commit that the client was using. >> This is the easy part, and it doesn't require anything sparse to exist. >> >>>If the client makes local changes, does the vendor really need the >>>SHA of those -- without the actual content? >>>I mean any SHA would do right? Perhaps let the client create a >>>second client-specific branch (set to the same commit as the first) >>>to indicate they had mods. >>>Later, when the vendor needs the actual client changes, the client >>>does a normal push to this 2nd client-specific branch at the vendor. >>>This would send everything that the client has done to the code since >>>the official release. >> >> What I should have added to the use-case was that there is a strong >> audit requirement (regulatory, actually) involved that the SHA is >> exact, immutable, and cannot be substitute or forged (one of the >> reasons git is in such high regard). So, no I can't arrange a fake >> SHA to represent a SHA to be named later. It SHA of the installed >> commit is part of the official record of what happened on the specific >> server, so I'm stuck with it. >> >>>I'm not sure what you mean about "it is inside a tree". >> >> m---a---b---c---H1 >> `---d---H2 >> >> d would be at a head. b would be inside. Determining content of c is >> problematic if b is sparse, so I'm really unsure that any of this is >> possible. >I think I get the jist of your use case. Would I be right that you >don't have a true working solution yet? i.e. that it's a problem that is >almost sorted but falls down at the last step. >If one pretended that this was a single development shop, and the >various vendors, clients and customers as being independent devolopers, >each of whom is over protective of their code, it may give a better view that >maps onto classic feature development diagrams. >(i.e draw the answer for local devs, then mark where the splits happen) >In particular, I think you could use a notional regulator's view that >the whole code base is part of a large Git heirarchy of branches and >merges, and that some of the feature loops are only available via the >particular developer that worked on that feature. >This would mean that from a regulatory overview there is a merge commit in the >'main' >(master) heirachy that has the main and feature commits listed, and the >feature commit is probably an --allow-empty commit (that has an empty >tree if they are that paranoid) that says 'function X released' (and >probably tagged), and that release commit then has, as its parent, t
Re: [PATCH v2] pathspec: only match across submodule boundaries when requested
Brandon Williams writes: > Commit 74ed43711fd (grep: enable recurse-submodules to work on > objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to > match across submodule boundaries in the presence of wildcards. This is > done by performing literal matching up to the first wildcard and then > punting to the submodule itself to perform more accurate pattern > matching. Instead of introducing a new flag to request this behavior, > commit 74ed43711fd overloaded the already existing 'recursive' flag in > 'struct pathspec' to request this behavior. > > This leads to a bug where whenever any other caller has the 'recursive' > flag set as well as a pathspec with wildcards that all submodules will > be indicated as matches. One simple example of this is: > > git init repo > cd repo > > git init submodule > git -C submodule commit -m initial --allow-empty > > touch "[bracket]" > git add "[bracket]" > git commit -m bracket > git add submodule > git commit -m submodule > > git rev-list HEAD -- "[bracket]" > > Fix this by introducing the new flag 'recurse_submodules' in 'struct > pathspec' and using this flag to determine if matches should be allowed > to cross submodule boundaries. Makes sense. I initially misread the title "pathspec: only match across submodule boundaries when requested" to be saying that the match happens only at the boundary, but that "only" is not about where the match happens. "pathspec: match across submodule boundaries only when requested" would have avoided such a confusion. > diff --git a/builtin/grep.c b/builtin/grep.c > index 5a6cfe6b4..3ca4ac80d 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char > *prefix) > prefix, argv + i); > pathspec.max_depth = opt.max_depth; > pathspec.recursive = 1; > + pathspec.recurse_submodules = !!recurse_submodules; With the current code, recurse_submodules can only be 0 or 1 (the only assignment is from the return value of git_config_bool()), so the force-boolean !! is not strictly needed, but it may be a good future-proofing measure. Will queue; thanks.
Re: How hard would it be to implement sparse fetching/pulling?
On 12/2/2017 1:24 PM, Philip Oakley wrote: From: "Jeff Hostetler" Sent: Friday, December 01, 2017 5:23 PM On 11/30/2017 6:43 PM, Philip Oakley wrote: [...] Discussing this feature in the context of the defense industry makes me a little nervous. (I used to be in that area.) I'm viewing the desire for codebase partitioning from a soft layering of risk view (perhaps a more UK than USA approach ;-) I'm not sure I know what this means or how the UK defense security models/policy/procedures are different from the US, so I can't say much here. I'm just thinking that even if we get a *perfectly working* partial clone/fetch/push/etc. that it would not pass a security audit. I might be wrong here (and I'm no expert on the subject), but I think they would push you towards a different solution architecture. What we have in the code so far may be a nice start, but probably doesn't have the assurances that you would need for actual deployment. But it's a start True. I need to get some of my collegues more engaged... [...] Yes, this does tend to lead towards an always-online mentality. However, there are 2 parts: [a] dynamic object fetching for missing objects, such as during a random command like diff or blame or merge. We need this regardless of usage -- because we can't always predict (or dry-run) every command the user might run in advance. Making something "useful" happen here when off-line is an obvious goal. [b] batch fetch mode, such as using partial-fetch to match your sparse-checkout so that you always have the blobs of interest to you. And assuming you don't wander outside of this subset of the tree, you should be able to work offline as usual. If you can work within the confines of [b], you wouldn't need to always be online. I feel this is the area that does need ensure a capability to avoid any perception of the much maligned 'Embrace, extend, and extinguish'> by accidental lockout. I don't think this should be viewed as a type of sparse checkout - it's just a checkout of what you have (under the hood it could use the same code though). Right, I'm only thinking of this effort as a way to get a partial clone and fetch that omits unneeded (or, not immediately needed) objects for performance reasons. There are several use scenarios that I've discussed and sparse-checkout is one of them, but I do not consider this to be a sparse-checkout feature. [...] The main problem with markers or other lists of missing objects is that it has scale problems for large repos. Suppose I have 100M blobs in my repo. If I do a blob:none clone, I'd have 100M missing blobs that would need tracking. If I then do a batch fetch of the blobs needed to do a sparse checkout of HEAD, I'd have to remove those entries from the tracking data. Not impossible, but not speedy either. ** Ahhh. I see. That's a consequence of having all the trees isn't it. ** I've always thought that limiting the trees is at the heart of the Narrow clone/fetch problem. OK so if you have flat, wide structures with 10k files/directories per tree then it's still a fair sized problem, but it should *scale logarithmically* for the part of the tree structure that's not being downloaded. You never have to add a marker for a blob that you have no containing tree for. Nor for the tree that contained the blob's tree, all the way up to primary line of descent to the tree of concern. All those trees are never down loaded, there are few markers (.gitNarrowTree files) for those tree stubs, certainly no 100M missing blob markers. Currently, the code only omits blobs. I want to extend the current code to have filters that also exclude unneeded trees. That will help address some of these size concerns, but there are still perf issues here. * Marking of 'missing' objects in the local object store, and on the wire. The missing objects are replaced by a place holder object, which used the same oid/sha1, but has a short fixed length, with content “GitNarrowObject ”. The chance that that string would actually have such an oid clash is the same as all other object hashes, so is a *safe* self-referential device. Again, there is a scale problem here. If I have 100M missing blobs, I can't afford to create 100M loose place holder files. Or juggle a 2GB file of missing objects on various operations. As above, I'm also trimming the trees, so in general, there would be no missing blobs, just the content of the directory one was interested in. That's not quite true if higher level trees have blob references in them that are otherwise unwanted - they may each need a marker. [Or maybe a special single 'tree-of-blobs' marker for them all thus only one marker per tree - over-thinking maybe...] Also omitting certain trees means you now (obviously) have both missing trees and blobs. And both need to be dynamically or batch fetched as needed. And certain operations will need multiple round
Re: gitattributes not read for diff-tree anymore in 2.15?
On 12/05, Ben Boeckel wrote: > On Mon, Dec 04, 2017 at 15:03:55 -0800, Brandon Williams wrote: > > Reading the attributes files should be done regardless if the gitmodules > > file is read. The gitmodules file should only come into play if you are > > dealing with submodules. > > Yeah, it doesn't seem to make sense why this commit breaks us, but there > it is *shrug*. While it doesn't make the most sense, I still wouldn't be surprised if I missed something when writing that patch that inadvertently caused an issue. > > > Do you mind providing a reproduction recipe with expected outcome vs > > actual outcome and I can take a closer look. > > I'll work on one. It isn't as simple as I thought it was :) . The setup > we do before running the checks is apparently involved as running it > from the command line is not exhibiting the difference. > > --Ben Perfect, thanks! -- Brandon Williams
[PATCH v3 9/9] t3404: add test case for abbreviated commands
Make sure the todo list ends up using single-letter command abbreviations when the rebase.abbreviateCommands is enabled. This configuration option should not change anything else. Signed-off-by: Liam Beguin --- t/t3404-rebase-interactive.sh | 22 ++ 1 file changed, 22 insertions(+) diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index 6a82d1ed876d..481a3500900d 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -1260,6 +1260,28 @@ test_expect_success 'rebase -i respects rebase.missingCommitsCheck = error' ' test B = $(git cat-file commit HEAD^ | sed -ne \$p) ' +test_expect_success 'respects rebase.abbreviateCommands with fixup, squash and exec' ' + rebase_setup_and_clean abbrevcmd && + test_commit "first" file1.txt "first line" first && + test_commit "second" file1.txt "another line" second && + test_commit "fixup! first" file2.txt "first line again" first_fixup && + test_commit "squash! second" file1.txt "another line here" second_squash && + cat >expected <<-EOF && + p $(git rev-list --abbrev-commit -1 first) first + f $(git rev-list --abbrev-commit -1 first_fixup) fixup! first + x git show HEAD + p $(git rev-list --abbrev-commit -1 second) second + s $(git rev-list --abbrev-commit -1 second_squash) squash! second + x git show HEAD + EOF + git checkout abbrevcmd && + set_cat_todo_editor && + test_config rebase.abbreviateCommands true && + test_must_fail git rebase -i --exec "git show HEAD" \ + --autosquash master >actual && + test_cmp expected actual +' + test_expect_success 'static check of bad command' ' rebase_setup_and_clean bad-cmd && set_fake_editor && -- 2.15.1.280.g10402c1f5b5c
[PATCH v3 5/9] rebase -i: replace reference to sha1 with oid
Since we are trying to abstract the hash function name elsewhere in the code base, lets use OID instead of SHA-1 in the rebase--helper too. Signed-off-by: Liam Beguin --- builtin/rebase--helper.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c index 8ad4779d1650..c3b8e4d401f8 100644 --- a/builtin/rebase--helper.c +++ b/builtin/rebase--helper.c @@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) struct replay_opts opts = REPLAY_OPTS_INIT; int keep_empty = 0; enum { - CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S, + CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS, CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH } command = 0; struct option options[] = { @@ -27,9 +27,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) OPT_CMDMODE(0, "make-script", &command, N_("make rebase script"), MAKE_SCRIPT), OPT_CMDMODE(0, "shorten-ids", &command, - N_("shorten SHA-1s in the todo list"), SHORTEN_SHA1S), + N_("shorten commit ids in the todo list"), SHORTEN_OIDS), OPT_CMDMODE(0, "expand-ids", &command, - N_("expand SHA-1s in the todo list"), EXPAND_SHA1S), + N_("expand commit ids in the todo list"), EXPAND_OIDS), OPT_CMDMODE(0, "check-todo-list", &command, N_("check the todo list"), CHECK_TODO_LIST), OPT_CMDMODE(0, "skip-unnecessary-picks", &command, @@ -54,9 +54,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) return !!sequencer_remove_state(&opts); if (command == MAKE_SCRIPT && argc > 1) return !!sequencer_make_script(keep_empty, stdout, argc, argv); - if (command == SHORTEN_SHA1S && argc == 1) + if (command == SHORTEN_OIDS && argc == 1) return !!transform_todos(1); - if (command == EXPAND_SHA1S && argc == 1) + if (command == EXPAND_OIDS && argc == 1) return !!transform_todos(0); if (command == CHECK_TODO_LIST && argc == 1) return !!check_todo_list(); -- 2.15.1.280.g10402c1f5b5c
[PATCH v2 0/9] rebase -i: add config to abbreviate command names
Hi everyone, This series will add the 'rebase.abbreviateCommands' configuration option to allow `git rebase -i` to default to the single-letter command names when generating the todo list. Using single-letter command names can present two benefits. First, it makes it easier to change the action since you only need to replace a single character (i.e.: in vim "r" instead of "ciw"). Second, using this with a large enough value of 'core.abbrev' enables the lines of the todo list to remain aligned making the files easier to read. Changes in V2: - Refactor and rename 'transform_todo_ids' - Replace SHA-1 by OID in rebase--helper.c - Update todo list related functions to take a generic 'flags' parameter - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands' - Rename 'add-exec' option to 'add-exec-commands' - Use 'strbur_read_file' instead of rewriting it - Make 'command_to_char' return 'comment_char_line' if no single-letter command name is defined - Combine both tests into a single test case - Update commit messages Changes in V2: - Rename 'transform_todo_insn' to 'transform_todos' - Fix flag name TODO_LIST_SHORTE{D,N}_IDS Liam Beguin (9): Documentation: move rebase.* configs to new file Documentation: use preferred name for the 'todo list' script rebase -i: set commit to null in exec commands rebase -i: refactor transform_todo_ids rebase -i: replace reference to sha1 with oid rebase -i: update functions to use a flags parameter rebase -i -x: add exec commands via the rebase--helper rebase -i: learn to abbreviate command names t3404: add test case for abbreviated commands Documentation/config.txt| 31 +--- Documentation/git-rebase.txt| 19 + Documentation/rebase-config.txt | 52 + builtin/rebase--helper.c| 29 +--- git-rebase--interactive.sh | 23 +- sequencer.c | 126 +--- sequencer.h | 10 ++- t/t3404-rebase-interactive.sh | 22 ++ 8 files changed, 186 insertions(+), 126 deletions(-) create mode 100644 Documentation/rebase-config.txt -- 2.15.1.280.g10402c1f5b5c
[PATCH v3 6/9] rebase -i: update functions to use a flags parameter
Update functions used in the rebase--helper so that they take a generic 'flags' parameter instead of a growing list of options. Signed-off-by: Liam Beguin --- builtin/rebase--helper.c | 13 +++-- sequencer.c | 9 + sequencer.h | 8 +--- 3 files changed, 17 insertions(+), 13 deletions(-) diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c index c3b8e4d401f8..1102ecb43b67 100644 --- a/builtin/rebase--helper.c +++ b/builtin/rebase--helper.c @@ -12,7 +12,7 @@ static const char * const builtin_rebase_helper_usage[] = { int cmd_rebase__helper(int argc, const char **argv, const char *prefix) { struct replay_opts opts = REPLAY_OPTS_INIT; - int keep_empty = 0; + unsigned flags = 0, keep_empty = 0; enum { CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS, CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH @@ -48,16 +48,17 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, NULL, options, builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0); + flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0; + flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0; + if (command == CONTINUE && argc == 1) return !!sequencer_continue(&opts); if (command == ABORT && argc == 1) return !!sequencer_remove_state(&opts); if (command == MAKE_SCRIPT && argc > 1) - return !!sequencer_make_script(keep_empty, stdout, argc, argv); - if (command == SHORTEN_OIDS && argc == 1) - return !!transform_todos(1); - if (command == EXPAND_OIDS && argc == 1) - return !!transform_todos(0); + return !!sequencer_make_script(stdout, argc, argv, flags); + if ((command == SHORTEN_OIDS || command == EXPAND_OIDS) && argc == 1) + return !!transform_todos(flags); if (command == CHECK_TODO_LIST && argc == 1) return !!check_todo_list(); if (command == SKIP_UNNECESSARY_PICKS && argc == 1) diff --git a/sequencer.c b/sequencer.c index c9a661a8c4bd..8b0dd610c881 100644 --- a/sequencer.c +++ b/sequencer.c @@ -2444,14 +2444,15 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag) strbuf_release(&sob); } -int sequencer_make_script(int keep_empty, FILE *out, - int argc, const char **argv) +int sequencer_make_script(FILE *out, int argc, const char **argv, + unsigned flags) { char *format = NULL; struct pretty_print_context pp = {0}; struct strbuf buf = STRBUF_INIT; struct rev_info revs; struct commit *commit; + int keep_empty = flags & TODO_LIST_KEEP_EMPTY; init_revisions(&revs, NULL); revs.verbose_header = 1; @@ -2494,7 +2495,7 @@ int sequencer_make_script(int keep_empty, FILE *out, } -int transform_todos(int shorten_ids) +int transform_todos(unsigned flags) { const char *todo_file = rebase_path_todo(); struct todo_list todo_list = TODO_LIST_INIT; @@ -2522,7 +2523,7 @@ int transform_todos(int shorten_ids) /* add commit id */ if (item->commit) { - const char *oid = shorten_ids ? + const char *oid = flags & TODO_LIST_SHORTEN_IDS ? short_commit_name(item->commit) : oid_to_hex(&item->commit->object.oid); diff --git a/sequencer.h b/sequencer.h index 4f7f2c93f83e..68284e9762c8 100644 --- a/sequencer.h +++ b/sequencer.h @@ -45,10 +45,12 @@ int sequencer_continue(struct replay_opts *opts); int sequencer_rollback(struct replay_opts *opts); int sequencer_remove_state(struct replay_opts *opts); -int sequencer_make_script(int keep_empty, FILE *out, - int argc, const char **argv); +#define TODO_LIST_KEEP_EMPTY (1U << 0) +#define TODO_LIST_SHORTEN_IDS (1U << 1) +int sequencer_make_script(FILE *out, int argc, const char **argv, + unsigned flags); -int transform_todos(int shorten_ids); +int transform_todos(unsigned flags); int check_todo_list(void); int skip_unnecessary_picks(void); int rearrange_squash(void); -- 2.15.1.280.g10402c1f5b5c
[PATCH v3 7/9] rebase -i -x: add exec commands via the rebase--helper
Recent work on `git-rebase--interactive` aims to convert shell code to C. Even if this is most likely not a big performance enhancement, let's convert it too since a coming change to abbreviate command names requires it to be updated. Signed-off-by: Liam Beguin --- builtin/rebase--helper.c | 7 ++- git-rebase--interactive.sh | 23 +- sequencer.c| 39 ++ sequencer.h| 1 + 4 files changed, 47 insertions(+), 23 deletions(-) diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c index 1102ecb43b67..4229ea0dc122 100644 --- a/builtin/rebase--helper.c +++ b/builtin/rebase--helper.c @@ -15,7 +15,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) unsigned flags = 0, keep_empty = 0; enum { CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS, - CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH + CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH, + ADD_EXEC } command = 0; struct option options[] = { OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")), @@ -36,6 +37,8 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS), OPT_CMDMODE(0, "rearrange-squash", &command, N_("rearrange fixup/squash lines"), REARRANGE_SQUASH), + OPT_CMDMODE(0, "add-exec-commands", &command, + N_("insert exec commands in todo list"), ADD_EXEC), OPT_END() }; @@ -65,5 +68,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) return !!skip_unnecessary_picks(); if (command == REARRANGE_SQUASH && argc == 1) return !!rearrange_squash(); + if (command == ADD_EXEC && argc == 2) + return !!sequencer_add_exec_commands(argv[1]); usage_with_options(builtin_rebase_helper_usage, options); } diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 437815669f00..e3f5a0abf3c7 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -722,27 +722,6 @@ collapse_todo_ids() { git rebase--helper --shorten-ids } -# Add commands after a pick or after a squash/fixup series -# in the todo list. -add_exec_commands () { - { - first=t - while read -r insn rest - do - case $insn in - pick) - test -n "$first" || - printf "%s" "$cmd" - ;; - esac - printf "%s %s\n" "$insn" "$rest" - first= - done - printf "%s" "$cmd" - } <"$1" >"$1.new" && - mv "$1.new" "$1" -} - # Switch to the branch in $into and notify it in the reflog checkout_onto () { GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name" @@ -982,7 +961,7 @@ fi test -s "$todo" || echo noop >> "$todo" test -z "$autosquash" || git rebase--helper --rearrange-squash || exit -test -n "$cmd" && add_exec_commands "$todo" +test -n "$cmd" && git rebase--helper --add-exec-commands "$cmd" todocount=$(git stripspace --strip-comments <"$todo" | wc -l) todocount=${todocount##* } diff --git a/sequencer.c b/sequencer.c index 8b0dd610c881..892d242f6966 100644 --- a/sequencer.c +++ b/sequencer.c @@ -2494,6 +2494,45 @@ int sequencer_make_script(FILE *out, int argc, const char **argv, return 0; } +/* + * Add commands after pick and (series of) squash/fixup commands + * in the todo list. + */ +int sequencer_add_exec_commands(const char *commands) +{ + const char *todo_file = rebase_path_todo(); + struct todo_list todo_list = TODO_LIST_INIT; + struct todo_item *item; + struct strbuf *buf = &todo_list.buf; + size_t offset = 0, commands_len = strlen(commands); + int i, first; + + if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0) + return error(_("could not read '%s'."), todo_file); + + if (parse_insn_buffer(todo_list.buf.buf, &todo_list)) { + todo_list_release(&todo_list); + return error(_("unusable todo list: '%s'"), todo_file); + } + + first = 1; + /* insert before every pick except the first one */ + for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) { + if (item->command == TODO_PICK && !first) { + strbuf_insert(buf, item->offset_in_buf + offset, + commands, commands_len); + offset += commands_len; + } + first = 0; + } + + /* append final */ + st
[PATCH v3 4/9] rebase -i: refactor transform_todo_ids
The transform_todo_ids function is a little hard to read. Lets try to make it easier by using more of the strbuf API. Also, since we'll soon be adding command abbreviations, let's rename the function so it's name reflects that change. Signed-off-by: Liam Beguin --- builtin/rebase--helper.c | 4 +-- sequencer.c | 69 sequencer.h | 2 +- 3 files changed, 31 insertions(+), 44 deletions(-) diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c index f8519363a393..8ad4779d1650 100644 --- a/builtin/rebase--helper.c +++ b/builtin/rebase--helper.c @@ -55,9 +55,9 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) if (command == MAKE_SCRIPT && argc > 1) return !!sequencer_make_script(keep_empty, stdout, argc, argv); if (command == SHORTEN_SHA1S && argc == 1) - return !!transform_todo_ids(1); + return !!transform_todos(1); if (command == EXPAND_SHA1S && argc == 1) - return !!transform_todo_ids(0); + return !!transform_todos(0); if (command == CHECK_TODO_LIST && argc == 1) return !!check_todo_list(); if (command == SKIP_UNNECESSARY_PICKS && argc == 1) diff --git a/sequencer.c b/sequencer.c index 5033b049d995..c9a661a8c4bd 100644 --- a/sequencer.c +++ b/sequencer.c @@ -2494,60 +2494,47 @@ int sequencer_make_script(int keep_empty, FILE *out, } -int transform_todo_ids(int shorten_ids) +int transform_todos(int shorten_ids) { const char *todo_file = rebase_path_todo(); struct todo_list todo_list = TODO_LIST_INIT; - int fd, res, i; - FILE *out; + struct strbuf buf = STRBUF_INIT; + struct todo_item *item; + int i; - strbuf_reset(&todo_list.buf); - fd = open(todo_file, O_RDONLY); - if (fd < 0) - return error_errno(_("could not open '%s'"), todo_file); - if (strbuf_read(&todo_list.buf, fd, 0) < 0) { - close(fd); + if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0) return error(_("could not read '%s'."), todo_file); - } - close(fd); - res = parse_insn_buffer(todo_list.buf.buf, &todo_list); - if (res) { + if (parse_insn_buffer(todo_list.buf.buf, &todo_list)) { todo_list_release(&todo_list); return error(_("unusable todo list: '%s'"), todo_file); } - out = fopen(todo_file, "w"); - if (!out) { - todo_list_release(&todo_list); - return error(_("unable to open '%s' for writing"), todo_file); - } - for (i = 0; i < todo_list.nr; i++) { - struct todo_item *item = todo_list.items + i; - int bol = item->offset_in_buf; - const char *p = todo_list.buf.buf + bol; - int eol = i + 1 < todo_list.nr ? - todo_list.items[i + 1].offset_in_buf : - todo_list.buf.len; - - if (item->command >= TODO_EXEC && item->command != TODO_DROP) - fwrite(p, eol - bol, 1, out); - else { - const char *id = shorten_ids ? - short_commit_name(item->commit) : - oid_to_hex(&item->commit->object.oid); - int len; - - p += strspn(p, " \t"); /* left-trim command */ - len = strcspn(p, " \t"); /* length of command */ - - fprintf(out, "%.*s %s %.*s\n", - len, p, id, item->arg_len, item->arg); + for (item = todo_list.items, i = 0; i < todo_list.nr; i++, item++) { + /* if the item is not a command write it and continue */ + if (item->command >= TODO_COMMENT) { + strbuf_addf(&buf, "%.*s\n", item->arg_len, item->arg); + continue; + } + + /* add command to the buffer */ + strbuf_addstr(&buf, command_to_string(item->command)); + + /* add commit id */ + if (item->commit) { + const char *oid = shorten_ids ? + short_commit_name(item->commit) : + oid_to_hex(&item->commit->object.oid); + + strbuf_addf(&buf, " %s", oid); } + /* add all the rest */ + strbuf_addf(&buf, " %.*s\n", item->arg_len, item->arg); } - fclose(out); + + i = write_message(buf.buf, buf.len, todo_file, 0); todo_list_release(&todo_list); - return 0; + return i; } enum check_level { diff --git a/sequencer.h b/sequencer.h index 6f3d3df82c0a..4f7f2c93f83e 100644 --- a/sequencer.h +++ b/sequencer.h @@ -48,7 +48,7 @@ int sequencer_remove_state(
[PATCH v3 2/9] Documentation: use preferred name for the 'todo list' script
Use "todo list" instead of "instruction list" or "todo-list" to reduce further confusion regarding the name of this script. Signed-off-by: Liam Beguin --- Documentation/rebase-config.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt index dba088d7c68f..30ae08cb5a4b 100644 --- a/Documentation/rebase-config.txt +++ b/Documentation/rebase-config.txt @@ -23,10 +23,10 @@ rebase.missingCommitsCheck:: --edit-todo' can then be used to correct the error. If set to "ignore", no checking is done. To drop a commit without warning or error, use the `drop` - command in the todo-list. + command in the todo list. Defaults to "ignore". rebase.instructionFormat:: A format string, as specified in linkgit:git-log[1], to be used for the - instruction list during an interactive rebase. The format will + todo list during an interactive rebase. The format will automatically have the long commit hash prepended to the format. -- 2.15.1.280.g10402c1f5b5c
[PATCH v3 8/9] rebase -i: learn to abbreviate command names
`git rebase -i` already know how to interpret single-letter command names. Teach it to generate the todo list with these same abbreviated names. Based-on-patch-by: Johannes Schindelin Signed-off-by: Liam Beguin --- Documentation/rebase-config.txt | 20 builtin/rebase--helper.c| 3 +++ sequencer.c | 16 ++-- sequencer.h | 1 + 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt index 30ae08cb5a4b..42e1ba757564 100644 --- a/Documentation/rebase-config.txt +++ b/Documentation/rebase-config.txt @@ -30,3 +30,23 @@ rebase.instructionFormat:: A format string, as specified in linkgit:git-log[1], to be used for the todo list during an interactive rebase. The format will automatically have the long commit hash prepended to the format. + +rebase.abbreviateCommands:: + If set to true, `git rebase` will use abbreviated command names in the + todo list resulting in something like this: ++ +--- + p deadbee The oneline of the commit + p fa1afe1 The oneline of the next commit + ... +--- ++ +instead of: ++ +--- + pick deadbee The oneline of the commit + pick fa1afe1 The oneline of the next commit + ... +--- ++ +Defaults to false. diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c index 4229ea0dc122..7daee544b7b4 100644 --- a/builtin/rebase--helper.c +++ b/builtin/rebase--helper.c @@ -13,6 +13,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) { struct replay_opts opts = REPLAY_OPTS_INIT; unsigned flags = 0, keep_empty = 0; + int abbreviate_commands = 0; enum { CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS, CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH, @@ -43,6 +44,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) }; git_config(git_default_config, NULL); + git_config_get_bool("rebase.abbreviatecommands", &abbreviate_commands); opts.action = REPLAY_INTERACTIVE_REBASE; opts.allow_ff = 1; @@ -52,6 +54,7 @@ int cmd_rebase__helper(int argc, const char **argv, const char *prefix) builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0); flags |= keep_empty ? TODO_LIST_KEEP_EMPTY : 0; + flags |= abbreviate_commands ? TODO_LIST_ABBREVIATE_CMDS : 0; flags |= command == SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0; if (command == CONTINUE && argc == 1) diff --git a/sequencer.c b/sequencer.c index 892d242f6966..115085d39ca8 100644 --- a/sequencer.c +++ b/sequencer.c @@ -795,6 +795,13 @@ static const char *command_to_string(const enum todo_command command) die("Unknown command: %d", command); } +static const char command_to_char(const enum todo_command command) +{ + if (command < TODO_COMMENT && todo_command_info[command].c) + return todo_command_info[command].c; + return comment_line_char; +} + static int is_noop(const enum todo_command command) { return TODO_NOOP <= command; @@ -2453,6 +2460,7 @@ int sequencer_make_script(FILE *out, int argc, const char **argv, struct rev_info revs; struct commit *commit; int keep_empty = flags & TODO_LIST_KEEP_EMPTY; + const char *insn = flags & TODO_LIST_ABBREVIATE_CMDS ? "p" : "pick"; init_revisions(&revs, NULL); revs.verbose_header = 1; @@ -2485,7 +2493,8 @@ int sequencer_make_script(FILE *out, int argc, const char **argv, strbuf_reset(&buf); if (!keep_empty && is_original_commit_empty(commit)) strbuf_addf(&buf, "%c ", comment_line_char); - strbuf_addf(&buf, "pick %s ", oid_to_hex(&commit->object.oid)); + strbuf_addf(&buf, "%s %s ", insn, + oid_to_hex(&commit->object.oid)); pretty_print_commit(&pp, commit, &buf); strbuf_addch(&buf, '\n'); fputs(buf.buf, out); @@ -2558,7 +2567,10 @@ int transform_todos(unsigned flags) } /* add command to the buffer */ - strbuf_addstr(&buf, command_to_string(item->command)); + if (flags & TODO_LIST_ABBREVIATE_CMDS) + strbuf_addch(&buf, command_to_char(item->command)); + else + strbuf_addstr(&buf, command_to_string(item->command)); /* add commit id */ if (item->commit) { diff --git a/sequencer.h b/sequencer.h index 212426c44548..81f6d7d393fd 100644 --- a/sequencer.h +++ b/sequencer.h @@ -47,6 +47,7 @@ int sequencer_remove_state(s
[PATCH v3 1/9] Documentation: move rebase.* configs to new file
Move all rebase.* configuration variables to a separate file in order to remove duplicates, and include it in config.txt and git-rebase.txt. The new descriptions are mostly taken from config.txt as they are more verbose. Signed-off-by: Liam Beguin --- Documentation/config.txt| 31 +-- Documentation/git-rebase.txt| 19 +-- Documentation/rebase-config.txt | 32 3 files changed, 34 insertions(+), 48 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 531649cb40ea..e424b7de90b5 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2691,36 +2691,7 @@ push.recurseSubmodules:: is retained. You may override this configuration at time of push by specifying '--recurse-submodules=check|on-demand|no'. -rebase.stat:: - Whether to show a diffstat of what changed upstream since the last - rebase. False by default. - -rebase.autoSquash:: - If set to true enable `--autosquash` option by default. - -rebase.autoStash:: - When set to true, automatically create a temporary stash entry - before the operation begins, and apply it after the operation - ends. This means that you can run rebase on a dirty worktree. - However, use with care: the final stash application after a - successful rebase might result in non-trivial conflicts. - Defaults to false. - -rebase.missingCommitsCheck:: - If set to "warn", git rebase -i will print a warning if some - commits are removed (e.g. a line was deleted), however the - rebase will still proceed. If set to "error", it will print - the previous warning and stop the rebase, 'git rebase - --edit-todo' can then be used to correct the error. If set to - "ignore", no checking is done. - To drop a commit without warning or error, use the `drop` - command in the todo-list. - Defaults to "ignore". - -rebase.instructionFormat:: - A format string, as specified in linkgit:git-log[1], to be used for - the instruction list during an interactive rebase. The format will automatically - have the long commit hash prepended to the format. +include::rebase-config.txt[] receive.advertiseAtomic:: By default, git-receive-pack will advertise the atomic push diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 3cedfb0fd22b..8a861c1e0d69 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -203,24 +203,7 @@ Alternatively, you can undo the 'git rebase' with CONFIGURATION - -rebase.stat:: - Whether to show a diffstat of what changed upstream since the last - rebase. False by default. - -rebase.autoSquash:: - If set to true enable `--autosquash` option by default. - -rebase.autoStash:: - If set to true enable `--autostash` option by default. - -rebase.missingCommitsCheck:: - If set to "warn", print warnings about removed commits in - interactive mode. If set to "error", print the warnings and - stop the rebase. If set to "ignore", no checking is - done. "ignore" by default. - -rebase.instructionFormat:: - Custom commit list format to use during an `--interactive` rebase. +include::rebase-config.txt[] OPTIONS --- diff --git a/Documentation/rebase-config.txt b/Documentation/rebase-config.txt new file mode 100644 index ..dba088d7c68f --- /dev/null +++ b/Documentation/rebase-config.txt @@ -0,0 +1,32 @@ +rebase.stat:: + Whether to show a diffstat of what changed upstream since the last + rebase. False by default. + +rebase.autoSquash:: + If set to true enable `--autosquash` option by default. + +rebase.autoStash:: + When set to true, automatically create a temporary stash entry + before the operation begins, and apply it after the operation + ends. This means that you can run rebase on a dirty worktree. + However, use with care: the final stash application after a + successful rebase might result in non-trivial conflicts. + This option can be overridden by the `--no-autostash` and + `--autostash` options of linkgit:git-rebase[1]. + Defaults to false. + +rebase.missingCommitsCheck:: + If set to "warn", git rebase -i will print a warning if some + commits are removed (e.g. a line was deleted), however the + rebase will still proceed. If set to "error", it will print + the previous warning and stop the rebase, 'git rebase + --edit-todo' can then be used to correct the error. If set to + "ignore", no checking is done. + To drop a commit without warning or error, use the `drop` + command in the todo-list. + Defaults to "ignore". + +rebase.instructionFormat:: + A format string, as specified in linkgit:git-log[1], to be used for the + instruction list during an int
[PATCH v3 3/9] rebase -i: set commit to null in exec commands
Make sure commit is set to NULL when parsing exec instructions from the todo list. If not, we may try to access an uninitialized address later while updating the todo list. Signed-off-by: Liam Beguin --- sequencer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sequencer.c b/sequencer.c index fa94ed652d2c..5033b049d995 100644 --- a/sequencer.c +++ b/sequencer.c @@ -1268,6 +1268,7 @@ static int parse_insn_line(struct todo_item *item, const char *bol, char *eol) bol += padding; if (item->command == TODO_EXEC) { + item->commit = NULL; item->arg = bol; item->arg_len = (int)(eol - bol); return 0; -- 2.15.1.280.g10402c1f5b5c
Re: [PATCH] Documentation/git-clone: improve description for submodule recursing
Stefan Beller writes: > There have been a few complaints on the mailing list that git-clone doesn't > respect the `submodule.recurse` setting, which every other command (that > potentially knows how to deal with submodules) respects. In case of clone > this is not beneficial to respect as the user may not want to obtain all > submodules (assuming a pathspec of '.'). > > Improve the documentation such that the pathspec is mentioned in the > synopsis to alleviate the confusion around the submodule recursion flag > in git-clone. > > While at it clarify that the option can be given multiple times for complex\ > pathspecs. Well written (modulo the backslash there, which I can easily remove while queuing). > > Signed-off-by: Stefan Beller > --- > Documentation/git-clone.txt | 19 +++ > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt > index 83c8e9b394..42ca7b5095 100644 > --- a/Documentation/git-clone.txt > +++ b/Documentation/git-clone.txt > @@ -14,7 +14,7 @@ SYNOPSIS > [-o ] [-b ] [-u ] [--reference ] > [--dissociate] [--separate-git-dir ] > [--depth ] [--[no-]single-branch] [--no-tags] > - [--recurse-submodules] [--[no-]shallow-submodules] > + [--recurse-submodules[=]] [--[no-]shallow-submodules] > [--jobs ] [--] [] > > DESCRIPTION > @@ -231,14 +231,17 @@ branch of some repository for search indexing. > After the clone is created, initialize and clone submodules > within based on the provided pathspec. If no pathspec is > provided, all submodules are initialized and cloned. > - Submodules are initialized and cloned using their default > - settings. The resulting clone has `submodule.active` set to > + This option can be given multiple times for pathspecs consisting > + of multiple entries. The resulting clone has `submodule.active` set to > the provided pathspec, or "." (meaning all submodules) if no > - pathspec is provided. This is equivalent to running > - `git submodule update --init --recursive` immediately after > - the clone is finished. This option is ignored if the cloned > - repository does not have a worktree/checkout (i.e. if any of > - `--no-checkout`/`-n`, `--bare`, or `--mirror` is given) > + pathspec is provided. > ++ > +Submodules are initialized and cloned using their default settings. This is > +equivalent to running > +`git submodule update --init --recursive ` immediately after > +the clone is finished. This option is ignored if the cloned repository does > +not have a worktree/checkout (i.e. if any of `--no-checkout`/`-n`, `--bare`, > +or `--mirror` is given) > > --[no-]shallow-submodules:: > All submodules which are cloned will be shallow with a depth of 1.
[PATCH v6 01/14] upload-pack: add object filtering for partial clone
From: Jeff Hostetler Teach upload-pack to negotiate object filtering over the protocol and to send filter parameters to pack-objects. This is intended for partial clone and fetch. The idea to make upload-pack configurable using uploadpack.allowFilter comes from Jonathan Tan's work in [1]. [1] https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathanta...@google.com/ Signed-off-by: Jeff Hostetler --- Documentation/config.txt | 4 Documentation/technical/pack-protocol.txt | 8 +++ Documentation/technical/protocol-capabilities.txt | 8 +++ upload-pack.c | 26 ++- 4 files changed, 45 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1ac0ae6..e528210 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3268,6 +3268,10 @@ uploadpack.packObjectsHook:: was run. I.e., `upload-pack` will feed input intended for `pack-objects` to the hook, and expects a completed packfile on stdout. + +uploadpack.allowFilter:: + If this option is set, `upload-pack` will advertise partial + clone and partial fetch object filtering. + Note that this configuration variable is ignored if it is seen in the repository-level config (this is a safety measure against fetching from diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index ed1eae8..a43a113 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -212,6 +212,7 @@ out of what the server said it could do with the first 'want' line. upload-request= want-list *shallow-line *1depth-request + [filter-request] flush-pkt want-list = first-want @@ -227,6 +228,8 @@ out of what the server said it could do with the first 'want' line. additional-want = PKT-LINE("want" SP obj-id) depth = 1*DIGIT + + filter-request= PKT-LINE("filter" SP filter-spec) Clients MUST send all the obj-ids it wants from the reference @@ -249,6 +252,11 @@ complete those commits. Commits whose parents are not received as a result are defined as shallow and marked as such in the server. This information is sent back to the client in the next step. +The client can optionally request that pack-objects omit various +objects from the packfile using one of several filtering techniques. +These are intended for use with partial clone and partial fetch +operations. See `rev-list` for possible "filter-spec" values. + Once all the 'want's and 'shallow's (and optional 'deepen') are transferred, clients MUST send a flush-pkt, to tell the server side that it is done sending the list. diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt index 26dcc6f..332d209 100644 --- a/Documentation/technical/protocol-capabilities.txt +++ b/Documentation/technical/protocol-capabilities.txt @@ -309,3 +309,11 @@ to accept a signed push certificate, and asks the to be included in the push certificate. A send-pack client MUST NOT send a push-cert packet unless the receive-pack server advertises this capability. + +filter +-- + +If the upload-pack server advertises the 'filter' capability, +fetch-pack may send "filter" commands to request a partial clone +or partial fetch and request that the server omit various objects +from the packfile. diff --git a/upload-pack.c b/upload-pack.c index e25f725..e6d38b9 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -10,6 +10,8 @@ #include "diff.h" #include "revision.h" #include "list-objects.h" +#include "list-objects-filter.h" +#include "list-objects-filter-options.h" #include "run-command.h" #include "connect.h" #include "sigchain.h" @@ -18,6 +20,7 @@ #include "parse-options.h" #include "argv-array.h" #include "prio-queue.h" +#include "quote.h" static const char * const upload_pack_usage[] = { N_("git upload-pack [] "), @@ -64,6 +67,10 @@ static int advertise_refs; static int stateless_rpc; static const char *pack_objects_hook; +static int filter_capability_requested; +static int filter_advertise; +static struct list_objects_filter_options filter_options; + static void reset_timeout(void) { alarm(timeout); @@ -131,6 +138,12 @@ static void create_pack_file(void) argv_array_push(&pack_objects.args, "--delta-base-offset"); if (use_include_tag) argv_array_push(&pack_objects.args, "--include-tag"); + if (filter_options.filter_spec) { + struct strbuf buf = STRBUF_INIT; + sq_quote_buf(&buf, filter_options.filter_spec); + argv_array_pushf(&pack_objects.args, "--filter=%s", buf.buf); + strbuf_release(&buf);
[PATCH v6 04/14] fetch-pack: test support excluding large blobs
From: Jonathan Tan Created tests to verify fetch-pack and upload-pack support for excluding large blobs using --filter=blobs:limit= parameter. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- t/t5500-fetch-pack.sh | 27 +++ upload-pack.c | 13 + 2 files changed, 36 insertions(+), 4 deletions(-) diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 80a1a32..c57916b 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -755,4 +755,31 @@ test_expect_success 'fetching deepen' ' ) ' +test_expect_success 'filtering by size' ' + rm -rf server client && + test_create_repo server && + test_commit -C server one && + test_config -C server uploadpack.allowfilter 1 && + + test_create_repo client && + git -C client fetch-pack --filter=blob:limit=0 ../server HEAD && + + # Ensure that object is not inadvertently fetched + test_must_fail git -C client cat-file -e $(git hash-object server/one.t) +' + +test_expect_success 'filtering by size has no effect if support for it is not advertised' ' + rm -rf server client && + test_create_repo server && + test_commit -C server one && + + test_create_repo client && + git -C client fetch-pack --filter=blob:limit=0 ../server HEAD 2> err && + + # Ensure that object is fetched + git -C client cat-file -e $(git hash-object server/one.t) && + + test_i18ngrep "filtering not recognized by server" err +' + test_done diff --git a/upload-pack.c b/upload-pack.c index e6d38b9..15b6605 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -139,10 +139,15 @@ static void create_pack_file(void) if (use_include_tag) argv_array_push(&pack_objects.args, "--include-tag"); if (filter_options.filter_spec) { - struct strbuf buf = STRBUF_INIT; - sq_quote_buf(&buf, filter_options.filter_spec); - argv_array_pushf(&pack_objects.args, "--filter=%s", buf.buf); - strbuf_release(&buf); + if (pack_objects.use_shell) { + struct strbuf buf = STRBUF_INIT; + sq_quote_buf(&buf, filter_options.filter_spec); + argv_array_pushf(&pack_objects.args, "--filter=%s", buf.buf); + strbuf_release(&buf); + } else { + argv_array_pushf(&pack_objects.args, "--filter=%s", +filter_options.filter_spec); + } } pack_objects.in = -1; -- 2.9.3
[PATCH v6 08/14] fixup: fetch: update --blob-max-bytes to --fitler
From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- t/t5500-fetch-pack.sh | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 65fc7bb..ec9ba9b 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -782,7 +782,7 @@ test_expect_success 'filtering by size has no effect if support for it is not ad test_i18ngrep "filtering not recognized by server" err ' -fetch_blob_max_bytes () { +fetch_filter_blob_limit_zero () { SERVER="$1" URL="$2" @@ -804,15 +804,15 @@ fetch_blob_max_bytes () { test_must_fail git -C client cat-file -e $(git hash-object "$SERVER/two.t") } -test_expect_success 'fetch with --blob-max-bytes' ' - fetch_blob_max_bytes server server +test_expect_success 'fetch with --filter=blob:limit=0' ' + fetch_filter_blob_limit_zero server server ' . "$TEST_DIRECTORY"/lib-httpd.sh start_httpd -test_expect_success 'fetch with --blob-max-bytes and HTTP' ' - fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" +test_expect_success 'fetch with --filter=blob:limit=0 and HTTP' ' + fetch_filter_blob_limit_zero "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" ' stop_httpd -- 2.9.3
[PATCH v6 07/14] fixup: fetch: update error messages from --blob-max-bytes to --filter
From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- builtin/fetch.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 05d0b1a..14aab71 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1372,7 +1372,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) deepen = 1; if (filter_options.choice && !repository_format_partial_clone) - die("--blob-max-bytes can only be used when extensions.partialClone is set"); + die("--filter can only be used when extensions.partialClone is set"); if (all) { if (argc == 1) @@ -1406,11 +1406,11 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) if (remote) { if (filter_options.choice && strcmp(remote->name, repository_format_partial_clone)) - die(_("--blob-max-bytes can only be used with the remote configured in core.partialClone")); + die(_("--filter can only be used with the remote configured in core.partialClone")); result = fetch_one(remote, argc, argv); } else { if (filter_options.choice) - die(_("--blob-max-bytes can only be used with the remote configured in core.partialClone")); + die(_("--filter can only be used with the remote configured in core.partialClone")); result = fetch_multiple(&list); } -- 2.9.3
[PATCH v6 09/14] fixup: connected: conditionally pass --exclude-promisor-objects to rev-list
From: Jeff Hostetler Teach connected.c to only pass --exclude-promisor-objects to rev-list when partial clone is enabled. Signed-off-by: Jeff Hostetler --- connected.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/connected.c b/connected.c index a51c01d..3a5bd67 100644 --- a/connected.c +++ b/connected.c @@ -56,7 +56,8 @@ int check_connected(sha1_iterate_fn fn, void *cb_data, argv_array_push(&rev_list.args,"rev-list"); argv_array_push(&rev_list.args, "--objects"); argv_array_push(&rev_list.args, "--stdin"); - argv_array_push(&rev_list.args, "--exclude-promisor-objects"); + if (repository_format_partial_clone) + argv_array_push(&rev_list.args, "--exclude-promisor-objects"); argv_array_push(&rev_list.args, "--not"); argv_array_push(&rev_list.args, "--all"); argv_array_push(&rev_list.args, "--quiet"); -- 2.9.3
[PATCH v6 10/14] partial-clone: define partial clone settings in config
From: Jeff Hostetler Create get and set routines for "partial clone" config settings. These will be used in a future commit by clone and fetch to remember the promisor remote and the default filter-spec. Signed-off-by: Jeff Hostetler --- cache.h | 1 + config.c | 5 +++ environment.c | 1 + list-objects-filter-options.c | 90 +++ list-objects-filter-options.h | 6 +++ 5 files changed, 88 insertions(+), 15 deletions(-) diff --git a/cache.h b/cache.h index 6980072..bccc510 100644 --- a/cache.h +++ b/cache.h @@ -861,6 +861,7 @@ extern int grafts_replace_parents; #define GIT_REPO_VERSION_READ 1 extern int repository_format_precious_objects; extern char *repository_format_partial_clone; +extern const char *core_partial_clone_filter_default; struct repository_format { int version; diff --git a/config.c b/config.c index adb7d7a..adeee04 100644 --- a/config.c +++ b/config.c @@ -1241,6 +1241,11 @@ static int git_default_core_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.partialclonefilter")) { + return git_config_string(&core_partial_clone_filter_default, +var, value); + } + /* Add other config variables here and to Documentation/config.txt. */ return 0; } diff --git a/environment.c b/environment.c index e52aab3..7537565 100644 --- a/environment.c +++ b/environment.c @@ -28,6 +28,7 @@ int warn_on_object_refname_ambiguity = 1; int ref_paranoia = -1; int repository_format_precious_objects; char *repository_format_partial_clone; +const char *core_partial_clone_filter_default; const char *git_commit_encoding; const char *git_log_output_encoding; const char *apply_default_whitespace; diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 4c5b34e..5c47e2b 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -21,29 +21,36 @@ * subordinate commands when necessary. We also "intern" the arg for * the convenience of the current command. */ -int parse_list_objects_filter(struct list_objects_filter_options *filter_options, - const char *arg) +static int gently_parse_list_objects_filter( + struct list_objects_filter_options *filter_options, + const char *arg, + struct strbuf *errbuf) { const char *v0; - if (filter_options->choice) - die(_("multiple object filter types cannot be combined")); + if (filter_options->choice) { + if (errbuf) { + strbuf_init(errbuf, 0); + strbuf_addstr( + errbuf, + _("multiple filter-specs cannot be combined")); + } + return 1; + } filter_options->filter_spec = strdup(arg); if (!strcmp(arg, "blob:none")) { filter_options->choice = LOFC_BLOB_NONE; return 0; - } - if (skip_prefix(arg, "blob:limit=", &v0)) { - if (!git_parse_ulong(v0, &filter_options->blob_limit_value)) - die(_("invalid filter-spec expression '%s'"), arg); - filter_options->choice = LOFC_BLOB_LIMIT; - return 0; - } + } else if (skip_prefix(arg, "blob:limit=", &v0)) { + if (git_parse_ulong(v0, &filter_options->blob_limit_value)) { + filter_options->choice = LOFC_BLOB_LIMIT; + return 0; + } - if (skip_prefix(arg, "sparse:oid=", &v0)) { + } else if (skip_prefix(arg, "sparse:oid=", &v0)) { struct object_context oc; struct object_id sparse_oid; @@ -57,15 +64,27 @@ int parse_list_objects_filter(struct list_objects_filter_options *filter_options filter_options->sparse_oid_value = oiddup(&sparse_oid); filter_options->choice = LOFC_SPARSE_OID; return 0; - } - if (skip_prefix(arg, "sparse:path=", &v0)) { + } else if (skip_prefix(arg, "sparse:path=", &v0)) { filter_options->choice = LOFC_SPARSE_PATH; filter_options->sparse_path_value = strdup(v0); return 0; } - die(_("invalid filter-spec expression '%s'"), arg); + if (errbuf) { + strbuf_init(errbuf, 0); + strbuf_addf(errbuf, "invalid filter-spec '%s'", arg); + } + memset(filter_options, 0, sizeof(*filter_options)); + return 1; +} + +int parse_list_objects_filter(struct list_objects_filter_options *filter_options, + const char *arg) +{ + struct strbuf buf = STRBUF_INIT; + if (gently_parse_list_objects_filter(filter_options, arg, &buf)) + di
[PATCH v6 05/14] fetch: refactor calculation of remote list
From: Jonathan Tan Separate out the calculation of remotes to be fetched from and the actual fetching. This will allow us to include an additional step before the actual fetching in a subsequent commit. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 225c734..1b1f039 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) { int i; struct string_list list = STRING_LIST_INIT_DUP; - struct remote *remote; + struct remote *remote = NULL; int result = 0; struct argv_array argv_gc_auto = ARGV_ARRAY_INIT; @@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) else if (argc > 1) die(_("fetch --all does not make sense with refspecs")); (void) for_each_remote(get_one_remote_for_fetch, &list); - result = fetch_multiple(&list); } else if (argc == 0) { /* No arguments -- use default remote */ remote = remote_get(NULL); - result = fetch_one(remote, argc, argv); } else if (multiple) { /* All arguments are assumed to be remotes or groups */ for (i = 0; i < argc; i++) if (!add_remote_or_group(argv[i], &list)) die(_("No such remote or remote group: %s"), argv[i]); - result = fetch_multiple(&list); } else { /* Single remote or group */ (void) add_remote_or_group(argv[0], &list); @@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) /* More than one remote */ if (argc > 1) die(_("Fetching a group and specifying refspecs does not make sense")); - result = fetch_multiple(&list); } else { /* Zero or one remotes */ remote = remote_get(argv[0]); - result = fetch_one(remote, argc-1, argv+1); + argc--; + argv++; } } + if (remote) + result = fetch_one(remote, argc, argv); + else + result = fetch_multiple(&list); + if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) { struct argv_array options = ARGV_ARRAY_INIT; -- 2.9.3
[PATCH v6 06/14] fetch: support filters
From: Jeff Hostetler Teach fetch to support filters. This is only allowed for the remote configured in extensions.partialcloneremote. Signed-off-by: Jonathan Tan --- builtin/fetch.c | 23 +-- connected.c | 1 + remote-curl.c | 6 ++ t/t5500-fetch-pack.sh | 36 4 files changed, 64 insertions(+), 2 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 1b1f039..05d0b1a 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -18,6 +18,7 @@ #include "argv-array.h" #include "utf8.h" #include "packfile.h" +#include "list-objects-filter-options.h" static const char * const builtin_fetch_usage[] = { N_("git fetch [] [ [...]]"), @@ -55,6 +56,7 @@ static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND; static int shown_url = 0; static int refmap_alloc, refmap_nr; static const char **refmap_array; +static struct list_objects_filter_options filter_options; static int git_fetch_config(const char *k, const char *v, void *cb) { @@ -160,6 +162,7 @@ static struct option builtin_fetch_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options), OPT_END() }; @@ -1044,6 +1047,11 @@ static struct transport *prepare_transport(struct remote *remote, int deepen) set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes"); if (update_shallow) set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes"); + if (filter_options.choice) { + set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, + filter_options.filter_spec); + set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); + } return transport; } @@ -1328,6 +1336,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) packet_trace_identity("fetch"); + fetch_if_missing = 0; + /* Record the command line for the reflog */ strbuf_addstr(&default_rla, "fetch"); for (i = 1; i < argc; i++) @@ -1361,6 +1371,9 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) if (depth || deepen_since || deepen_not.nr) deepen = 1; + if (filter_options.choice && !repository_format_partial_clone) + die("--blob-max-bytes can only be used when extensions.partialClone is set"); + if (all) { if (argc == 1) die(_("fetch --all does not take a repository argument")); @@ -1390,10 +1403,16 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) } } - if (remote) + if (remote) { + if (filter_options.choice && + strcmp(remote->name, repository_format_partial_clone)) + die(_("--blob-max-bytes can only be used with the remote configured in core.partialClone")); result = fetch_one(remote, argc, argv); - else + } else { + if (filter_options.choice) + die(_("--blob-max-bytes can only be used with the remote configured in core.partialClone")); result = fetch_multiple(&list); + } if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) { struct argv_array options = ARGV_ARRAY_INIT; diff --git a/connected.c b/connected.c index f416b05..a51c01d 100644 --- a/connected.c +++ b/connected.c @@ -56,6 +56,7 @@ int check_connected(sha1_iterate_fn fn, void *cb_data, argv_array_push(&rev_list.args,"rev-list"); argv_array_push(&rev_list.args, "--objects"); argv_array_push(&rev_list.args, "--stdin"); + argv_array_push(&rev_list.args, "--exclude-promisor-objects"); argv_array_push(&rev_list.args, "--not"); argv_array_push(&rev_list.args, "--all"); argv_array_push(&rev_list.args, "--quiet"); diff --git a/remote-curl.c b/remote-curl.c index 4318391..6ec5352 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -24,6 +24,7 @@ struct options { char *deepen_since; struct string_list deepen_not; struct string_list push_options; + char *filter; unsigned progress : 1, check_self_contained_and_connected : 1, cloning : 1, @@ -165,6 +166,9 @@ static int set_option(const char *name, const char *value) } else if (!strcmp(name, "no-dependents")) { options.no_dependents = 1; return 0; + } else if (!strcmp(name, "filter")) { + options.filter = xstrdup(value);; + return 0; } else { return 1 /* unsupported */; } @@ -834,6 +838,8 @@ static int fetch_git(struct discovery *heads, argv_array_push(&args, "--from-prom
[PATCH v6 03/14] fetch-pack: add --no-filter
From: Jeff Hostetler Fixup fetch-pack to accept --no-filter to be consistent with rev-list and pack-objects. Signed-off-by: Jeff Hostetler --- builtin/fetch-pack.c | 4 1 file changed, 4 insertions(+) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 7957807..cbf5035 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -157,6 +157,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) parse_list_objects_filter(&args.filter_options, arg); continue; } + if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { + list_objects_filter_release(&args.filter_options); + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) -- 2.9.3
[PATCH v6 02/14] fetch-pack, index-pack, transport: partial clone
From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- builtin/fetch-pack.c | 4 fetch-pack.c | 13 + fetch-pack.h | 2 ++ transport-helper.c | 5 + transport.c | 4 transport.h | 5 + 6 files changed, 33 insertions(+) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 15eeed7..7957807 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.no_dependents = 1; continue; } + if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), &arg)) { + parse_list_objects_filter(&args.filter_options, arg); + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) diff --git a/fetch-pack.c b/fetch-pack.c index 0798e0b..3c5f064 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -29,6 +29,7 @@ static int deepen_not_ok; static int fetch_fsck_objects = -1; static int transfer_fsck_objects = -1; static int agent_supported; +static int server_supports_filtering; static struct lock_file shallow_lock; static const char *alternate_shallow_file; @@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args, if (deepen_not_ok) strbuf_addstr(&c, " deepen-not"); if (agent_supported)strbuf_addf(&c, " agent=%s", git_user_agent_sanitized()); + if (args->filter_options.choice) + strbuf_addstr(&c, " filter"); packet_buf_write(&req_buf, "want %s%s\n", remote_hex, c.buf); strbuf_release(&c); } else @@ -407,6 +410,9 @@ static int find_common(struct fetch_pack_args *args, packet_buf_write(&req_buf, "deepen-not %s", s->string); } } + if (server_supports_filtering && args->filter_options.choice) + packet_buf_write(&req_buf, "filter %s", +args->filter_options.filter_spec); packet_buf_flush(&req_buf); state_len = req_buf.len; @@ -969,6 +975,13 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, else prefer_ofs_delta = 0; + if (server_supports("filter")) { + server_supports_filtering = 1; + print_verbose(args, _("Server supports filter")); + } else if (args->filter_options.choice) { + warning("filtering not recognized by server, ignoring"); + } + if ((agent_feature = server_feature_value("agent", &agent_len))) { agent_supported = 1; if (agent_len) diff --git a/fetch-pack.h b/fetch-pack.h index aeac152..3e224a1 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -3,6 +3,7 @@ #include "string-list.h" #include "run-command.h" +#include "list-objects-filter-options.h" struct oid_array; @@ -12,6 +13,7 @@ struct fetch_pack_args { int depth; const char *deepen_since; const struct string_list *deepen_not; + struct list_objects_filter_options filter_options; unsigned deepen_relative:1; unsigned quiet:1; unsigned keep_pack:1; diff --git a/transport-helper.c b/transport-helper.c index c948d52..0650ca0 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -671,6 +671,11 @@ static int fetch(struct transport *transport, if (data->transport_options.update_shallow) set_helper_option(transport, "update-shallow", "true"); + if (data->transport_options.filter_options.choice) + set_helper_option( + transport, "filter", + data->transport_options.filter_options.filter_spec); + if (data->fetch) return fetch_with_fetch(transport, nr_heads, to_fetch); diff --git a/transport.c b/transport.c index f2fbc6f..cce50f5 100644 --- a/transport.c +++ b/transport.c @@ -166,6 +166,9 @@ static int set_git_option(struct git_transport_options *opts, } else if (!strcmp(name, TRANS_OPT_NO_DEPENDENTS)) { opts->no_dependents = !!value; return 0; + } else if (!strcmp(name, TRANS_OPT_LIST_OBJECTS_FILTER)) { + parse_list_objects_filter(&opts->filter_options, value); + return 0; } return 1; } @@ -236,6 +239,7 @@ static int fetch_refs_via_pack(struct transport *transport, args.update_shallow = data->options.update_shallow; args.from_promisor = data->options.from_promisor; args.no_dependents = data->options.no_dependents; + args.filter_options = data->options.filter_options; if (!data->got_remote_heads) { connect_setup(transport,
[PATCH v6 13/14] fetch-pack: restore save_commit_buffer after use
From: Jonathan Tan In fetch-pack, the global variable save_commit_buffer is set to 0, but not restored to its original value after use. In particular, if show_log() (in log-tree.c) is invoked after fetch_pack() in the same process, show_log() will return before printing out the commit message (because the invocation to get_cached_commit_buffer() returns NULL, because the commit buffer was not saved). I discovered this when attempting to run "git log -S" in a partial clone, triggering the case where revision walking lazily loads missing objects. Therefore, restore save_commit_buffer to its original value after use. An alternative to solve the problem I had is to replace get_cached_commit_buffer() with get_commit_buffer(). That invocation was introduced in commit a97934d ("use get_cached_commit_buffer where appropriate", 2014-06-13) to replace "commit->buffer" introduced in commit 3131b71 ("Add "--show-all" revision walker flag for debugging", 2008-02-13). In the latter commit, the commit author seems to be deciding between not showing an unparsed commit at all and showing an unparsed commit without the message (which is what the commit does), and did not mention parsing the unparsed commit, so I prefer to preserve the existing behavior. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- fetch-pack.c | 4 1 file changed, 4 insertions(+) diff --git a/fetch-pack.c b/fetch-pack.c index 3c5f064..773453c 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args, { struct ref *ref; int retval; + int old_save_commit_buffer = save_commit_buffer; timestamp_t cutoff = 0; save_commit_buffer = 0; @@ -786,6 +787,9 @@ static int everything_local(struct fetch_pack_args *args, print_verbose(args, _("already have %s (%s)"), oid_to_hex(remote), ref->name); } + + save_commit_buffer = old_save_commit_buffer; + return retval; } -- 2.9.3
[PATCH v6 14/14] t5616: end-to-end tests for partial clone
From: Jeff Hostetler Additional end-to-end tests for partial clone. Signed-off-by: Jeff Hostetler --- t/t5616-partial-clone.sh | 125 +++ 1 file changed, 125 insertions(+) create mode 100755 t/t5616-partial-clone.sh diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh new file mode 100755 index 000..fa573f8 --- /dev/null +++ b/t/t5616-partial-clone.sh @@ -0,0 +1,125 @@ +#!/bin/sh + +test_description='git partial clone' + +. ./test-lib.sh + +# create a normal "src" repo where we can later create new commits. +# expect_1.oids will contain a list of the OIDs of all blobs. +test_expect_success 'setup normal src repo' ' + echo "{print \$1}" >print_1.awk && + echo "{print \$2}" >print_2.awk && + + git init src && + for n in 1 2 3 4 + do + echo "This is file: $n" > src/file.$n.txt + git -C src add file.$n.txt + git -C src commit -m "file $n" + git -C src ls-files -s file.$n.txt >>temp + done && + awk -f print_2.awk expect_1.oids && + test "$(wc -l observed.oids && + test_cmp expect_1.oids observed.oids && + test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" && + test "$(git -C pc1 config --local extensions.partialclone)" = "origin" && + test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none" +' + +# checkout master to force dynamic object fetch of blobs at HEAD. +# confirm we now have the expected blobs in a new packfile. +test_expect_success 'verify checkout with dynamic object fetch' ' + git -C pc1 checkout master && + ( cd pc1/.git/objects/pack; + git verify-pack -v *.pack + ) >temp && + grep blob observed.oids && + test_cmp expect_1.oids observed.oids +' + +# create new commits in "src" repo and push to "srv.bare". +# repack srv.bare just to make it easy to count the blobs. +# expect_2.oids will contain a list of the OIDs of all blobs. +test_expect_success 'push new commits to server' ' + git -C src remote add srv "file://$(pwd)/srv.bare" && + for x in a b c d + do + echo "Mod $x" >>src/file.1.txt + git -C src add file.1.txt + git -C src commit -m "mod $x" + done && + git -C src push -u srv master && + git -C srv.bare repack && + ( cd srv.bare/objects/pack; + git verify-pack -v *.pack + ) >temp && + grep blob expect_2.oids && + test "$(wc -l expect.blame +' + +# fetch in the partial clone repo from the server (the promisor remote). +# verify that fetch was a "partial fetch". +# [] that it inherited the filter settings +# [] that is DOES NOT have the new blobs. +test_expect_success 'partial fetch inherits filter settings' ' + git -C pc1 fetch origin && + ( cd pc1/.git/objects/pack; + git verify-pack -v *.pack + ) >temp && + grep blob observed.oids && + test_cmp expect_1.oids observed.oids +' + +# force dynamic object fetch using diff. +# we should only get 1 new blob (for the file in origin/master). +# it should be in a new packfile (since the promisor boundary is +# currently a packfile, it should not get unpacked upon receipt.) +test_expect_success 'verify diff causes dynamic object fetch' ' + test "$(wc -l temp && + grep blob observed.oids && + cat observed.oids && + test "$(wc -l observed.blame && + test_cmp expect.blame observed.blame +' + +test_done -- 2.9.3
[PATCH v6 12/14] unpack-trees: batch fetching of missing blobs
From: Jonathan Tan When running checkout, first prefetch all blobs that are to be updated but are missing. This means that only one pack is downloaded during such operations, instead of one per missing blob. This operates only on the blob level - if a repository has a missing tree, they are still fetched one at a time. This does not use the delayed checkout mechanism introduced in commit 2841e8f ("convert: add "status=delayed" to filter process protocol", 2017-06-30) due to significant conceptual differences - in particular, for partial clones, we already know what needs to be fetched based on the contents of the local repo alone, whereas for status=delayed, it is the filter process that tells us what needs to be checked in the end. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- fetch-object.c | 26 ++ fetch-object.h | 5 + t/t5601-clone.sh | 52 unpack-trees.c | 22 ++ 4 files changed, 101 insertions(+), 4 deletions(-) diff --git a/fetch-object.c b/fetch-object.c index 258fcfa..853624f 100644 --- a/fetch-object.c +++ b/fetch-object.c @@ -5,11 +5,10 @@ #include "transport.h" #include "fetch-object.h" -void fetch_object(const char *remote_name, const unsigned char *sha1) +static void fetch_refs(const char *remote_name, struct ref *ref) { struct remote *remote; struct transport *transport; - struct ref *ref; int original_fetch_if_missing = fetch_if_missing; fetch_if_missing = 0; @@ -18,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned char *sha1) die(_("Remote with no URL")); transport = transport_get(remote, remote->url[0]); - ref = alloc_ref(sha1_to_hex(sha1)); - hashcpy(ref->old_oid.hash, sha1); transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1"); transport_fetch_refs(transport, ref); fetch_if_missing = original_fetch_if_missing; } + +void fetch_object(const char *remote_name, const unsigned char *sha1) +{ + struct ref *ref = alloc_ref(sha1_to_hex(sha1)); + hashcpy(ref->old_oid.hash, sha1); + fetch_refs(remote_name, ref); +} + +void fetch_objects(const char *remote_name, const struct oid_array *to_fetch) +{ + struct ref *ref = NULL; + int i; + + for (i = 0; i < to_fetch->nr; i++) { + struct ref *new_ref = alloc_ref(oid_to_hex(&to_fetch->oid[i])); + oidcpy(&new_ref->old_oid, &to_fetch->oid[i]); + new_ref->next = ref; + ref = new_ref; + } + fetch_refs(remote_name, ref); +} diff --git a/fetch-object.h b/fetch-object.h index f371300..4b269d0 100644 --- a/fetch-object.h +++ b/fetch-object.h @@ -1,6 +1,11 @@ #ifndef FETCH_OBJECT_H #define FETCH_OBJECT_H +#include "sha1-array.h" + extern void fetch_object(const char *remote_name, const unsigned char *sha1); +extern void fetch_objects(const char *remote_name, + const struct oid_array *to_fetch); + #endif diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 6d37c6d..13610b7 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does not support object filte test_i18ngrep "filtering not recognized by server" err ' +test_expect_success 'batch missing blob request during checkout' ' + rm -rf server client && + + test_create_repo server && + echo a >server/a && + echo b >server/b && + git -C server add a b && + + git -C server commit -m x && + echo aa >server/a && + echo bb >server/b && + git -C server add a b && + git -C server commit -m x && + + test_config -C server uploadpack.allowfilter 1 && + test_config -C server uploadpack.allowanysha1inwant 1 && + + git clone --filter=blob:limit=0 "file://$(pwd)/server" client && + + # Ensure that there is only one negotiation by checking that there is + # only "done" line sent. ("done" marks the end of negotiation.) + GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ && + grep "git> done" trace >done_lines && + test_line_count = 1 done_lines +' + +test_expect_success 'batch missing blob request does not inadvertently try to fetch gitlinks' ' + rm -rf server client && + + test_create_repo repo_for_submodule && + test_commit -C repo_for_submodule x && + + test_create_repo server && + echo a >server/a && + echo b >server/b && + git -C server add a b && + git -C server commit -m x && + + echo aa >server/a && + echo bb >server/b && + # Also add a gitlink pointing to an arbitrary repository + git -C server submodule add "$(pwd)/repo_for_submodule" c && + git -C server add a b c && +
[PATCH v6 00/14] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests
From: Jeff Hostetler This is V6 of part 3 of partial clone. It assumes that V6 of parts 1 and 2 are already present. This version is refactors and reorders commits to better combine new functionality and tests. It contains 3 fixup commits that should be squashed after some discussion. Jeff Hostetler (9): upload-pack: add object filtering for partial clone fetch-pack, index-pack, transport: partial clone fetch-pack: add --no-filter fetch: support filters fixup: fetch: update error messages from --blob-max-bytes to --filter fixup: fetch: update --blob-max-bytes to --fitler fixup: connected: conditionally pass --exclude-promisor-objects to rev-list partial-clone: define partial clone settings in config t5616: end-to-end tests for partial clone Jonathan Tan (5): fetch-pack: test support excluding large blobs fetch: refactor calculation of remote list clone: partial clone unpack-trees: batch fetching of missing blobs fetch-pack: restore save_commit_buffer after use Documentation/config.txt | 4 + Documentation/technical/pack-protocol.txt | 8 ++ Documentation/technical/protocol-capabilities.txt | 8 ++ builtin/clone.c | 22 +++- builtin/fetch-pack.c | 8 ++ builtin/fetch.c | 33 -- cache.h | 1 + config.c | 5 + connected.c | 2 + environment.c | 1 + fetch-object.c| 26 - fetch-object.h| 5 + fetch-pack.c | 17 +++ fetch-pack.h | 2 + list-objects-filter-options.c | 90 +--- list-objects-filter-options.h | 6 ++ remote-curl.c | 6 ++ t/t5500-fetch-pack.sh | 63 +++ t/t5601-clone.sh | 101 + t/t5616-partial-clone.sh | 125 ++ transport-helper.c| 5 + transport.c | 4 + transport.h | 5 + unpack-trees.c| 22 upload-pack.c | 31 +- 25 files changed, 572 insertions(+), 28 deletions(-) create mode 100755 t/t5616-partial-clone.sh -- 2.9.3
[PATCH v6 11/14] clone: partial clone
From: Jonathan Tan Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- builtin/clone.c | 22 -- t/t5601-clone.sh | 49 + 2 files changed, 69 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index dbddd98..f519bd4 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -26,6 +26,7 @@ #include "run-command.h" #include "connected.h" #include "packfile.h" +#include "list-objects-filter-options.h" /* * Overall FIXMEs: @@ -60,6 +61,7 @@ static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP; static int option_dissociate; static int max_jobs = -1; static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP; +static struct list_objects_filter_options filter_options; static int recurse_submodules_cb(const struct option *opt, const char *arg, int unset) @@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options), OPT_END() }; @@ -886,6 +889,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) struct refspec *refspec; const char *fetch_pattern; + fetch_if_missing = 0; + packet_trace_identity("clone"); argc = parse_options(argc, argv, prefix, builtin_clone_options, builtin_clone_usage, 0); @@ -1073,6 +1078,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) warning(_("--shallow-since is ignored in local clones; use file:// instead.")); if (option_not.nr) warning(_("--shallow-exclude is ignored in local clones; use file:// instead.")); + if (filter_options.choice) + warning(_("--filter is ignored in local clones; use file:// instead.")); if (!access(mkpath("%s/shallow", path), F_OK)) { if (option_local > 0) warning(_("source repository is shallow, ignoring --local")); @@ -1104,7 +,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix) transport_set_option(transport, TRANS_OPT_UPLOADPACK, option_upload_pack); - if (transport->smart_options && !deepen) + if (filter_options.choice) { + transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, +filter_options.filter_spec); + transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); + } + + if (transport->smart_options && !deepen && !filter_options.choice) transport->smart_options->check_self_contained_and_connected = 1; refs = transport_get_remote_refs(transport); @@ -1164,13 +1177,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix) write_refspec_config(src_ref_prefix, our_head_points_at, remote_head_points_at, &branch_top); + if (filter_options.choice) + partial_clone_register("origin", &filter_options); + if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) transport_fetch_refs(transport, mapped_refs); update_remote_refs(refs, mapped_refs, remote_head_points_at, - branch_top.buf, reflog_msg.buf, transport, !is_local); + branch_top.buf, reflog_msg.buf, transport, + !is_local && !filter_options.choice); update_head(our_head_points_at, remote_head, reflog_msg.buf); @@ -1191,6 +1208,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } junk_mode = JUNK_LEAVE_REPO; + fetch_if_missing = 1; err = checkout(submodule_progress); strbuf_release(&reflog_msg); diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 9c56f77..6d37c6d 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable pack' ' git -C replay.git index-pack -v --stdin err && + + test_i18ngrep "filtering not recognized by server" err +' + +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +test_expect_success 'partial clone using HTTP' ' + partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" +' + +stop_httpd + test_done -- 2.9.3
[PATCH v6 10/12] fixup: sha1_file: add TODO
From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- sha1_file.c | 4 1 file changed, 4 insertions(+) diff --git a/sha1_file.c b/sha1_file.c index ce67f27..dd956e2 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1196,6 +1196,10 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, /* Check if it is a missing object */ if (fetch_if_missing && repository_format_partial_clone && !already_retried) { + /* +* TODO Investigate haveing fetch_object() return +* TODO error/success and stopping the music here. +*/ fetch_object(repository_format_partial_clone, real); already_retried = 1; continue; -- 2.9.3
[PATCH v6 06/12] index-pack: refactor writing of .keep files
From: Jonathan Tan In a subsequent commit, index-pack will be taught to write ".promisor" files which are similar to the ".keep" files it knows how to write. Refactor the writing of ".keep" files, so that the implementation of writing ".promisor" files becomes easier. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 99 1 file changed, 53 insertions(+), 46 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8ec459f..4f305a7 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1389,15 +1389,58 @@ static void fix_unresolved_deltas(struct sha1file *f) free(sorted_by_pos); } +static const char *derive_filename(const char *pack_name, const char *suffix, + struct strbuf *buf) +{ + size_t len; + if (!strip_suffix(pack_name, ".pack", &len)) + die(_("packfile name '%s' does not end with '.pack'"), + pack_name); + strbuf_add(buf, pack_name, len); + strbuf_addch(buf, '.'); + strbuf_addstr(buf, suffix); + return buf->buf; +} + +static void write_special_file(const char *suffix, const char *msg, + const char *pack_name, const unsigned char *sha1, + const char **report) +{ + struct strbuf name_buf = STRBUF_INIT; + const char *filename; + int fd; + int msg_len = strlen(msg); + + if (pack_name) + filename = derive_filename(pack_name, suffix, &name_buf); + else + filename = odb_pack_name(&name_buf, sha1, suffix); + + fd = odb_pack_keep(filename); + if (fd < 0) { + if (errno != EEXIST) + die_errno(_("cannot write %s file '%s'"), + suffix, filename); + } else { + if (msg_len > 0) { + write_or_die(fd, msg, msg_len); + write_or_die(fd, "\n", 1); + } + if (close(fd) != 0) + die_errno(_("cannot close written %s file '%s'"), + suffix, filename); + *report = suffix; + } + strbuf_release(&name_buf); +} + static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, - const char *keep_name, const char *keep_msg, - unsigned char *sha1) + const char *keep_msg, unsigned char *sha1) { const char *report = "pack"; struct strbuf pack_name = STRBUF_INIT; struct strbuf index_name = STRBUF_INIT; - struct strbuf keep_name_buf = STRBUF_INIT; int err; if (!from_stdin) { @@ -1409,28 +1452,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name, die_errno(_("error while closing pack file")); } - if (keep_msg) { - int keep_fd, keep_msg_len = strlen(keep_msg); - - if (!keep_name) - keep_name = odb_pack_name(&keep_name_buf, sha1, "keep"); - - keep_fd = odb_pack_keep(keep_name); - if (keep_fd < 0) { - if (errno != EEXIST) - die_errno(_("cannot write keep file '%s'"), - keep_name); - } else { - if (keep_msg_len > 0) { - write_or_die(keep_fd, keep_msg, keep_msg_len); - write_or_die(keep_fd, "\n", 1); - } - if (close(keep_fd) != 0) - die_errno(_("cannot close written keep file '%s'"), - keep_name); - report = "keep"; - } - } + if (keep_msg) + write_special_file("keep", keep_msg, final_pack_name, sha1, + &report); if (final_pack_name != curr_pack_name) { if (!final_pack_name) @@ -1472,7 +1496,6 @@ static void final(const char *final_pack_name, const char *curr_pack_name, strbuf_release(&index_name); strbuf_release(&pack_name); - strbuf_release(&keep_name_buf); } static int git_index_pack_config(const char *k, const char *v, void *cb) @@ -1615,26 +1638,13 @@ static void show_pack_info(int stat_only) } } -static const char *derive_filename(const char *pack_name, const char *suffix, - struct strbuf *buf) -{ - size_t len; - if (!strip_suffix(pack_name, ".pack", &len)) - die(_("packfile name '%s' does not end with '.pack'"), - pack_name); - strbuf_add(buf, pack_name, len); - strbuf_addstr(buf, suffix); - return
[PATCH v6 07/12] introduce fetch-object: fetch one promisor object
From: Jonathan Tan Introduce fetch-object, providing the ability to fetch one object from a promisor remote. This uses fetch-pack. To do this, the transport mechanism has been updated with 2 flags, "from-promisor" to indicate that the resulting pack comes from a promisor remote (and thus should be annotated as such by index-pack), and "no-dependents" to indicate that only the objects themselves need to be fetched (but fetching additional objects is nevertheless safe). Whenever "no-dependents" is used, fetch-pack will refrain from using any object flags, because it is most likely invoked as part of a dynamic object fetch by another Git command (which may itself use object flags). An alternative to this is to leave fetch-pack alone, and instead update the allocation of flags so that fetch-pack's flags never overlap with any others, but this will end up shrinking the number of flags available to nearly every other Git command (that is, every Git command that accesses objects), so the approach in this commit was used instead. This will be tested in a subsequent commit. Signed-off-by: Jonathan Tan --- Documentation/gitremote-helpers.txt | 7 ++ Makefile| 1 + builtin/fetch-pack.c| 8 +++ builtin/index-pack.c| 16 ++--- fetch-object.c | 24 +++ fetch-object.h | 6 + fetch-pack.c| 48 + fetch-pack.h| 8 +++ remote-curl.c | 14 ++- transport.c | 8 +++ transport.h | 11 + 11 files changed, 126 insertions(+), 25 deletions(-) create mode 100644 fetch-object.c create mode 100644 fetch-object.h diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt index 4a584f3..4b8c93e 100644 --- a/Documentation/gitremote-helpers.txt +++ b/Documentation/gitremote-helpers.txt @@ -466,6 +466,13 @@ set by Git if the remote helper has the 'option' capability. Transmit as a push option. As the push option must not contain LF or NUL characters, the string is not encoded. +'option from-promisor' {'true'|'false'}:: + Indicate that these objects are being fetched from a promisor. + +'option no-dependents' {'true'|'false'}:: + Indicate that only the objects wanted need to be fetched, not + their dependents. + SEE ALSO linkgit:git-remote[1] diff --git a/Makefile b/Makefile index ca378a4..795e0c7 100644 --- a/Makefile +++ b/Makefile @@ -792,6 +792,7 @@ LIB_OBJS += ewah/ewah_bitmap.o LIB_OBJS += ewah/ewah_io.o LIB_OBJS += ewah/ewah_rlw.o LIB_OBJS += exec_cmd.o +LIB_OBJS += fetch-object.o LIB_OBJS += fetch-pack.o LIB_OBJS += fsck.o LIB_OBJS += gettext.o diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 366b9d1..02abe72 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -143,6 +143,14 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.update_shallow = 1; continue; } + if (!strcmp("--from-promisor", arg)) { + args.from_promisor = 1; + continue; + } + if (!strcmp("--no-dependents", arg)) { + args.no_dependents = 1; + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 4f305a7..24c2f05 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1429,14 +1429,16 @@ static void write_special_file(const char *suffix, const char *msg, if (close(fd) != 0) die_errno(_("cannot close written %s file '%s'"), suffix, filename); - *report = suffix; + if (report) + *report = suffix; } strbuf_release(&name_buf); } static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, - const char *keep_msg, unsigned char *sha1) + const char *keep_msg, const char *promisor_msg, + unsigned char *sha1) { const char *report = "pack"; struct strbuf pack_name = STRBUF_INIT; @@ -1455,6 +1457,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name, if (keep_msg) write_special_file("keep", keep_msg, final_pack_name, sha1, &report); + if (promisor_msg) + write_special_file("promisor", promisor_msg, final_pack_name, + sha1, NULL); if (final_pack_name != curr_pack_name) {
[PATCH v6 12/12] gc: do not repack promisor packfiles
From: Jonathan Tan Teach gc to stop traversal at promisor objects, and to leave promisor packfiles alone. This has the effect of only repacking non-promisor packfiles, and preserves the distinction between promisor packfiles and non-promisor packfiles. Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- Documentation/git-pack-objects.txt | 11 builtin/gc.c | 3 +++ builtin/pack-objects.c | 37 -- builtin/prune.c| 7 + builtin/repack.c | 8 -- t/t0410-partial-clone.sh | 54 -- 6 files changed, 114 insertions(+), 6 deletions(-) diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index aa403d0..81bc490 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -255,6 +255,17 @@ a missing object is encountered. This is the default action. The form '--missing=allow-any' will allow object traversal to continue if a missing object is encountered. Missing objects will silently be omitted from the results. ++ +The form '--missing=allow-promisor' is like 'allow-any', but will only +allow object traversal to continue for EXPECTED promisor missing objects. +Unexpected missing object will raise an error. + +--exclude-promisor-objects:: + Omit objects that are known to be in the promisor remote. (This + option has the purpose of operating only on locally created objects, + so that when we repack, we still maintain a distinction between + locally created objects [without .promisor] and objects from the + promisor remote [with .promisor].) This is used with partial clone. SEE ALSO diff --git a/builtin/gc.c b/builtin/gc.c index 3c5eae0..77fa720 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -458,6 +458,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix) argv_array_push(&prune, prune_expire); if (quiet) argv_array_push(&prune, "--no-progress"); + if (repository_format_partial_clone) + argv_array_push(&prune, + "--exclude-promisor-objects"); if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) return error(FAILED_RUN, prune.argv[0]); } diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 45ad35d..f5fc401 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -75,6 +75,8 @@ static int use_bitmap_index = -1; static int write_bitmap_index; static uint16_t write_bitmap_options; +static int exclude_promisor_objects; + static unsigned long delta_cache_size = 0; static unsigned long max_delta_cache_size = 256 * 1024 * 1024; static unsigned long cache_max_small_delta_size = 1000; @@ -84,8 +86,9 @@ static unsigned long window_memory_limit = 0; static struct list_objects_filter_options filter_options; enum missing_action { - MA_ERROR = 0,/* fail if any missing objects are encountered */ - MA_ALLOW_ANY,/* silently allow ALL missing objects */ + MA_ERROR = 0, /* fail if any missing objects are encountered */ + MA_ALLOW_ANY, /* silently allow ALL missing objects */ + MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */ }; static enum missing_action arg_missing_action; static show_object_fn fn_show_object; @@ -2577,6 +2580,20 @@ static void show_object__ma_allow_any(struct object *obj, const char *name, void show_object(obj, name, data); } +static void show_object__ma_allow_promisor(struct object *obj, const char *name, void *data) +{ + assert(arg_missing_action == MA_ALLOW_PROMISOR); + + /* +* Quietly ignore EXPECTED missing objects. This avoids problems with +* staging them now and getting an odd error later. +*/ + if (!has_object_file(&obj->oid) && is_promisor_object(&obj->oid)) + return; + + show_object(obj, name, data); +} + static int option_parse_missing_action(const struct option *opt, const char *arg, int unset) { @@ -2591,10 +2608,18 @@ static int option_parse_missing_action(const struct option *opt, if (!strcmp(arg, "allow-any")) { arg_missing_action = MA_ALLOW_ANY; + fetch_if_missing = 0; fn_show_object = show_object__ma_allow_any; return 0; } + if (!strcmp(arg, "allow-promisor")) { + arg_missing_action = MA_ALLOW_PROMISOR; + fetch_if_missing = 0; + fn_show_object = show_object__ma_allow_promisor; + return 0; + } + die(_("invalid value for --missing")); return 0; } @@ -3008,6 +3033,8 @@ int cmd_pack
[PATCH v6 11/12] rev-list: support termination at promisor objects
From: Jonathan Tan Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan Signed-off-by: Jeff Hostetler --- Documentation/rev-list-options.txt | 11 builtin/rev-list.c | 71 +++--- list-objects.c | 29 ++- object.c | 2 +- revision.c | 33 +++- revision.h | 5 +- t/t0410-partial-clone.sh | 101 + 7 files changed, 240 insertions(+), 12 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 8d8b7f4..0ce8ccd 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -745,10 +745,21 @@ The form '--missing=allow-any' will allow object traversal to continue if a missing object is encountered. Missing objects will silently be omitted from the results. + +The form '--missing=allow-promisor' is like 'allow-any', but will only +allow object traversal to continue for EXPECTED promisor missing objects. +Unexpected missing objects will raise an error. ++ The form '--missing=print' is like 'allow-any', but will also print a list of the missing objects. Object IDs are prefixed with a ``?'' character. endif::git-rev-list[] +--exclude-promisor-objects:: + (For internal use only.) Prefilter object traversal at + promisor boundary. This is used with partial clone. This is + stronger than `--missing=allow-promisor` because it limits the + traversal, rather than just silencing errors about missing + objects. + --no-walk[=(sorted|unsorted)]:: Only show the given commits, but do not traverse their ancestors. This has no effect if a range is specified. If the argument diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 547a3e0..48f922d 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -15,6 +15,7 @@ #include "progress.h" #include "reflog-walk.h" #include "oidset.h" +#include "packfile.h" static const char rev_list_usage[] = "git rev-list [OPTION] ... [ -- paths... ]\n" @@ -67,6 +68,7 @@ enum missing_action { MA_ERROR = 0,/* fail if any missing objects are encountered */ MA_ALLOW_ANY,/* silently allow ALL missing objects */ MA_PRINT,/* print ALL missing objects in special section */ + MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */ }; static enum missing_action arg_missing_action; @@ -197,6 +199,12 @@ static void finish_commit(struct commit *commit, void *data) static inline void finish_object__ma(struct object *obj) { + /* +* Whether or not we try to dynamically fetch missing objects +* from the server, we currently DO NOT have the object. We +* can either print, allow (ignore), or conditionally allow +* (ignore) them. +*/ switch (arg_missing_action) { case MA_ERROR: die("missing blob object '%s'", oid_to_hex(&obj->oid)); @@ -209,25 +217,36 @@ static inline void finish_object__ma(struct object *obj) oidset_insert(&missing_objects, &obj->oid); return; + case MA_ALLOW_PROMISOR: + if (is_promisor_object(&obj->oid)) + return; + die("unexpected missing blob object '%s'", + oid_to_hex(&obj->oid)); + return; + default: BUG("unhandled missing_action"); return; } } -static void finish_object(struct object *obj, const char *name, void *cb_data) +static int finish_object(struct object *obj, const char *name, void *cb_data) { struct rev_list_info *info = cb_data; - if (obj->type == OBJ_BLOB && !has_object_file(&obj->oid)) + if (obj->type == OBJ_BLOB && !has_object_file(&obj->oid)) { finish_object__ma(obj); + return 1; + } if (info->revs->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT) parse_object(&obj->oid); + return 0; } static void show_object(stru
[PATCH v6 08/12] sha1_file: support lazily fetching missing objects
From: Jonathan Tan Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing. The fetching of objects can be suppressed through a global variable. This is used by fsck and index-pack. However, by default, such fetching is not suppressed. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without fetching them) or be more efficient in fetching them. In order to determine the code changes in sha1_file.c necessary, I investigated the following: (1) functions in sha1_file.c that take in a hash, without the user regarding how the object is stored (loose or packed) (2) functions in packfile.c (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed) (1) is handled by the modification to sha1_object_info_extended(). For (2), I looked at for_each_packed_object and others. For for_each_packed_object, the callers either already work or are fixed in this patch: - reachable - only to find recent objects - builtin/fsck - already knows about missing objects - builtin/cat-file - warning message added in this commit Callers of the other functions do not need to be changed: - parse_pack_index - http - indirectly from http_get_info_packs - find_pack_entry_one - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack - find_sha1_pack - fast-import - appears to be an optimization to not store a file if it is already in a pack - http-walker - to search through a struct alt_base - http-push - to search through remote packs - has_sha1_pack - builtin/fsck - already knows about promisor objects - builtin/count-objects - informational purposes only (check if loose object is also packed) - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it) - revision - used to exclude packed objects if requested by user - diff - just for optimization Signed-off-by: Jonathan Tan --- builtin/cat-file.c | 2 ++ builtin/fetch-pack.c | 2 ++ builtin/fsck.c | 3 +++ builtin/index-pack.c | 6 ++ cache.h | 8 fetch-object.c | 3 +++ sha1_file.c | 38 t/t0410-partial-clone.sh | 51 8 files changed, 100 insertions(+), 13 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index f5fa4fd..cf9ea5c 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -475,6 +475,8 @@ static int batch_objects(struct batch_options *opt) for_each_loose_object(batch_loose_object, &sa, 0); for_each_packed_object(batch_packed_object, &sa, 0); + if (repository_format_partial_clone) + warning("This repository has extensions.partialClone set. Some objects may not be loaded."); cb.opt = opt; cb.expand = &data; diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 02abe72..15eeed7 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -53,6 +53,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) struct oid_array shallow = OID_ARRAY_INIT; struct string_list deepen_not = STRING_LIST_INIT_DUP; + fetch_if_missing = 0; + packet_trace_identity("fetch-pack"); memset(&args, 0, sizeof(args)); diff --git a/builtin/fsck.c b/builtin/fsck.c index 578a7c8..3b76c0e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -678,6 +678,9 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) int i; struct alternate_object_database *alt; + /* fsck knows how to handle missing promisor objects */ + fetch_if_missing = 0; + errors_found = 0; check_replace_refs = 0; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 24c2f05..a0a35e6 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1657,6 +1657,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) unsigned foreign_nr = 1;/* zero is a "good" value, assume bad */ int report_end_of_input = 0; + /* +* index-pack never needs to fetch missing objects, since it only +* accesses the repo to do hash collision checks +*/ + fetch_if_missing = 0; + if (argc == 2 && !strcmp(argv[1], "-h")) usage(index_pack_usage); diff --git a/cache.h b/cache.h index c76f2e9..6980072 100644 --- a/cache.h +++ b/cache.h @@ -1727,6 +1727,14 @@ struct object_info { #define OBJ
[PATCH v6 09/12] fixup: sha1_file: convert gotos to break/continue
From: Jeff Hostetler Signed-off-by: Jeff Hostetler --- sha1_file.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index fc7718a..ce67f27 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1180,30 +1180,30 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, } } -retry: - if (find_pack_entry(real, &e)) - goto found_packed; + while (1) { + if (find_pack_entry(real, &e)) + break; - /* Most likely it's a loose object. */ - if (!sha1_loose_object_info(real, oi, flags)) - return 0; + /* Most likely it's a loose object. */ + if (!sha1_loose_object_info(real, oi, flags)) + return 0; - /* Not a loose object; someone else may have just packed it. */ - reprepare_packed_git(); - if (find_pack_entry(real, &e)) - goto found_packed; - - /* Check if it is a missing object */ - if (fetch_if_missing && repository_format_partial_clone && - !already_retried) { - fetch_object(repository_format_partial_clone, real); - already_retried = 1; - goto retry; - } + /* Not a loose object; someone else may have just packed it. */ + reprepare_packed_git(); + if (find_pack_entry(real, &e)) + break; - return -1; + /* Check if it is a missing object */ + if (fetch_if_missing && repository_format_partial_clone && + !already_retried) { + fetch_object(repository_format_partial_clone, real); + already_retried = 1; + continue; + } + + return -1; + } -found_packed: if (oi == &blank_oi) /* * We know that the caller doesn't actually need the -- 2.9.3
[PATCH v6 03/12] fsck: support refs pointing to promisor objects
From: Jonathan Tan Teach fsck to not treat refs referring to missing promisor objects as an error when extensions.partialclone is set. For the purposes of warning about no default refs, such refs are still treated as legitimate refs. Signed-off-by: Jonathan Tan --- builtin/fsck.c | 8 t/t0410-partial-clone.sh | 24 2 files changed, 32 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index 2934299..ee937bb 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -434,6 +434,14 @@ static int fsck_handle_ref(const char *refname, const struct object_id *oid, obj = parse_object(oid); if (!obj) { + if (is_promisor_object(oid)) { + /* +* Increment default_refs anyway, because this is a +* valid ref. +*/ +default_refs++; +return 0; + } error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid)); errors_found |= ERROR_REACHABLE; /* We'll continue with the rest despite the error.. */ diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 3ddb3b9..bf75162 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -13,6 +13,14 @@ pack_as_from_promisor () { >repo/.git/objects/pack/pack-$HASH.promisor } +promise_and_delete () { + HASH=$(git -C repo rev-parse "$1") && + git -C repo tag -a -m message my_annotated_tag "$HASH" && + git -C repo rev-parse my_annotated_tag | pack_as_from_promisor && + git -C repo tag -d my_annotated_tag && + delete_object repo "$HASH" +} + test_expect_success 'missing reflog object, but promised by a commit, passes fsck' ' test_create_repo repo && test_commit -C repo my_commit && @@ -78,4 +86,20 @@ test_expect_success 'missing reflog object alone fails fsck, even with extension test_must_fail git -C repo fsck ' +test_expect_success 'missing ref object, but promised, passes fsck' ' + rm -rf repo && + test_create_repo repo && + test_commit -C repo my_commit && + + A=$(git -C repo commit-tree -m a HEAD^{tree}) && + + # Reference $A only from ref + git -C repo branch my_branch "$A" && + promise_and_delete "$A" && + + git -C repo config core.repositoryformatversion 1 && + git -C repo config extensions.partialclone "arbitrary string" && + git -C repo fsck +' + test_done -- 2.9.3
[PATCH v6 05/12] fsck: support promisor objects as CLI argument
From: Jonathan Tan Teach fsck to not treat missing promisor objects provided on the CLI as an error when extensions.partialclone is set. Signed-off-by: Jonathan Tan --- builtin/fsck.c | 2 ++ t/t0410-partial-clone.sh | 13 + 2 files changed, 15 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index 4c2a56d..578a7c8 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -750,6 +750,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) struct object *obj = lookup_object(oid.hash); if (!obj || !(obj->flags & HAS_OBJ)) { + if (is_promisor_object(&oid)) + continue; error("%s: object missing", oid_to_hex(&oid)); errors_found |= ERROR_OBJECT; continue; diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 4f9931f..e96f436 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -125,4 +125,17 @@ test_expect_success 'missing object, but promised, passes fsck' ' git -C repo fsck ' +test_expect_success 'missing CLI object, but promised, passes fsck' ' + rm -rf repo && + test_create_repo repo && + test_commit -C repo my_commit && + + A=$(git -C repo commit-tree -m a HEAD^{tree}) && + promise_and_delete "$A" && + + git -C repo config core.repositoryformatversion 1 && + git -C repo config extensions.partialclone "arbitrary string" && + git -C repo fsck "$A" +' + test_done -- 2.9.3
[PATCH v6 04/12] fsck: support referenced promisor objects
From: Jonathan Tan Teach fsck to not treat missing promisor objects indirectly pointed to by refs as an error when extensions.partialclone is set. Signed-off-by: Jonathan Tan --- builtin/fsck.c | 11 +++ t/t0410-partial-clone.sh | 23 +++ 2 files changed, 34 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index ee937bb..4c2a56d 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -149,6 +149,15 @@ static int mark_object(struct object *obj, int type, void *data, struct fsck_opt if (obj->flags & REACHABLE) return 0; obj->flags |= REACHABLE; + + if (is_promisor_object(&obj->oid)) + /* +* Further recursion does not need to be performed on this +* object since it is a promisor object (so it does not need to +* be added to "pending"). +*/ + return 0; + if (!(obj->flags & HAS_OBJ)) { if (parent && !has_object_file(&obj->oid)) { printf("broken link from %7s %s\n", @@ -208,6 +217,8 @@ static void check_reachable_object(struct object *obj) * do a full fsck */ if (!(obj->flags & HAS_OBJ)) { + if (is_promisor_object(&obj->oid)) + return; if (has_sha1_pack(obj->oid.hash)) return; /* it is in pack - forget about it */ printf("missing %s %s\n", printable_type(obj), diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index bf75162..4f9931f 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -102,4 +102,27 @@ test_expect_success 'missing ref object, but promised, passes fsck' ' git -C repo fsck ' +test_expect_success 'missing object, but promised, passes fsck' ' + rm -rf repo && + test_create_repo repo && + test_commit -C repo 1 && + test_commit -C repo 2 && + test_commit -C repo 3 && + git -C repo tag -a annotated_tag -m "annotated tag" && + + C=$(git -C repo rev-parse 1) && + T=$(git -C repo rev-parse 2^{tree}) && + B=$(git hash-object repo/3.t) && + AT=$(git -C repo rev-parse annotated_tag) && + + promise_and_delete "$C" && + promise_and_delete "$T" && + promise_and_delete "$B" && + promise_and_delete "$AT" && + + git -C repo config core.repositoryformatversion 1 && + git -C repo config extensions.partialclone "arbitrary string" && + git -C repo fsck +' + test_done -- 2.9.3
[PATCH v6 00/12] Partial clone part 2: fsck and promisors
From: Jeff Hostetler This is V6 of part 2 of partial clone. This assumes V6 of part 1 is already present. This version fixes a problem in fetch-pack observed in V5. It also contains 2 "fixup" commits that are WIP responses to comments on V5. Part 2 is concerned with fsck, gc, initial support for dynamic object fetching, and tracking promisor objects. Jonathan Tan originally developed this code. I have moved it on top of part 1 and updated it slightly. Jeff Hostetler (2): fixup: sha1_file: convert gotos to break/continue fixup: sha1_file: add TODO Jonathan Tan (10): extension.partialclone: introduce partial clone extension fsck: introduce partialclone extension fsck: support refs pointing to promisor objects fsck: support referenced promisor objects fsck: support promisor objects as CLI argument index-pack: refactor writing of .keep files introduce fetch-object: fetch one promisor object sha1_file: support lazily fetching missing objects rev-list: support termination at promisor objects gc: do not repack promisor packfiles Documentation/git-pack-objects.txt | 11 + Documentation/gitremote-helpers.txt| 7 + Documentation/rev-list-options.txt | 11 + Documentation/technical/repository-version.txt | 12 + Makefile | 1 + builtin/cat-file.c | 2 + builtin/fetch-pack.c | 10 + builtin/fsck.c | 26 +- builtin/gc.c | 3 + builtin/index-pack.c | 113 builtin/pack-objects.c | 37 ++- builtin/prune.c| 7 + builtin/repack.c | 8 +- builtin/rev-list.c | 71 - cache.h| 13 +- environment.c | 1 + fetch-object.c | 27 ++ fetch-object.h | 6 + fetch-pack.c | 48 ++-- fetch-pack.h | 8 + list-objects.c | 29 ++- object.c | 2 +- packfile.c | 77 +- packfile.h | 13 + remote-curl.c | 14 +- revision.c | 33 ++- revision.h | 5 +- setup.c| 7 +- sha1_file.c| 32 ++- t/t0410-partial-clone.sh | 343 + transport.c| 8 + transport.h| 11 + 32 files changed, 899 insertions(+), 97 deletions(-) create mode 100644 fetch-object.c create mode 100644 fetch-object.h create mode 100755 t/t0410-partial-clone.sh -- 2.9.3
[PATCH v6 02/12] fsck: introduce partialclone extension
From: Jonathan Tan Currently, Git does not support repos with very large numbers of objects or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every referenced object is available somewhere in the repo storage. In such an arrangement, the full set of objects is usually available in remote storage, ready to be lazily downloaded. Teach fsck about the new state of affairs. In this commit, teach fsck that missing promisor objects referenced from the reflog are not an error case; in future commits, fsck will be taught about other cases. Signed-off-by: Jonathan Tan --- builtin/fsck.c | 2 +- cache.h | 3 +- packfile.c | 77 +++-- packfile.h | 13 t/t0410-partial-clone.sh | 81 5 files changed, 171 insertions(+), 5 deletions(-) create mode 100755 t/t0410-partial-clone.sh diff --git a/builtin/fsck.c b/builtin/fsck.c index 56afe40..2934299 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -398,7 +398,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid, xstrfmt("%s@{%"PRItime"}", refname, timestamp)); obj->flags |= USED; mark_object_reachable(obj); - } else { + } else if (!is_promisor_object(oid)) { error("%s: invalid reflog entry %s", refname, oid_to_hex(oid)); errors_found |= ERROR_REACHABLE; } diff --git a/cache.h b/cache.h index 35e3f5e..c76f2e9 100644 --- a/cache.h +++ b/cache.h @@ -1587,7 +1587,8 @@ extern struct packed_git { unsigned pack_local:1, pack_keep:1, freshened:1, -do_not_close:1; +do_not_close:1, +pack_promisor:1; unsigned char sha1[20]; struct revindex_entry *revindex; /* something like ".git/objects/pack/x.pack" */ diff --git a/packfile.c b/packfile.c index 4a5fe7a..234797c 100644 --- a/packfile.c +++ b/packfile.c @@ -8,6 +8,11 @@ #include "list.h" #include "streaming.h" #include "sha1-lookup.h" +#include "commit.h" +#include "object.h" +#include "tag.h" +#include "tree-walk.h" +#include "tree.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -643,10 +648,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local) return NULL; /* -* ".pack" is long enough to hold any suffix we're adding (and +* ".promisor" is long enough to hold any suffix we're adding (and * the use xsnprintf double-checks that) */ - alloc = st_add3(path_len, strlen(".pack"), 1); + alloc = st_add3(path_len, strlen(".promisor"), 1); p = alloc_packed_git(alloc); memcpy(p->pack_name, path, path_len); @@ -654,6 +659,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local) if (!access(p->pack_name, F_OK)) p->pack_keep = 1; + xsnprintf(p->pack_name + path_len, alloc - path_len, ".promisor"); + if (!access(p->pack_name, F_OK)) + p->pack_promisor = 1; + xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) { free(p); @@ -781,7 +790,8 @@ static void prepare_packed_git_one(char *objdir, int local) if (ends_with(de->d_name, ".idx") || ends_with(de->d_name, ".pack") || ends_with(de->d_name, ".bitmap") || - ends_with(de->d_name, ".keep")) + ends_with(de->d_name, ".keep") || + ends_with(de->d_name, ".promisor")) string_list_append(&garbage, path.buf); else report_garbage(PACKDIR_FILE_GARBAGE, path.buf); @@ -1889,6 +1899,9 @@ int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags) for (p = packed_git; p; p = p->next) { if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local) continue; + if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) && + !p->pack_promisor) + continue; if (open_pack_index(p)) { pack_errors = 1; continue; @@ -1899,3 +1912,61 @@ int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags) } return r ? r : pack_errors; } + +static int add_promisor_object(const struct object_id *oid, + struct packed_git *pack, +
[PATCH v6 01/12] extension.partialclone: introduce partial clone extension
From: Jonathan Tan Introduce new repository extension option: `extensions.partialclone` See the update to Documentation/technical/repository-version.txt in this patch for more information. Signed-off-by: Jonathan Tan --- Documentation/technical/repository-version.txt | 12 cache.h| 2 ++ environment.c | 1 + setup.c| 7 ++- 4 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt index 00ad379..e03eacc 100644 --- a/Documentation/technical/repository-version.txt +++ b/Documentation/technical/repository-version.txt @@ -86,3 +86,15 @@ for testing format-1 compatibility. When the config key `extensions.preciousObjects` is set to `true`, objects in the repository MUST NOT be deleted (e.g., by `git-prune` or `git repack -d`). + +`partialclone` +~~ + +When the config key `extensions.partialclone` is set, it indicates +that the repo was created with a partial clone (or later performed +a partial fetch) and that the remote may have omitted sending +certain unwanted objects. Such a remote is called a "promisor remote" +and it promises that all such omitted objects can be fetched from it +in the future. + +The value of this key is the name of the promisor remote. diff --git a/cache.h b/cache.h index 6440e2b..35e3f5e 100644 --- a/cache.h +++ b/cache.h @@ -860,10 +860,12 @@ extern int grafts_replace_parents; #define GIT_REPO_VERSION 0 #define GIT_REPO_VERSION_READ 1 extern int repository_format_precious_objects; +extern char *repository_format_partial_clone; struct repository_format { int version; int precious_objects; + char *partial_clone; /* value of extensions.partialclone */ int is_bare; char *work_tree; struct string_list unknown_extensions; diff --git a/environment.c b/environment.c index 8289c25..e52aab3 100644 --- a/environment.c +++ b/environment.c @@ -27,6 +27,7 @@ int warn_ambiguous_refs = 1; int warn_on_object_refname_ambiguity = 1; int ref_paranoia = -1; int repository_format_precious_objects; +char *repository_format_partial_clone; const char *git_commit_encoding; const char *git_log_output_encoding; const char *apply_default_whitespace; diff --git a/setup.c b/setup.c index 03f51e0..58536bd 100644 --- a/setup.c +++ b/setup.c @@ -420,7 +420,11 @@ static int check_repo_format(const char *var, const char *value, void *vdata) ; else if (!strcmp(ext, "preciousobjects")) data->precious_objects = git_config_bool(var, value); - else + else if (!strcmp(ext, "partialclone")) { + if (!value) + return config_error_nonbool(var); + data->partial_clone = xstrdup(value); + } else string_list_append(&data->unknown_extensions, ext); } else if (strcmp(var, "core.bare") == 0) { data->is_bare = git_config_bool(var, value); @@ -463,6 +467,7 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok) } repository_format_precious_objects = candidate.precious_objects; + repository_format_partial_clone = candidate.partial_clone; string_list_clear(&candidate.unknown_extensions, 0); if (!has_common) { if (candidate.is_bare != -1) { -- 2.9.3
[PATCH v6 1/3] list-objects-filter-options: fix 'keword' typo in comment
From: Christian Couder Signed-off-by: Christian Couder Signed-off-by: Jeff Hostetler --- list-objects-filter-options.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 9b28322..52bdec7 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -8,7 +8,7 @@ #include "list-objects-filter-options.h" /* - * Parse value of the argument to the "filter" keword. + * Parse value of the argument to the "filter" keyword. * On the command line this looks like: * --filter= * and in the pack protocol as: -- 2.9.3
[PATCH v6 2/3] list-objects-filter-options: support --no-filter
From: Jeff Hostetler Teach opt_parse_list_objects_filter() to take --no-filter option and to free the contents of struct filter_options. This command line argument will be automatically inherited by commands using OPT_PARSE_LIST_OBJECTS_FILTER(); this includes pack-objects. Signed-off-by: Jeff Hostetler --- Documentation/git-pack-objects.txt | 3 +++ list-objects-filter-options.c | 15 +-- list-objects-filter-options.h | 5 - 3 files changed, 20 insertions(+), 3 deletions(-) diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index b924c6c..aa403d0 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -242,6 +242,9 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle. the resulting packfile. See linkgit:git-rev-list[1] for valid `` forms. +--no-filter:: + Turns off any previous `--filter=` argument. + --missing=:: A debug option to help with future "partial clone" development. This option specifies how missing objects are handled. diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 52bdec7..4c5b34e 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -74,8 +74,19 @@ int opt_parse_list_objects_filter(const struct option *opt, { struct list_objects_filter_options *filter_options = opt->value; - assert(arg); - assert(!unset); + if (unset || !arg) { + list_objects_filter_release(filter_options); + return 0; + } return parse_list_objects_filter(filter_options, arg); } + +void list_objects_filter_release( + struct list_objects_filter_options *filter_options) +{ + free(filter_options->filter_spec); + free(filter_options->sparse_oid_value); + free(filter_options->sparse_path_value); + memset(filter_options, 0, sizeof(*filter_options)); +} diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index dd7e5db..eea44a1 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -52,7 +52,10 @@ int opt_parse_list_objects_filter(const struct option *opt, #define OPT_PARSE_LIST_OBJECTS_FILTER(fo) \ { OPTION_CALLBACK, 0, CL_ARG__FILTER, fo, N_("args"), \ - N_("object filtering"), PARSE_OPT_NONEG, \ + N_("object filtering"), 0, \ opt_parse_list_objects_filter } +void list_objects_filter_release( + struct list_objects_filter_options *filter_options); + #endif /* LIST_OBJECTS_FILTER_OPTIONS_H */ -- 2.9.3
[PATCH v6 3/3] rev-list: support --no-filter argument
From: Jeff Hostetler Teach rev-list to support --no-filter to override a previous --filter= argument. This is to be consistent with commands that use OPT_PARSE macros. Signed-off-by: Jeff Hostetler --- Documentation/rev-list-options.txt | 15 ++- builtin/rev-list.c | 4 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 11bb87f..8d8b7f4 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -715,16 +715,21 @@ ifdef::git-rev-list[] The form '--filter=blob:none' omits all blobs. + The form '--filter=blob:limit=[kmg]' omits blobs larger than n bytes -or units. The value may be zero. +or units. n may be zero. The suffixes k, m, and g can be used to name +units in KiB, MiB, or GiB. For example, 'blob:limit=1k' is the same +as 'blob:limit=1024'. + -The form '--filter=sparse:oid=' uses a sparse-checkout -specification contained in the object (or the object that the expression -evaluates to) to omit blobs that would not be not required for a -sparse checkout on the requested refs. +The form '--filter=sparse:oid=' uses a sparse-checkout +specification contained in the blob (or blob-expression) '' +to omit blobs that would not be not required for a sparse checkout on +the requested refs. + The form '--filter=sparse:path=' similarly uses a sparse-checkout specification contained in . +--no-filter:: + Turn off any previous `--filter=` argument. + --filter-print-omitted:: Only useful with `--filter=`; prints a list of the objects omitted by the filter. Object IDs are prefixed with a ``~'' character. diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 4700473..547a3e0 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -403,6 +403,10 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) filter_options.filter_spec); continue; } + if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { + list_objects_filter_release(&filter_options); + continue; + } if (!strcmp(arg, "--filter-print-omitted")) { arg_print_omitted = 1; continue; -- 2.9.3
[PATCH v6 0/3] Partial clone part 1: object filtering
From: Jeff Hostetler Here is V6 of the list-object filtering, rev-list, and pack-objects. This is an incremental patch series to be applied on top of V5 which is already in 'next'. This version fixes a typo, add the --no-filter option, eliminates a couple of asserts(), and updates the documentation. Christian Couder (1): list-objects-filter-options: fix 'keword' typo in comment Jeff Hostetler (2): list-objects-filter-options: support --no-filter rev-list: support --no-filter argument Documentation/git-pack-objects.txt | 3 +++ Documentation/rev-list-options.txt | 15 ++- builtin/rev-list.c | 4 list-objects-filter-options.c | 17 ++--- list-objects-filter-options.h | 5 - 5 files changed, 35 insertions(+), 9 deletions(-) -- 2.9.3
Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default
Ævar Arnfjörð Bjarmason writes: > Change the build process so that instead of needing to supply > DC_SHA1_SUBMODULE=YesPlease to use the sha1collisiondetection > submodule instead of the copy of the same code shipped in the sha1dc > directory, it uses the submodule by default unless > NO_DC_SHA1_SUBMODULE=UnfortunatelyYes is supplied. > ... > This change removes the "auto" logic Junio added in > cac87dc01d ("sha1collisiondetection: automatically enable when > submodule is populated", 2017-07-01), I feel that automatically > falling back to using sha1dc would defeat the point, which is to smoke > out any remaining users of git.git who have issues cloning the > submodule for whatever reason. I think it makes sense to drop 'auto' if we were to go this route. I do not think the right value for NO_DC_SHA1_SUBMODULE should be "unfortunately yes"; it should be spelled NoThanks or something. It's not like an external reason "unfortunately" prevents you from using the code from the submodule---the person sets it deliberately and by choice. > Makefile:1031: *** The sha1collisiondetection submodule is not > checked out. Please make it available, either by cloning with > --recurse-submodules, or by running "git submodule update > --init". If you can't use it for whatever reason you can define > NO_DC_SHA1_SUBMODULE=UnfortunatelyYes. Stop. Likewise here.
Re: [PATCH v5 07/10] introduce fetch-object: fetch one promisor object
On 12/2/2017 3:33 PM, Christian Couder wrote: On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler wrote: From: Jonathan Tan +void fetch_object(const char *remote_name, const unsigned char *sha1) +{ + struct remote *remote; + struct transport *transport; + struct ref *ref; + + remote = remote_get(remote_name); + if (!remote->url[0]) + die(_("Remote with no URL")); + transport = transport_get(remote, remote->url[0]); + + ref = alloc_ref(sha1_to_hex(sha1)); + hashcpy(ref->old_oid.hash, sha1); + transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1"); + transport_set_option(transport, TRANS_OPT_NO_HAVES, "1"); + transport_fetch_refs(transport, ref); +} I think it would be interesting to return what transport_fetch_refs() returns, so that a caller could know if an error happened. That might help the retry/found_packed loop mentioned in your response on the next patch in this series. I'll make a TODO item for this to investigate. Thanks Jeff
RE: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter
Indeed so. In which case it is short for "selling short", or possibly "short selling". Of course a little searching shows that "shorted" could mean some other things, including possibly the meaning originally suggested. Nevertheless it seems to me that "shortened" is the most appropriate word in modern English. R. Richard Kerry BNCS Engineer, SI SOL Telco & Media Vertical Practice T: +44 (0)20 3618 2669 M: +44 (0)7812 325518 Lync: +44 (0) 20 3618 0778 Room G300, Stadium House, Wood Lane, London, W12 7TA richard.ke...@atos.net > -Original Message- > From: Junio C Hamano [mailto:gits...@pobox.com] > Sent: Tuesday, December 05, 2017 4:06 PM > To: Kerry, Richard > Cc: git@vger.kernel.org; Johannes Schindelin > ; p...@peff.net; liam Beguin > > Subject: Re: [PATCH v2 6/9] rebase -i: update functions to use a flags > parameter > > "Kerry, Richard" writes: > > > "Shorted" is what happens when you put a piece of wire across the > terminals of a battery ... (bang, smoke, etc). > > It's short for "short-circuited". > > Or it is what you do to something that you sell and that you yet do not own, > expecting that you can later buy it cheaper, allowing you to pocket the > difference ;-). > Atos, Atos Consulting, Worldline and Canopy The Open Cloud Company are trading names used by the Atos group. The following trading entities are registered in England and Wales: Atos IT Services UK Limited (registered number 01245534), Atos Consulting Limited (registered number 04312380), Atos Worldline UK Limited (registered number 08514184) and Canopy The Open Cloud Company Limited (registration number 08011902). The registered office for each is at 4 Triton Square, Regent’s Place, London, NW1 3HG.The VAT No. for each is: GB232327983. This e-mail and the documents attached are confidential and intended solely for the addressee, and may contain confidential or privileged information. If you receive this e-mail in error, you are not authorised to copy, disclose, use or retain it. Please notify the sender immediately and delete this email from your systems. As emails may be intercepted, amended or lost, they are not secure. Atos therefore can accept no liability for any errors or their content. Although Atos endeavours to maintain a virus-free network, we do not warrant that this transmission is virus-free and can accept no liability for any damages resulting from any virus transmitted. The risks are deemed to be accepted by everyone who communicates with Atos by email.
Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter
"Kerry, Richard" writes: > "Shorted" is what happens when you put a piece of wire across the terminals > of a battery ... (bang, smoke, etc). > It's short for "short-circuited". Or it is what you do to something that you sell and that you yet do not own, expecting that you can later buy it cheaper, allowing you to pocket the difference ;-).
Re: [PATCH v6 4/7] checkout: describe_detached_head: remove ellipsis after committish
Ann T Ropea writes: > v6: polish to take Junio's comments from > into account > t/t2020-checkout-detach.sh| 114 > ++ > ... Thanks; with this one replaced, I'd expect that poisoned gettext test to pass now. I saw some style issues, so I'll queue a tentative fix-up on top. > + # The first detach operation is more chatty than the following ones. > + cat 1>1st_detach <<'EOF' && > +Note: checking out 'HEAD^'. > + > +You are in 'detached HEAD' state. You can look around, make experimental > +changes and commit them, and you can discard any commits you make in this > +state without impacting any branches by performing another checkout. > + > +If you want to create a new branch to retain commits you create, you may > +do so (now or later) by using -b with the checkout command again. Example: > + > + git checkout -b > + > +HEAD is now at 7c7cd714e262 three > +EOF It looks somewhat strange to explicitly say 1> when redirecting the standard output. Also we prefer to indent our here-doc to the same depth as commands by using "<<-", i.e. cat >1st_detach <<-'EOF' && Note: checking out 'HEAD^'. ... EOF > + reset && check_not_detached && sane_unset GIT_PRINT_SHA1_ELLIPSIS && It also was a bit irritating to see multiple commands form an overlong single line, like this one. > + test_i18ncmp 1st_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS && > + > + GIT_PRINT_SHA1_ELLIPSIS="no" git -c 'core.abbrev=12' checkout HEAD^ > 1>actual 2>&1 && > + check_detached && > + test_i18ncmp 2nd_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS && This part uses "set and export only for a single invocation of git", and because the variable is unset at the end of the previous step after 1st_detach and actual are compared, this unset at the end feels redundant. > + GIT_PRINT_SHA1_ELLIPSIS='nope' && export GIT_PRINT_SHA1_ELLIPSIS && git > -c 'core.abbrev=12' checkout HEAD^ 1>actual 2>&1 && But now this part sets and exports the variable for the remainder of the script (until it is unset). Is the use of these two styles, i.e. VAR=value && export VAR && git -c core.abbrev=12 subcmd VAR=value git -c core.abbrev=12 subcmd intended? If so for what purpose? It's not like we are testing if the shell implements environment variables correctly, so I'd expect that the result would be easier to follow if it stuck to a single style and used it consistently.
Re: [PATCH v5 08/10] sha1_file: support lazily fetching missing objects
On 12/2/2017 1:29 PM, Christian Couder wrote: On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler wrote: From: Jonathan Tan [...] int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) { [...] - if (!find_pack_entry(real, &e)) { - /* Most likely it's a loose object. */ - if (!sha1_loose_object_info(real, oi, flags)) - return 0; +retry: + if (find_pack_entry(real, &e)) + goto found_packed; - /* Not a loose object; someone else may have just packed it. */ - if (flags & OBJECT_INFO_QUICK) { - return -1; - } else { - reprepare_packed_git(); - if (!find_pack_entry(real, &e)) - return -1; - } + /* Most likely it's a loose object. */ + if (!sha1_loose_object_info(real, oi, flags)) + return 0; + + /* Not a loose object; someone else may have just packed it. */ + reprepare_packed_git(); + if (find_pack_entry(real, &e)) + goto found_packed; + + /* Check if it is a missing object */ + if (fetch_if_missing && repository_format_partial_clone && + !already_retried) { + fetch_object(repository_format_partial_clone, real); + already_retried = 1; + goto retry; } + return -1; + +found_packed: if (oi == &blank_oi) [...] The above is adding 2 different gotos into this function while there are quite simple ways to avoid them. See https://public-inbox.org/git/cap8ufd2thrj7+rxmismutuopxqv4wm7azsejpd_fheoycp+...@mail.gmail.com/ and the follow up email in the thread. Personally, I'm OK with limited goto's like these. Yes, they are ugly, but not that complicated. Alternatively, we could put everything between the 2 labels in a while (1) {...} and replace the goto's with break's and continue's. I think it would be better to revisit this later, as I'd rather not start refactoring sha1_file.c in the middle of what is already a 30+ commit patch series (for the 3 parts of partial clone). Thoughts? Jeff
Re: [PATCH v5 10/10] gc: do not repack promisor packfiles
On 12/2/2017 12:39 PM, Christian Couder wrote: On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler wrote: From: Jonathan Tan pack_as_from_promisor () { HASH=$(git -C repo pack-objects .git/objects/pack/pack) && >repo/.git/objects/pack/pack-$HASH.promisor + echo $HASH } Now the exit code of the above function will always be the exit code of "echo $HASH". It seems to me that it would be better to add "&&" at the end of the line above the "echo $HASH". Fixed. Thanks. Jeff
Re: gitattributes not read for diff-tree anymore in 2.15?
On Mon, Dec 04, 2017 at 15:03:55 -0800, Brandon Williams wrote: > Reading the attributes files should be done regardless if the gitmodules > file is read. The gitmodules file should only come into play if you are > dealing with submodules. Yeah, it doesn't seem to make sense why this commit breaks us, but there it is *shrug*. > Do you mind providing a reproduction recipe with expected outcome vs > actual outcome and I can take a closer look. I'll work on one. It isn't as simple as I thought it was :) . The setup we do before running the checks is apparently involved as running it from the command line is not exhibiting the difference. --Ben
Re: [PATCH] imap-send: URI encode server folder
Eric Sunshine writes: > ... Can you > expand the commit message a bit to make it more self-contained? At > minimum, perhaps show the error message you were experiencing, and > cite (as Daniel pointed out) RFC 3986 and the bit about a "legal" URL > not containing brackets. Thanks for a good suggestion. > > Also, a natural question which pops into the head of someone reading > this patch is whether other parts of the URL (host, user, etc.) also > need to be handled similarly. It's possible that you audited the code > and determined that they are handled fine already, but the reader of > the commit message is unable to infer that. Consequently, it might be > nice to have a sentence about that, as well ("other parts of the URL > are already encoded, thus are fine" or "other parts of the URL are not > subject to this problem because ..."). > > The patch itself looks okay (from a cursory read). > > Thanks. > >> Reported-by: Doron Behar >> Signed-off-by: Nicolas Morey-Chaisemartin >> --- >> imap-send.c | 8 +++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/imap-send.c b/imap-send.c >> index 54e6a80fd..36c7c1b4f 100644 >> --- a/imap-send.c >> +++ b/imap-send.c >> @@ -1412,6 +1412,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc, >> struct credential *cred) >> { >> CURL *curl; >> struct strbuf path = STRBUF_INIT; >> + char *uri_encoded_folder; >> >> if (curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK) >> die("curl_global_init failed"); >> @@ -1429,7 +1430,12 @@ static CURL *setup_curl(struct imap_server_conf >> *srvc, struct credential *cred) >> strbuf_addstr(&path, server.host); >> if (!path.len || path.buf[path.len - 1] != '/') >> strbuf_addch(&path, '/'); >> - strbuf_addstr(&path, server.folder); >> + >> + uri_encoded_folder = curl_easy_escape(curl, server.folder, 0); >> + if (!uri_encoded_folder) >> + die("failed to encode server folder"); >> + strbuf_addstr(&path, uri_encoded_folder); >> + curl_free(uri_encoded_folder); >> >> curl_easy_setopt(curl, CURLOPT_URL, path.buf); >> strbuf_release(&path); >> -- >> 2.15.1.272.g8e603414b
Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter
Hi, On 05/12/17 07:41 AM, Kerry, Richard wrote: > > "Shorted" is what happens when you put a piece of wire across the terminals > of a battery ... (bang, smoke, etc). > It's short for "short-circuited". > Yes, I think you mean "shortened" in this case. > Thanks for the explanation. Sorry, my eyes stopped at the lowercase 's' in Johannes message. Will fix. > Regards, > Richard. > > > > Richard Kerry > BNCS Engineer, SI SOL Telco & Media Vertical Practice > > T: +44 (0)20 3618 2669 > M: +44 (0)7812 325518 > Lync: +44 (0) 20 3618 0778 > Room G300, Stadium House, Wood Lane, London, W12 7TA > richard.ke...@atos.net > > [...] Thanks, Liam
[PATCH v3] git-gui: Prevent double UTF-8 conversion
Convert author's name and e-mail address from the UTF-8 (or any other) encoding in load_last_commit function the same way commit message is converted. Amending commits in git-gui without such conversion breaks UTF-8 strings. For example, "\305\201ukasz" (as written by git cat-file) becomes "\303\205\302\201ukasz" in an amended commit. Signed-off-by: Łukasz Stelmach --- git-gui/lib/commit.tcl | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/git-gui/lib/commit.tcl b/git-gui/lib/commit.tcl index 83620b7cb..75ea965da 100644 --- a/git-gui/lib/commit.tcl +++ b/git-gui/lib/commit.tcl @@ -25,6 +25,8 @@ You are currently in the middle of a merge that has not been fully completed. Y set msg {} set parents [list] if {[catch { + set name "" + set email "" set fd [git_read cat-file commit $curHEAD] fconfigure $fd -encoding binary -translation lf # By default commits are assumed to be in utf-8 @@ -34,9 +36,7 @@ You are currently in the middle of a merge that has not been fully completed. Y lappend parents [string range $line 7 end] } elseif {[string match {encoding *} $line]} { set enc [string tolower [string range $line 9 end]] - } elseif {[regexp "author (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} { - set commit_author [list name $name email $email date $time] - } + } elseif {[regexp "author (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} { } } set msg [read $fd] close $fd @@ -44,7 +44,13 @@ You are currently in the middle of a merge that has not been fully completed. Y set enc [tcl_encoding $enc] if {$enc ne {}} { set msg [encoding convertfrom $enc $msg] + set name [encoding convertfrom $enc $name] + set email [encoding convertfrom $enc $email] } + if {$name ne {} && $email ne {}} { + set commit_author [list name $name email $email date $time] + } + set msg [string trim $msg] } err]} { error_popup [strcat [mc "Error loading commit data for amend:"] "\n\n$err"] -- 2.11.0
Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default
On Tue, Dec 05 2017, Junio C. Hamano jotted: > Ævar Arnfjörð Bjarmason writes: > >>> I'm not sure how I feel about this. I see your point that there's no >>> real value in maintaining two systems indefinitely. At the same time, I >>> wonder how much value the submodule strategy is actually bringing us. >>> >>> IOW, are we agreed that the path forward is to get everybody using the >>> submodule? >> ... >> In no particular order: >> >> * I don't feel strongly about 2-4/4 in this series. I just hacked this >>up because it occurred to me that I'd left this sha1dc stuff in some >>in-between state and we'd talked about eventually moving forward with >>this. > > Good. > >>We've had two releases with the submodule being purely optional, if >>we're going to keep it it seems logical to start at least using it by >>default. > > With a need for a patch like 1/4, I suspect two release cycles is > way too short for making a move like 2-4/4, though. You're conflating two unrelated things, which to be fair I'm confusingly doing by submitting all this together. 1) Since 2.14 we've had the "auto" rule and DC_SHA1_SUBMODULE=[YesPlease|auto], so we'll prefer the submodule if it's there. So we've been testing if the mere presence of a .gitmodules breaks something for someone, seems like it doesn't. 2) Then in the 2.15 release Takashi Iwai submitted a feature to link to an external SHA1DC. This is used in the SuSE 2.15 package here: http://download.opensuse.org/tumbleweed/repo/src-oss/suse/src/ However, as you'll see if you extract that package they don't run into that bug, because they're building it from a tarball which has an empty sha1collisiondetection/ directory as noted in my 87bmjdscdr@evledraar.booking.com. Takashi *would* run into an error with my 1/4 if he was building from git.git, or if "make dist" included sha1collisiondetection/, but I don't see a reason to hold anything back back on that account. The only users of DC_SHA1_EXTERNAL=YesPlease are going to be packagers who know what they're doing, and if we start erroring out for them on this obscure option that's going to be trivially solved. I don't see why this obscure edge case with #2 should keep us from deciding whatever we'd decide with #1. They're really unrelated, #2 practically speaking only impacts tarball consumers, #1 impacts git.git users. It seems logical to me if we're going to move forward with #1 at all by first making the submodule the default & then depending on how that turns out making it a hard dependency, we'd do it now. We'll learn nothing new by shipping a 2.16 with DC_SHA1_SUBMODULE=auto that we haven't already learned in 2.14 & 2.15.
Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default
Ævar Arnfjörð Bjarmason writes: >> I'm not sure how I feel about this. I see your point that there's no >> real value in maintaining two systems indefinitely. At the same time, I >> wonder how much value the submodule strategy is actually bringing us. >> >> IOW, are we agreed that the path forward is to get everybody using the >> submodule? > ... > In no particular order: > > * I don't feel strongly about 2-4/4 in this series. I just hacked this >up because it occurred to me that I'd left this sha1dc stuff in some >in-between state and we'd talked about eventually moving forward with >this. Good. >We've had two releases with the submodule being purely optional, if >we're going to keep it it seems logical to start at least using it by >default. With a need for a patch like 1/4, I suspect two release cycles is way too short for making a move like 2-4/4, though.
Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit
Robert Abel writes: > On 05 Dec 2017 01:27, Junio C Hamano wrote: >> I know all of the above, but I think you misunderstood the point I >> wanted to raise, so let me try again. The thing is, none of what >> you just wrote changes the fact that lack of callers that want to do >> "multi-line" is IRRELEVANT. > > I disagree. The commit comment is meant to give context to the > introduced changes. One change is the additional comment for > __git_eread, which now clearly states that only a single line is read. I still do not understand why you think the 'next' person would care about the (lack of )multi-line aspect of the helper. Let's see how well the proposed log message gives the "context to the introduced changes" (from your v3). __git_eread is used to read a single line of a given file (if it exists) into a single variable without the EOL. All six current users of __git_eread use it that way and don't expect multi-line content. That does not include anything incorrect; but. The helper is about (1) reading the first line and (2) reading it as a whole into a single variable. Both are already covered by the first sentence, and there is no need to say 'and don't expect ...", unless you want to stress something. And it places a stress on the former, which is a less relevant thing, WITHOUT giving the same treatment to the latter, which is a more relevant thing. After all, this patch is not about replacing an earlier implementation that did $2=$(cat "$1") with read $2 <"$1" If that were the case, _then_ the fact that the purpose of the helper is to read from a single-liner file (i.e. we do not expect the input file to have more than one line) is VERY relevant. But this is not such a patch. And after readers read the above, they find this: Therefore, this patch removes the unused capability to split file conents into tokens by passing multiple variable names. And because the previous paragraph placed an emphasis on a wrong aspect of the context of the calls to the helper function, this "Therefore" does not quite "click" in the readers' minds. The reason why it is OK to remove the multi-variable feature is because the callers of the helper want to always read the result into a single variable, but the "no need for multi-variable" that they read in the first sentence of the previous paragraph is less strong in their mind by now, because they read an irrelevant (for the purpose of this "Therefore") mention of "no need for multi-line" aspect of the helper. Perhaps __git_eread is used to read the contents of a single-liner file into a single variable while dropping EOL. It is misleading to use the "read" built-in with "$@", as if some callers would want the contents read from the file to be split into multiple variables. Explicitly use a single variable, and also document that the helper only reads the first line (simply because the input files are designed to be single-liner files). would say it the same thing, but with emphasis on the right aspect of the facts. I would also rephase the new in-code comment # Helper function to read the first line of a file into a variable. to un-stress "the first line of a file" and place more stress on the fact that it is designed to read from a single-liner file (there is a subtle but important distinction between the two). # read the contents of a single-liner file into a variable, # while dropping the end-of-line from it. or something like that, perhaps.
RE: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter
"Shorted" is what happens when you put a piece of wire across the terminals of a battery ... (bang, smoke, etc). It's short for "short-circuited". Yes, I think you mean "shortened" in this case. Regards, Richard. Richard Kerry BNCS Engineer, SI SOL Telco & Media Vertical Practice T: +44 (0)20 3618 2669 M: +44 (0)7812 325518 Lync: +44 (0) 20 3618 0778 Room G300, Stadium House, Wood Lane, London, W12 7TA richard.ke...@atos.net > -Original Message- > From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On > Behalf Of Junio C Hamano > Sent: Tuesday, December 05, 2017 12:37 PM > To: liam Beguin > Cc: Johannes Schindelin ; > git@vger.kernel.org; p...@peff.net > Subject: Re: [PATCH v2 6/9] rebase -i: update functions to use a flags > parameter > > liam Beguin writes: > > > I'll change it to TODO_LIST_SHORTED_IDs. TODO_LIST_SHORTED_INSNS > would > > suggest the flag changes both parts of the todo. > > I am not a native speaker, but SHORTED does not sound like a right phrase. > When you make something shorter, that thing is "shortened", not "shorted". Atos, Atos Consulting, Worldline and Canopy The Open Cloud Company are trading names used by the Atos group. The following trading entities are registered in England and Wales: Atos IT Services UK Limited (registered number 01245534), Atos Consulting Limited (registered number 04312380), Atos Worldline UK Limited (registered number 08514184) and Canopy The Open Cloud Company Limited (registration number 08011902). The registered office for each is at 4 Triton Square, Regent’s Place, London, NW1 3HG.The VAT No. for each is: GB232327983. This e-mail and the documents attached are confidential and intended solely for the addressee, and may contain confidential or privileged information. If you receive this e-mail in error, you are not authorised to copy, disclose, use or retain it. Please notify the sender immediately and delete this email from your systems. As emails may be intercepted, amended or lost, they are not secure. Atos therefore can accept no liability for any errors or their content. Although Atos endeavours to maintain a virus-free network, we do not warrant that this transmission is virus-free and can accept no liability for any damages resulting from any virus transmitted. The risks are deemed to be accepted by everyone who communicates with Atos by email.
Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter
liam Beguin writes: > I'll change it to TODO_LIST_SHORTED_IDs. TODO_LIST_SHORTED_INSNS would > suggest the flag changes both parts of the todo. I am not a native speaker, but SHORTED does not sound like a right phrase. When you make something shorter, that thing is "shortened", not "shorted".
Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids
liam Beguin writes: > Good suggestion. Would transform_todos() work too? If the function is about munging multiple of them, then todo"s" would work well; I wasn't focusing on singular vs plural, as I thought the choice between them needs much less thought to make correctly.
Re: [PATCH 2/2] progress: drop delay-threshold code
> On 05 Dec 2017, at 12:10, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Dec 5, 2017 at 11:37 AM, Lars Schneider > wrote: >> >>> On 04 Dec 2017, at 23:07, Jeff King wrote: >>> >>> From: Lars Schneider >>> >>> Since 180a9f2268 (provide a facility for "delayed" progress >>> reporting, 2007-04-20), the progress code has allowed >>> callers to skip showing progress if they have reached a >>> percentage-threshold of the total work before the delay >>> period passes. >>> >>> But since 8aade107dd (progress: simplify "delayed" progress >>> API, 2017-08-19), that parameter is not available to outside >>> callers (we always passed zero after that commit, though >>> that was corrected in the previous commit to "100%"). >>> >>> Let's drop the threshold code, which never triggers in >>> any meaningful way. >>> >>> Signed-off-by: Jeff King >>> --- >>> I tweaked your patch slightly to clean up the now-simplified >>> conditional. >> >> Your first patch ("progress: set default delay threshold to 100%, not 0%") >> as well as the modifications to this one look good to me. Feel free >> to add my "Signed-off-by: Lars Schneider ". >> >> Thanks, >> Lars >> >> >> PS: How do you generate the commit references "hash (first line, date)"? >> Git log pretty print? > > $ git grep -A5 'Copy commit summary' Documentation/SubmittingPatches > Documentation/SubmittingPatches:151:The "Copy commit summary" command > of gitk can be used to obtain this > Documentation/SubmittingPatches-152-format, or this invocation of `git show`: > Documentation/SubmittingPatches-153- > Documentation/SubmittingPatches-154- > Documentation/SubmittingPatches-155-git show -s --date=short > --pretty='format:%h ("%s", %ad)' Thanks :-)
Re: [PATCH v4 9/9] t3512/t3513: remove KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1
On 04/12/17 19:24, Stefan Beller wrote: > On Fri, Nov 24, 2017 at 3:07 AM, Phillip Wood > wrote: >> From: Phillip Wood >> >> Now that the sequencer creates commits without forking 'git commit' it >> does not see an empty commit in these tests which fixes the known >> breakage. Note that logic for handling >> KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1 is not removed from >> lib-submodule-update.sh as it is still used by other tests. >> >> Signed-off-by: Phillip Wood >> --- >> >> Notes: >> The output of the tests with after the previous patch looks like the >> output of the merge tests (see below), so I'm hopeful that this is a >> genuine fix, but someone who knows about submodules should check that >> things are in fact working properly now. > > Looking at the patch only (in combination with the history of the > submodule tests, > 283f56a40b (cherry-pick: add t3512 for submodule updates, 2014-06-15)) > this patch > looks good to me. I wonder though if this needs to be squashed in another > commit > to keep the test suite working for each patch and have no intermittent > failure in the > series. Hi Stefan Thanks for looking at this, it's good to know you think it's OK. I'll leave it to Junio's discretion whether to squash this into the previous patch Thanks again Phillip > Thanks, > Stefan >
Re: [PATCH v4 7/9] sequencer: load commit related config
On 05/12/17 11:21, Phillip Wood wrote: > On 04/12/17 18:30, Junio C Hamano wrote: >> Phillip Wood writes: >> >>> --- a/builtin/rebase--helper.c >>> +++ b/builtin/rebase--helper.c >>> @@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = { >>> NULL >>> }; >>> >>> +static int git_rebase_helper_config(const char *k, const char *v, void *cb) >>> +{ >>> + int status; >>> + >>> + status = git_sequencer_config(k, v, NULL); >>> + if (status) >>> + return status; >>> + >>> + return git_default_config(k, v, NULL); >>> +} >>> + >> >> Sorry for spotting the problem this late, but this code is >> unfortunate and we will need to revisit it later; we may want to do >> so sooner rather than later. > > If it needs fixing then doing it before it hits next makes sense. > >> When k,v is a valid configuration that is handled by >> sequencer_config() successfully, this code still needs to call into >> default_config() with the same k,v, only to get it ignored. > > I'm a bit confused by this sentence. Do you mean that when k,v is a > valid configuration that is successfully handled by sequencer_config() > this code unnecessarily calls default_config() as there is no need to > call default_config() if k,v have already been handled? > >> The problem lies in the (mis)design of git_sequencer_config(). The >> function should either allow the caller to notice that (k,v) has >> already been handled, or be the last one in the callback by making a >> call to default_config() itself. > > The problem is that git_gpg_config() provides no indication if it has > handled k,v so there's no way to avoid the call to default_config() in > that case. builtin/am.c and builtin/commit.c both do something like > > static int git_am_config(const char *k, const char *v, void *cb) > { > int status; > > status = git_gpg_config(k, v, NULL); > if (status) > return status; > > return git_default_config(k, v, NULL); > } > Looking through the callers of gpg_config() it should be fairly easy to convert it to take an extra parameter in the same way as you suggested for sequencer_config() > Looking more generally at sequencer_config() I wonder if we should be > providing a warning or an error if the config contains an invalid > cleanup mode. Also should it be calling git_diff_ui_config() to set > things up for print_commit_summary()? (I'm not sure if anything in that > function is affected by diff config settings) > > Let me know what you think. I should have time to update this patch set > later in the week. > > Best Wishes > > Phillip > >> For the former, because this helper does not have to be called >> directly as a git_config() callback, but instead it is always meant >> to be called indirectly from another git_config() callback (like >> git_rebase_helper_config() here, and common_config() in >> builtin/revert.c like we see below), it does *not* have to be >> constrained by the function signature required for it to be a config >> callback. It could take a pointer to an int that stores if 'k' was >> handled inside the function, >> >> int git_sequencer_config_helper(char *k, char *v, int *handled); >> >> for example. >> >
Re: [PATCH v4 7/9] sequencer: load commit related config
On 04/12/17 18:30, Junio C Hamano wrote: > Phillip Wood writes: > >> --- a/builtin/rebase--helper.c >> +++ b/builtin/rebase--helper.c >> @@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = { >> NULL >> }; >> >> +static int git_rebase_helper_config(const char *k, const char *v, void *cb) >> +{ >> +int status; >> + >> +status = git_sequencer_config(k, v, NULL); >> +if (status) >> +return status; >> + >> +return git_default_config(k, v, NULL); >> +} >> + > > Sorry for spotting the problem this late, but this code is > unfortunate and we will need to revisit it later; we may want to do > so sooner rather than later. If it needs fixing then doing it before it hits next makes sense. > When k,v is a valid configuration that is handled by > sequencer_config() successfully, this code still needs to call into > default_config() with the same k,v, only to get it ignored. I'm a bit confused by this sentence. Do you mean that when k,v is a valid configuration that is successfully handled by sequencer_config() this code unnecessarily calls default_config() as there is no need to call default_config() if k,v have already been handled? > The problem lies in the (mis)design of git_sequencer_config(). The > function should either allow the caller to notice that (k,v) has > already been handled, or be the last one in the callback by making a > call to default_config() itself. The problem is that git_gpg_config() provides no indication if it has handled k,v so there's no way to avoid the call to default_config() in that case. builtin/am.c and builtin/commit.c both do something like static int git_am_config(const char *k, const char *v, void *cb) { int status; status = git_gpg_config(k, v, NULL); if (status) return status; return git_default_config(k, v, NULL); } Looking more generally at sequencer_config() I wonder if we should be providing a warning or an error if the config contains an invalid cleanup mode. Also should it be calling git_diff_ui_config() to set things up for print_commit_summary()? (I'm not sure if anything in that function is affected by diff config settings) Let me know what you think. I should have time to update this patch set later in the week. Best Wishes Phillip > For the former, because this helper does not have to be called > directly as a git_config() callback, but instead it is always meant > to be called indirectly from another git_config() callback (like > git_rebase_helper_config() here, and common_config() in > builtin/revert.c like we see below), it does *not* have to be > constrained by the function signature required for it to be a config > callback. It could take a pointer to an int that stores if 'k' was > handled inside the function, > > int git_sequencer_config_helper(char *k, char *v, int *handled); > > for example. >
Re: [PATCH 2/2] progress: drop delay-threshold code
On Tue, Dec 5, 2017 at 11:37 AM, Lars Schneider wrote: > >> On 04 Dec 2017, at 23:07, Jeff King wrote: >> >> From: Lars Schneider >> >> Since 180a9f2268 (provide a facility for "delayed" progress >> reporting, 2007-04-20), the progress code has allowed >> callers to skip showing progress if they have reached a >> percentage-threshold of the total work before the delay >> period passes. >> >> But since 8aade107dd (progress: simplify "delayed" progress >> API, 2017-08-19), that parameter is not available to outside >> callers (we always passed zero after that commit, though >> that was corrected in the previous commit to "100%"). >> >> Let's drop the threshold code, which never triggers in >> any meaningful way. >> >> Signed-off-by: Jeff King >> --- >> I tweaked your patch slightly to clean up the now-simplified >> conditional. > > Your first patch ("progress: set default delay threshold to 100%, not 0%") > as well as the modifications to this one look good to me. Feel free > to add my "Signed-off-by: Lars Schneider ". > > Thanks, > Lars > > > PS: How do you generate the commit references "hash (first line, date)"? > Git log pretty print? $ git grep -A5 'Copy commit summary' Documentation/SubmittingPatches Documentation/SubmittingPatches:151:The "Copy commit summary" command of gitk can be used to obtain this Documentation/SubmittingPatches-152-format, or this invocation of `git show`: Documentation/SubmittingPatches-153- Documentation/SubmittingPatches-154- Documentation/SubmittingPatches-155-git show -s --date=short --pretty='format:%h ("%s", %ad)'
Re: [PATCH 2/2] progress: drop delay-threshold code
> On 04 Dec 2017, at 23:07, Jeff King wrote: > > From: Lars Schneider > > Since 180a9f2268 (provide a facility for "delayed" progress > reporting, 2007-04-20), the progress code has allowed > callers to skip showing progress if they have reached a > percentage-threshold of the total work before the delay > period passes. > > But since 8aade107dd (progress: simplify "delayed" progress > API, 2017-08-19), that parameter is not available to outside > callers (we always passed zero after that commit, though > that was corrected in the previous commit to "100%"). > > Let's drop the threshold code, which never triggers in > any meaningful way. > > Signed-off-by: Jeff King > --- > I tweaked your patch slightly to clean up the now-simplified > conditional. Your first patch ("progress: set default delay threshold to 100%, not 0%") as well as the modifications to this one look good to me. Feel free to add my "Signed-off-by: Lars Schneider ". Thanks, Lars PS: How do you generate the commit references "hash (first line, date)"? Git log pretty print?