Re: [PATCH v3 2/3] mergetool: don't require a work tree for --tool-help
On Mon, Oct 13, 2014 at 12:16:55PM -0700, Junio C Hamano wrote:
> David Aguilar <dav...@gmail.com> writes:
>
> > From: Charles Bailey <cbaile...@bloomberg.net>
> >
> > Signed-off-by: Charles Bailey <cbaile...@bloomberg.net>
> > Signed-off-by: David Aguilar <dav...@gmail.com>
> > ---
> > Changes since v2:
> > This now uses the new git_dir_init function.
> >
> >  git-mergetool.sh | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/git-mergetool.sh b/git-mergetool.sh
> > index 96a61ba..cddb533 100755
> > --- a/git-mergetool.sh
> > +++ b/git-mergetool.sh
> > @@ -10,11 +10,11 @@ USAGE='[--tool=<tool>] [--tool-help]
> >  	[-y|--no-prompt|--prompt]
> >  	[<file to merge> ...]'
> >  SUBDIRECTORY_OK=Yes
> > +NONGIT_OK=Yes
> >  OPTIONS_SPEC=
> >  TOOL_MODE=merge
> >  . git-sh-setup
> >  . git-mergetool--lib
> > -require_work_tree
> >
> >  # Returns true if the mode reflects a symlink
> >  is_symlink () {
> > @@ -378,6 +378,9 @@ prompt_after_failed_merge () {
> >  	done
> >  }
> >
> > +require_work_tree
> > +git_dir_init
>
> This is somewhat curious.  Shouldn't the order of these be swapped?

Yes.  I'll send a replacement patch for 2/3 only.
--
David
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
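The ordering Junio points out can be sketched outside of git entirely:
git_dir_init has to run before require_work_tree, because the work-tree
check depends on the repository having been located first.  A toy model
(only the two function names come from the patch; both bodies are
invented for illustration):

```shell
#!/bin/sh
# Toy model of the ordering issue: git_dir_init must run before
# require_work_tree, because the work-tree check depends on state that
# the repository-discovery step sets up.
GIT_DIR=
git_dir_init () {
	GIT_DIR=.git    # stand-in for locating the real $GIT_DIR
}
require_work_tree () {
	test -n "$GIT_DIR" || { echo "fatal: not a git repository"; return 1; }
	echo "work tree ok"
}
git_dir_init
require_work_tree
```

Running the two calls in the opposite order would hit the "fatal" branch,
which is exactly the bug the replacement patch for 2/3 avoids.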
Re: [PATCH] mergetool: add an option for writing to a temporary directory
On Mon, Oct 13, 2014 at 12:24:41PM -0700, Junio C Hamano wrote:
> David Aguilar <dav...@gmail.com> writes:
>
> > Teach mergetool to write files in a temporary directory when
> > 'mergetool.writeToTemp' is true.  This is helpful for tools such as
> > Eclipse which cannot cope with multiple copies of the same file in
> > the worktree.
>
> With this can we drop the "change the temporary file name" patch by
> Robin Rosenberg?
>
> http://thread.gmane.org/gmane.comp.version-control.git/255457/focus=255599
> Message-Id: <1408607240-11369-1-git-send-email-robin.rosenb...@dewire.com>

I would think so, but I'm biased ;-)
--
David
Re: [PATCH v3 2/3] mergetool: don't require a work tree for --tool-help
On Tue, Oct 14, 2014 at 11:35:11PM -0700, David Aguilar wrote:
> On Mon, Oct 13, 2014 at 12:16:55PM -0700, Junio C Hamano wrote:
> > David Aguilar <dav...@gmail.com> writes:
> >
> > > From: Charles Bailey <cbaile...@bloomberg.net>
> > >
> > > Signed-off-by: Charles Bailey <cbaile...@bloomberg.net>
> > > Signed-off-by: David Aguilar <dav...@gmail.com>
> > > ---
> > > Changes since v2:
> > > This now uses the new git_dir_init function.
> > >
> > >  git-mergetool.sh | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/git-mergetool.sh b/git-mergetool.sh
> > > index 96a61ba..cddb533 100755
> > > --- a/git-mergetool.sh
> > > +++ b/git-mergetool.sh
> > > @@ -10,11 +10,11 @@ USAGE='[--tool=<tool>] [--tool-help]
> > >  	[-y|--no-prompt|--prompt]
> > >  	[<file to merge> ...]'
> > >  SUBDIRECTORY_OK=Yes
> > > +NONGIT_OK=Yes
> > >  OPTIONS_SPEC=
> > >  TOOL_MODE=merge
> > >  . git-sh-setup
> > >  . git-mergetool--lib
> > > -require_work_tree
> > >
> > >  # Returns true if the mode reflects a symlink
> > >  is_symlink () {
> > > @@ -378,6 +378,9 @@ prompt_after_failed_merge () {
> > >  	done
> > >  }
> > >
> > > +require_work_tree
> > > +git_dir_init
> >
> > This is somewhat curious.  Shouldn't the order of these be swapped?
>
> Yes.  I'll send a replacement patch for 2/3 only.

Nevermind, I noticed you already fixed this up in "pu".

Thank you,
--
David
[PATCH] mergetools/meld: do not rely on the output of `meld --help`
We cannot rely on the output of `meld --help` when determining whether
or not meld understands the --output option.  Newer versions of meld
print a generic help message that does not mention --output even though
it is supported.

Add a mergetool.meld.compat variable to enable the historical behavior
and make the --output mode the default.

Reported-by: Andrey Novoseltsev <novos...@gmail.com>
Signed-off-by: David Aguilar <dav...@gmail.com>
---
 Documentation/config.txt | 7 +++++++
 mergetools/meld          | 4 ++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 04a1e2f..a942bfe 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1755,6 +1755,13 @@ mergetool.<tool>.trustExitCode::
 	if the file has been updated, otherwise the user is prompted to
 	indicate the success of the merge.

+mergetool.meld.compat::
+	Git passes the `--output` option to `meld` by default when
+	using the `meld` merge tool.  Older versions of `meld` do not
+	understand the `--output` option.  Set `mergetool.meld.compat`
+	to `true` if your version of `meld` is older and does not
+	understand the `--output` option.  Defaults to false.
+
 mergetool.keepBackup::
 	After performing a merge, the original file with conflict markers
 	can be saved as a file with a `.orig` extension.  If this variable

diff --git a/mergetools/meld b/mergetools/meld
index cb672a5..9e2b8d2 100755
--- a/mergetools/meld
+++ b/mergetools/meld
@@ -18,12 +18,12 @@ merge_cmd () {
 	check_unchanged
 }

-# Check whether 'meld --output <file>' is supported
+# Use 'meld --output <file>' unless mergetool.meld.compat is true.
 check_meld_for_output_version () {
 	meld_path="$(git config mergetool.meld.path)"
 	meld_path="${meld_path:-meld}"

-	if "$meld_path" --help 2>&1 | grep -e --output >/dev/null
+	if test "$(git config --bool mergetool.meld.compat)" != true
 	then
 		meld_has_output_option=true
 	else
--
2.1.2.453.g1b015e3
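The heart of the change is small enough to model standalone.  In the
sketch below, read_config stands in for `git config --bool
mergetool.meld.compat` (the real helper queries git; here a plain
function is used so the logic runs anywhere):

```shell
#!/bin/sh
# Sketch of the logic the patch introduces: instead of parsing
# `meld --help`, consult the (then-proposed) mergetool.meld.compat
# setting.  Anything other than "true" keeps the default --output mode.
read_config () {
	# stand-in for: git config --bool mergetool.meld.compat
	echo "${FAKE_MELD_COMPAT:-}"
}
meld_compat=$(read_config)
if test "$meld_compat" != true
then
	meld_has_output_option=true    # default: pass --output to meld
else
	meld_has_output_option=false   # compat mode for old meld
fi
echo "meld_has_output_option=$meld_has_output_option"
```

With the variable unset or false, --output stays on; only an explicit
`true` flips into the compatibility path.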
[PATCH 4/5] t7610-mergetool: prefer test_config over git config
Signed-off-by: David Aguilar dav...@gmail.com --- t/t7610-mergetool.sh | 20 +--- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh index 1a15e06..7eeb207 100755 --- a/t/t7610-mergetool.sh +++ b/t/t7610-mergetool.sh @@ -14,7 +14,7 @@ Testing basic merge tool invocation' # running mergetool test_expect_success 'setup' ' - git config rerere.enabled true + test_config rerere.enabled true echo master file1 echo master spaced spaced name echo master file11 file11 @@ -112,7 +112,7 @@ test_expect_success 'custom mergetool' ' ' test_expect_success 'mergetool crlf' ' - git config core.autocrlf true + test_config core.autocrlf true git checkout -b test2 branch1 test_must_fail git merge master /dev/null 21 ( yes | git mergetool file1 /dev/null 21 ) @@ -129,7 +129,7 @@ test_expect_success 'mergetool crlf' ' git submodule update -N test $(cat submod/bar) = master submodule git commit -m branch1 resolved with mergetool - autocrlf - git config core.autocrlf false + test_config core.autocrlf false git reset --hard ' @@ -176,7 +176,7 @@ test_expect_success 'mergetool skips autoresolved' ' test_expect_success 'mergetool merges all from subdir' ' ( cd subdir - git config rerere.enabled false + test_config rerere.enabled false test_must_fail git merge master ( yes r | git mergetool ../submod ) ( yes d d | git mergetool --no-prompt ) @@ -190,7 +190,7 @@ test_expect_success 'mergetool merges all from subdir' ' ' test_expect_success 'mergetool skips resolved paths when rerere is active' ' - git config rerere.enabled true + test_config rerere.enabled true rm -rf .git/rr-cache git checkout -b test5 branch1 git submodule update -N @@ -204,7 +204,7 @@ test_expect_success 'mergetool skips resolved paths when rerere is active' ' ' test_expect_success 'conflicted stash sets up rerere' ' - git config rerere.enabled true + test_config rerere.enabled true git checkout stash1 echo Conflicting stash content file11 git stash @@ -232,7 +232,7 @@ 
test_expect_success 'conflicted stash sets up rerere' ' test_expect_success 'mergetool takes partial path' ' git reset --hard - git config rerere.enabled false + test_config rerere.enabled false git checkout -b test12 branch1 git submodule update -N test_must_fail git merge master @@ -505,14 +505,12 @@ test_expect_success 'file with no base' ' test_expect_success 'custom commands override built-ins' ' git checkout -b test14 branch1 - git config mergetool.defaults.cmd cat \\$REMOTE\ \\$MERGED\ - git config mergetool.defaults.trustExitCode true + test_config mergetool.defaults.cmd cat \\$REMOTE\ \\$MERGED\ + test_config mergetool.defaults.trustExitCode true test_must_fail git merge master git mergetool --no-prompt --tool defaults -- both echo master both added expected test_cmp both expected - git config --unset mergetool.defaults.cmd - git config --unset mergetool.defaults.trustExitCode git reset --hard master /dev/null 21 ' -- 2.1.2.453.g1b015e3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
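Why test_config beats a bare `git config` in tests: it registers an
automatic unset, so the setting cannot leak into later tests even when a
test fails midway.  Everything below is a simplified stand-in for the
real helpers in t/test-lib-functions.sh; no real git configuration is
touched:

```shell
#!/bin/sh
# Simplified model of test_config: set a value and schedule its removal
# for when the test finishes, mirroring what the patch relies on.
CONFIG_STORE=
git_config () { CONFIG_STORE="$1=$2"; }   # stand-in for `git config`
git_config_unset () { CONFIG_STORE=; }    # stand-in for `git config --unset`
test_when_finished_cmds=
test_when_finished () {
	test_when_finished_cmds="$test_when_finished_cmds$*;"
}
test_config () {
	test_when_finished git_config_unset "$1" &&
	git_config "$1" "$2"
}
test_config core.autocrlf true
echo "during test: $CONFIG_STORE"
eval "$test_when_finished_cmds"           # the harness runs this at test end
echo "after test: $CONFIG_STORE"
```

This is why the patch can also drop the manual `git config --unset`
lines at the end of the 'custom commands override built-ins' test.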
[PATCH 5/5] test-lib-functions: adjust style to match CodingGuidelines
Prefer test over [ ] for conditionals. Prefer $() over backticks for command substitutions. Avoid control structures on a single line with semicolons. Signed-off-by: David Aguilar dav...@gmail.com --- t/test-lib-functions.sh | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index dafd6ad..fc77cc6 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -413,7 +413,7 @@ test_external () { # test_run_, but keep its stdout on our stdout even in # non-verbose mode. $@ 24 - if [ $? = 0 ] + if test $? = 0 then if test $test_external_has_tap -eq 0; then test_ok_ $descr @@ -440,11 +440,12 @@ test_external_without_stderr () { tmp=${TMPDIR:-/tmp} stderr=$tmp/git-external-stderr.$$.tmp test_external $@ 4 $stderr - [ -f $stderr ] || error Internal error: $stderr disappeared. + test -f $stderr || error Internal error: $stderr disappeared. descr=no stderr: $1 shift say 3 # expecting no stderr from previous command - if [ ! -s $stderr ]; then + if test ! -s $stderr + then rm $stderr if test $test_external_has_tap -eq 0; then @@ -454,8 +455,9 @@ test_external_without_stderr () { test_success=$(($test_success + 1)) fi else - if [ $verbose = t ]; then - output=`echo; echo # Stderr is:; cat $stderr` + if test $verbose = t + then + output=$(echo; echo # Stderr is:; cat $stderr) else output= fi @@ -474,7 +476,7 @@ test_external_without_stderr () { # The commands test the existence or non-existence of $1. $2 can be # given to provide a more precise diagnosis. test_path_is_file () { - if ! [ -f $1 ] + if ! test -f $1 then echo File $1 doesn't exist. $* false @@ -482,7 +484,7 @@ test_path_is_file () { } test_path_is_dir () { - if ! [ -d $1 ] + if ! test -d $1 then echo Directory $1 doesn't exist. 
$* false @@ -501,11 +503,12 @@ test_dir_is_empty () { } test_path_is_missing () { - if [ -e $1 ] + if test -e $1 then echo Path exists: ls -ld $1 - if [ $# -ge 1 ]; then + if test $# -ge 1 + then echo $* fi false @@ -657,9 +660,12 @@ test_cmp_rev () { # similar to GNU seq(1), but the latter might not be available # everywhere (and does not do letters). It may be used like: # -# for i in `test_seq 100`; do -# for j in `test_seq 10 20`; do -# for k in `test_seq a z`; do +# for i in $(test_seq 100) +# do +# for j in $(test_seq 10 20) +# do +# for k in $(test_seq a z) +# do # echo $i-$j-$k # done # done -- 2.1.2.453.g1b015e3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
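The style rules this patch applies, shown side by side in their
preferred form (the discouraged forms — `[ ]`, backticks, `; then` —
behave the same but are ruled out by Documentation/CodingGuidelines):

```shell
#!/bin/sh
# Preferred shell style per CodingGuidelines:
#   test      instead of  [ ]
#   $(...)    instead of  `...`
#   then/do on their own lines, not after a semicolon
tmp_file=style-demo.tmp
: >"$tmp_file"
if test -f "$tmp_file"
then
	echo "file exists"
fi
for i in $(seq 1 3)    # $() instead of backticks
do
	echo "i=$i"
done
rm -f "$tmp_file"
```

The patch makes no behavioral change; it only rewrites existing
conditionals and substitutions into this shape.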
[PATCH 3/5] t7610-mergetool: add test cases for mergetool.writeToTemp
Add tests to ensure that filenames start with ./ when mergetool.writeToTemp is false and do not start with ./ when mergetool.writeToTemp is true. Signed-off-by: David Aguilar dav...@gmail.com --- t/t7610-mergetool.sh | 23 +++ 1 file changed, 23 insertions(+) diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh index 214edfb..1a15e06 100755 --- a/t/t7610-mergetool.sh +++ b/t/t7610-mergetool.sh @@ -516,4 +516,27 @@ test_expect_success 'custom commands override built-ins' ' git reset --hard master /dev/null 21 ' +test_expect_success 'filenames seen by tools start with ./' ' + git checkout -b test15 branch1 + test_config mergetool.writeToTemp false + test_config mergetool.myecho.cmd echo \\$LOCAL\ + test_config mergetool.myecho.trustExitCode true + test_must_fail git merge master + git mergetool --no-prompt --tool myecho -- both actual + grep ^\./both_LOCAL_ actual /dev/null + git reset --hard master /dev/null 21 +' + +test_expect_success 'temporary filenames are used with mergetool.writeToTemp' ' + git checkout -b test16 branch1 + test_config mergetool.writeToTemp true + test_config mergetool.myecho.cmd echo \\$LOCAL\ + test_config mergetool.myecho.trustExitCode true + test_must_fail git merge master + git mergetool --no-prompt --tool myecho -- both actual + test_must_fail grep ^\./both_LOCAL_ actual /dev/null + grep /both_LOCAL_ actual /dev/null + git reset --hard master /dev/null 21 +' + test_done -- 2.1.2.453.g1b015e3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
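The property the two new tests pin down, distilled: without
mergetool.writeToTemp the tool is handed worktree-relative paths with a
leading "./", while with it the conflict files live under a temporary
directory, so the path starts elsewhere.  The file names below are made
up for illustration:

```shell
#!/bin/sh
# Distilled form of what the two grep checks in the tests assert about
# the $LOCAL path mergetool hands to the configured tool.
classify_path () {
	case "$1" in
	./*) echo "worktree path" ;;
	*)   echo "temp path" ;;
	esac
}
classify_path "./both_LOCAL_1234"
classify_path "/tmp/git-mergetool-XXXX/both_LOCAL_1234"
```

The first test greps for the `^./` prefix; the second asserts its
absence and looks for the file under a temporary directory instead.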
[PATCH 1/5] t7610-mergetool: use tabs instead of a mix of tabs and spaces
Signed-off-by: David Aguilar dav...@gmail.com --- t/t7610-mergetool.sh | 918 +-- 1 file changed, 459 insertions(+), 459 deletions(-) diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh index 05d9db0..875c8af 100755 --- a/t/t7610-mergetool.sh +++ b/t/t7610-mergetool.sh @@ -14,512 +14,512 @@ Testing basic merge tool invocation' # running mergetool test_expect_success 'setup' ' -git config rerere.enabled true -echo master file1 -echo master spaced spaced name -echo master file11 file11 -echo master file12 file12 -echo master file13 file13 -echo master file14 file14 -mkdir subdir -echo master sub subdir/file3 -test_create_repo submod -( - cd submod - : foo - git add foo - git commit -m Add foo -) -git submodule add git://example.com/submod submod -git add file1 spaced name file1[1-4] subdir/file3 .gitmodules submod -git commit -m add initial versions - -git checkout -b branch1 master -git submodule update -N -echo branch1 change file1 -echo branch1 newfile file2 -echo branch1 spaced spaced name -echo branch1 both added both -echo branch1 change file11 file11 -echo branch1 change file13 file13 -echo branch1 sub subdir/file3 -( - cd submod - echo branch1 submodule bar - git add bar - git commit -m Add bar on branch1 - git checkout -b submod-branch1 -) -git add file1 spaced name file11 file13 file2 subdir/file3 submod -git add both -git rm file12 -git commit -m branch1 changes - -git checkout -b stash1 master -echo stash1 change file11 file11 -git add file11 -git commit -m stash1 changes - -git checkout -b stash2 master -echo stash2 change file11 file11 -git add file11 -git commit -m stash2 changes - -git checkout master -git submodule update -N -echo master updated file1 -echo master new file2 -echo master updated spaced spaced name -echo master both added both -echo master updated file12 file12 -echo master updated file14 file14 -echo master new sub subdir/file3 -( - cd submod - echo master submodule bar - git add bar - git commit -m Add bar on master - git 
checkout -b submod-master -) -git add file1 spaced name file12 file14 file2 subdir/file3 submod -git add both -git rm file11 -git commit -m master updates - -git config merge.tool mytool -git config mergetool.mytool.cmd cat \\$REMOTE\ \\$MERGED\ -git config mergetool.mytool.trustExitCode true -git config mergetool.mybase.cmd cat \\$BASE\ \\$MERGED\ -git config mergetool.mybase.trustExitCode true + git config rerere.enabled true + echo master file1 + echo master spaced spaced name + echo master file11 file11 + echo master file12 file12 + echo master file13 file13 + echo master file14 file14 + mkdir subdir + echo master sub subdir/file3 + test_create_repo submod + ( + cd submod + : foo + git add foo + git commit -m Add foo + ) + git submodule add git://example.com/submod submod + git add file1 spaced name file1[1-4] subdir/file3 .gitmodules submod + git commit -m add initial versions + + git checkout -b branch1 master + git submodule update -N + echo branch1 change file1 + echo branch1 newfile file2 + echo branch1 spaced spaced name + echo branch1 both added both + echo branch1 change file11 file11 + echo branch1 change file13 file13 + echo branch1 sub subdir/file3 + ( + cd submod + echo branch1 submodule bar + git add bar + git commit -m Add bar on branch1 + git checkout -b submod-branch1 + ) + git add file1 spaced name file11 file13 file2 subdir/file3 submod + git add both + git rm file12 + git commit -m branch1 changes + + git checkout -b stash1 master + echo stash1 change file11 file11 + git add file11 + git commit -m stash1 changes + + git checkout -b stash2 master + echo stash2 change file11 file11 + git add file11 + git commit -m stash2 changes + + git checkout master + git submodule update -N + echo master updated file1 + echo master new file2 + echo master updated spaced spaced name + echo master both added both + echo master updated file12 file12 + echo master updated file14 file14 + echo master new sub subdir/file3 + ( + cd submod + echo master submodule 
bar + git add bar + git commit -m Add bar on
[PATCH 2/5] t7610-mergetool: add missing && and remove commented-out code
Signed-off-by: David Aguilar dav...@gmail.com --- t/t7610-mergetool.sh | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh index 875c8af..214edfb 100755 --- a/t/t7610-mergetool.sh +++ b/t/t7610-mergetool.sh @@ -192,7 +192,7 @@ test_expect_success 'mergetool merges all from subdir' ' test_expect_success 'mergetool skips resolved paths when rerere is active' ' git config rerere.enabled true rm -rf .git/rr-cache - git checkout -b test5 branch1 + git checkout -b test5 branch1 git submodule update -N test_must_fail git merge master /dev/null 21 ( yes l | git mergetool --no-prompt submod /dev/null 21 ) @@ -231,18 +231,12 @@ test_expect_success 'conflicted stash sets up rerere' ' ' test_expect_success 'mergetool takes partial path' ' - git reset --hard + git reset --hard git config rerere.enabled false git checkout -b test12 branch1 git submodule update -N test_must_fail git merge master - #should not need these lines - #( yes d | git mergetool file11 /dev/null 21 ) - #( yes d | git mergetool file12 /dev/null 21 ) - #( yes l | git mergetool submod /dev/null 21 ) - #( yes | git mergetool file1 file2 /dev/null 21 ) - ( yes | git mergetool subdir ) test $(cat subdir/file3) = master new sub -- 2.1.2.453.g1b015e3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] Allow building Git with Asciidoctor
On 14.10.2014 at 11:51, Jeff King wrote:
> What's the status on AsciiDoc versus AsciiDoctor?  The latter seems
> more actively developed these days, but perhaps that is just my
> perception.  The incompatibilities seem fairly minimal (if those first
> two patches are the extent of it, I have no problem at all trying to
> remain compatible with both).
>
> Would it ever make sense to switch to AsciiDoctor as our official
> command-line build program?  I know it is supposed to be much faster
> (though a lot of the slowness in our build chain is due to docbook,
> not asciidoc itself).

Just recently we added the AsciiDoc toolchain to our git-for-windows/sdk
(formerly known as msysgit), so I'm not really fond of switching now to
something different again.  Remaining compatible with both would
therefore be my choice.

Thomas
[PATCH] git-svn: merge: fix rooturl/branchurl match check
When populating svn:mergeinfo, git-svn merge checks if the merge parent
of the merged branch is under the same root as the git-svn repository.
This was implemented by comparing $gs->repos_root with the return value
of cmt_metadata for the merge parent.  However, the first may contain a
username, whereas the second does not.  In this case the comparison
fails.

Remove the username from $gs->repos_root before performing the
comparison.
---
 git-svn.perl | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index b6e2186..0a5a5ff 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -707,7 +707,8 @@ sub populate_merge_info {
 	my $all_parents_ok = 1;
 	my $aggregate_mergeinfo = '';
 	my $rooturl = $gs->repos_root;
-	my ($target_branch) = $gs->full_pushurl =~ /^\Q$rooturl\E(.*)/;
+	Git::SVN::remove_username($rooturl);
+	my $target_branch = $gs->path;

 	if (defined($rewritten_parent)) {
 		# Replace first parent with newly-rewritten version
@@ -729,7 +730,7 @@
 			}

 			my $branchpath = $1;
-			my $ra = Git::SVN::Ra->new($branchurl);
+			my $ra = Git::SVN::Ra->new(add_path_to_url($gs->repos_root, $branchpath));
 			my (undef, undef, $props) = $ra->get_dir(canonicalize_path("."), $svnrev);
 			my $par_mergeinfo = $props->{'svn:mergeinfo'};
@@ -921,6 +922,7 @@ sub cmd_dcommit {
 		# information from different SVN repos, and paths
 		# which are not underneath this repository root.
 		my $rooturl = $gs->repos_root;
+		Git::SVN::remove_username($rooturl);
 		foreach my $d (@$linear_refs) {
 			my %parentshash;
 			read_commit_parents(\%parentshash, $d);
--
2.1.2.443.g670a3c1.dirty
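Git::SVN::remove_username is Perl; the sed one-liner below mimics the
transformation the fix relies on, purely for illustration: strip a
`user@` (or `user:pass@`) part from the URL so both sides of the
comparison are username-free.

```shell
#!/bin/sh
# Illustrative shell equivalent of stripping the username from an SVN
# repository root URL before comparing it against cmt_metadata output.
remove_username () {
	# svn+ssh://user@host/repo -> svn+ssh://host/repo
	printf '%s\n' "$1" | sed 's,^\([a-z+]*://\)[^/@]*@,\1,'
}
remove_username "svn+ssh://alice@svn.example.com/repo"
remove_username "http://svn.example.com/repo"    # no username: unchanged
```

A URL without an embedded username passes through untouched, so the
comparison still works for the common case.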
Custom hunk-header with ignore case setting
Hi,

I'm working with a proprietary programming language which ignores case.
I now started to write a custom version of diff.*.xfuncname and it is
kind of ugly to always spell out all cases like
[Ff][Uu][Nn][cC][Tt][Ii][oO][Nn].

I've seen that the builtin diff patterns in userdiff.c can be specified
ignoring case using the IPATTERN macro.  One of the possible solutions
would be to patch userdiff.c (patch courtesy of Johannes Schindelin):

-- snip --
diff --git a/userdiff.c b/userdiff.c
index fad52d6..f089e50 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -228,6 +228,9 @@ int userdiff_config(const char *k, const char *v)
 		return parse_funcname(&drv->funcname, k, v, 0);
 	if (!strcmp(type, "xfuncname"))
 		return parse_funcname(&drv->funcname, k, v, REG_EXTENDED);
+	if (!strcmp(type, "ixfuncname"))
+		return parse_funcname(&drv->funcname, k, v,
+				      REG_EXTENDED | REG_ICASE);
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
 	if (!strcmp(type, "command"))
-- snap --

With a patch like that I would, of course, supply documentation and
tests.  Is that something worth a try from my side?

An alternative solution would be to add something like "pxfuncname"
which understands PCREs, which we already support in git grep.

And yet another alternative would be that I send a patch enhancing
userdiff.c with a reasonable pattern for the programming language.  But
its community, and especially its intersection with git users, is not
that large.

Thanks,
Thomas
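The pain point is easy to demonstrate with grep, which accepts the same
POSIX regex flavors userdiff.c compiles: without case-insensitive
matching, the pattern must spell out every letter's case by hand.  The
proposed ixfuncname does not exist yet; the second grep below just shows
what REG_ICASE would buy:

```shell
#!/bin/sh
# Both greps match the same two lines; only the second pattern is
# readable.  A (proposed) ixfuncname would let the readable form be
# used in .gitconfig directly.
input () {
	printf '%s\n' "  Function Foo" "  FUNCTION Bar" "  unrelated line"
}
manual='^ *[Ff][Uu][Nn][Cc][Tt][Ii][Oo][Nn] '
readable='^ *function '
input | grep -cE  "$manual"     # hand-rolled case folding
input | grep -icE "$readable"   # what REG_ICASE gives for free
```

Both invocations count the same two matching lines.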
Re: [PATCH 0/4] Multiple worktrees vs. submodules fixes
On Wed, Oct 15, 2014 at 3:31 AM, Max Kirillov <m...@max630.net> wrote:
> On Tue, Oct 14, 2014 at 07:09:45PM +0200, Jens Lehmann wrote:
> > Until that problem is solved it looks wrong to pass GIT_COMMON_DIR
> > into submodule recursion.  I believe GIT_COMMON_DIR should be added
> > to the local_repo_env array (and even if it is passed on later, we
> > might have to append /modules/<submodule name> to make it point to
> > the correct location).
>
> Actually, why should there be an _environment_ variable GIT_COMMON_DIR
> at all?  I mean, gitdir is resolved to some directory (through link or
> environment), and it contains the shared data directly or refers to it
> with the commondir link.  In which case would anyone want to override
> that location?

It's how the implementation was built up.  First I split the repo in
two and I need something to point to the new/shared part.
$GIT_DIR/worktrees comes later to glue them up.  Keeping it an
environment variable gives us a possibility to glue things up in a
different way than using $GIT_DIR/worktrees.  Of course the odds of
anybody doing that are probably small or even near zero.

Still consuming the rest of this thread.  Thanks all for working on the
submodule support for this.
--
Duy
Re: [PATCH 0/4] Multiple worktrees vs. submodules fixes
On Wed, Oct 15, 2014 at 5:15 AM, Max Kirillov <m...@max630.net> wrote:
> > Hmm, so I tend towards adding GIT_COMMON_DIR to local_repo_env until
> > we figure out how to handle this.  Without that I fear bad things
> > will happen, at least for a superproject with multiple
> > "checkout --to" work trees where the same submodule is initialized
> > more than once ...
>
> I learned about local_repo_env and agree it should include
> GIT_COMMON_DIR.  Unless it is removed at all...

It's in the same class as GIT_DIR and GIT_WORK_TREE, so yeah, it should
be in local_repo_env.  Removing it would break t0060-path-utils.sh at
least.  Unless we have a very good reason to remove it, I think we
should keep it as is.
--
Duy
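What "adding GIT_COMMON_DIR to local_repo_env" means in practice: the
variables in that array are cleared before git runs commands in another
repository (a submodule, say), so repo-local state cannot leak across.
The list below is an illustrative subset of the real array, modeled in
shell:

```shell
#!/bin/sh
# Model of clearing local_repo_env before recursing into a submodule:
# any repo-local variables set for the superproject are unset so the
# child process discovers its own repository from scratch.
clear_local_repo_env () {
	for v in GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE GIT_COMMON_DIR
	do
		unset "$v"
	done
}
GIT_DIR=/tmp/super/.git
GIT_COMMON_DIR=/tmp/super/.git
clear_local_repo_env
echo "GIT_DIR=${GIT_DIR:-unset}"
echo "GIT_COMMON_DIR=${GIT_COMMON_DIR:-unset}"
```

Without GIT_COMMON_DIR in the list, the superproject's common directory
would leak into the submodule process — the bad interaction this thread
is about.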
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
On 14-10-14 03:57 PM, Junio C Hamano wrote:
> While use of the --reference option to borrow objects from an existing
> local repository of the same project is an effective way to reduce
> traffic when cloning a project over the network, it makes the
> resulting borrowing repository dependent on the borrowed repository.
> After running
>
>     git clone --reference=P $URL Q
>
> the resulting repository Q will be broken if the borrowed repository P
> disappears.
>
> The way to allow the borrowed repository to be removed is to repack
> the borrowing repository (i.e. run "git repack -a -d" in Q); while
> power users may know it very well, it is not easily discoverable.
>
> Teach a new --dissociate option to "git clone" to run this repacking
> for the user.

After reading the above I thought the option would be better named
--dereference.  It seemed to me to be something one would like to run
after the first --reference clone.

Looking more closely I see that's not the case.  In fact, the option
only makes sense if --reference is also used.  I think things would be
more understandable if the option was "--dissociate <repository>" and
was an explicit alternative to --reference:

    [[--reference | --dissociate] <repository>]

I'm still not liking the name --dissociate though.  The original
suggestion of --borrow is better.  Perhaps --library or --local-cache?
I dunno...

So now I'm wondering if the implementation would be more efficient as
an extension of the --local operation.  That is, instead of a
post-clone repack, do a --local clone first followed by a simple "git
fetch" from the source repo.

	M.
Signed-off-by: Junio C Hamano gits...@pobox.com --- * This comes from http://thread.gmane.org/gmane.comp.version-control.git/243918/focus=245397 which is one of the low-hanging entries in the leftover-bits list http://git-blame.blogspot.com/p/leftover-bits.html Yes, I must have been really bored to do this ;-) Documentation/git-clone.txt | 11 +-- builtin/clone.c | 25 + t/t5700-clone-reference.sh | 17 + 3 files changed, 51 insertions(+), 2 deletions(-) diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index 0363d00..f1f2a3f 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -12,7 +12,7 @@ SYNOPSIS 'git clone' [--template=template_directory] [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare] [--mirror] [-o name] [-b name] [-u upload-pack] [--reference repository] - [--separate-git-dir git dir] + [--dissociate] [--separate-git-dir git dir] [--depth depth] [--[no-]single-branch] [--recursive | --recurse-submodules] [--] repository [directory] @@ -98,7 +98,14 @@ objects from the source repository into a pack in the cloned repository. require fewer objects to be copied from the repository being cloned, reducing network and local storage costs. + -*NOTE*: see the NOTE for the `--shared` option. +*NOTE*: see the NOTE for the `--shared` option, and also the +`--dissociate` option. + +--dissociate:: + Borrow the objects from reference repositories specified + with the `--reference` options only to reduce network + transfer and stop borrowing from them after a clone is made + by making necessary local copies of borrowed objects. 
--quiet:: -q:: diff --git a/builtin/clone.c b/builtin/clone.c index bbd169c..780fbd5 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -48,6 +48,7 @@ static int option_verbosity; static int option_progress = -1; static struct string_list option_config; static struct string_list option_reference; +static int option_dissociate; static int opt_parse_reference(const struct option *opt, const char *arg, int unset) { @@ -93,6 +94,8 @@ static struct option builtin_clone_options[] = { N_(create a shallow clone of that depth)), OPT_BOOL(0, single-branch, option_single_branch, N_(clone only one branch, HEAD or --branch)), + OPT_BOOL(0, dissociate, option_dissociate, + N_(use --reference only while cloning)), OPT_STRING(0, separate-git-dir, real_git_dir, N_(gitdir), N_(separate git dir from working tree)), OPT_STRING_LIST('c', config, option_config, N_(key=value), @@ -736,6 +739,21 @@ static void write_refspec_config(const char* src_ref_prefix, strbuf_release(value); } +static void dissociate_from_references(void) +{ + struct child_process cmd; + + memset(cmd, 0, sizeof(cmd)); + argv_array_pushl(cmd.args, repack, -a, -d, NULL); + cmd.git_cmd = 1; + cmd.out = -1; + cmd.no_stdin = 1; + if (run_command(cmd)) + die(_(cannot repack to clean up)); + if (unlink(git_path(objects/info/alternates)) errno != ENOENT) + die_errno(_(cannot unlink temporary alternates
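For the record, the manual steps the option automates can be exercised
against a throwaway local repository.  The repack plus
alternates-removal sequence is what the commit message describes; the
rest of this script is scaffolding, and it degrades to a no-op where git
is unavailable:

```shell
#!/bin/sh
# Manual "dissociate" workflow: clone with --reference, copy the
# borrowed objects into the clone, then stop borrowing so the
# reference repository can be deleted safely.
set -e
command -v git >/dev/null 2>&1 || { echo dissociated; exit 0; }
tmp=$(mktemp -d)
cd "$tmp"
git init -q P
git -C P -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
# Borrowing clone: Q's objects/info/alternates points into P.
git clone -q --no-hardlinks --reference "$tmp/P" P Q
# Manual dissociate: copy borrowed objects, then stop borrowing.
git -C Q repack -a -d -q
rm -f Q/.git/objects/info/alternates
rm -rf P                      # safe now: Q no longer depends on P
git -C Q rev-parse --verify HEAD >/dev/null
echo dissociated
```

With the patch, the whole dance collapses to a single
`git clone --reference "$tmp/P" --dissociate P Q`.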
[PATCH v2 05/11] refs.c: refactor resolve_ref_unsafe() to use strbuf internally
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com In the beginning, we had resolve_ref() that returns a buffer owned by this function. Then we started to move away from that direction because the buffer could be overwritten by the next resolve_ref() call and introduced two new functions: resolve_ref_unsafe() and resolve_refdup(). The static buffer is still kept internally. This patch makes the core of resolve_ref use a strbuf instead of static buffer. Which makes resolve_refdup() more efficient (no need to copy from the static buffer to a new buffer). It also removes the (random?) 256 char limit. In future, resolve_ref() could be used directly without going through resolve_refdup() wrapper. A minor bonus. resolve_ref(dup) are now more thread-friendly (although I'm not 100% sure if they are thread-safe yet). Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- cache.h | 1 + refs.c | 115 +--- 2 files changed, 67 insertions(+), 49 deletions(-) diff --git a/cache.h b/cache.h index 3e6a914..e36084d 100644 --- a/cache.h +++ b/cache.h @@ -1004,6 +1004,7 @@ extern int read_ref(const char *refname, unsigned char *sha1); */ extern const char *resolve_ref_unsafe(const char *ref, unsigned char *sha1, int reading, int *flag); extern char *resolve_refdup(const char *ref, unsigned char *sha1, int reading, int *flag); +extern int resolve_ref(const char *refname, struct strbuf *result, unsigned char *sha1, int reading, int *flag); extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref); extern int dwim_log(const char *str, int len, unsigned char *sha1, char **ref); diff --git a/refs.c b/refs.c index 020ee3f..29ea7e0 100644 --- a/refs.c +++ b/refs.c @@ -1397,34 +1397,41 @@ static int handle_missing_loose_ref(const char *refname, } } -/* This function needs to return a meaningful errno on failure */ -const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int reading, int *flag) +/* + * 
'result' content will be destroyed. Its value may be undefined if + * resolve_ref returns -1. + * + * This function needs to return a meaningful errno on failure + */ +int resolve_ref(const char *refname, struct strbuf *result, + unsigned char *sha1, int reading, int *flag) { + struct strbuf buffer = STRBUF_INIT; int depth = MAXDEPTH; - ssize_t len; - char buffer[256]; - static char refname_buffer[256]; + int ret = -1; if (flag) *flag = 0; if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) { errno = EINVAL; - return NULL; + return -1; } + strbuf_reset(result); + strbuf_addstr(result, refname); + for (;;) { char path[PATH_MAX]; struct stat st; const char *buf; - int fd; if (--depth 0) { errno = ELOOP; - return NULL; + break; } - git_snpath(path, sizeof(path), %s, refname); + git_snpath(path, sizeof(path), %s, result-buf); /* * We might have to loop back here to avoid a race @@ -1437,30 +1444,26 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea */ stat_ref: if (lstat(path, st) 0) { - if (errno == ENOENT) { - if (handle_missing_loose_ref(refname, sha1, -reading, flag)) - return NULL; - return refname; - } else - return NULL; + if (errno == ENOENT) + ret = handle_missing_loose_ref(result-buf, sha1, + reading, flag); + break; } /* Follow normalized - ie refs/.. symlinks by hand */ if (S_ISLNK(st.st_mode)) { - len = readlink(path, buffer, sizeof(buffer)-1); - if (len 0) { + /* no need to reset buffer, strbuf_readlink does that */ + if (strbuf_readlink(buffer, path, 256) 0) { if (errno == ENOENT || errno == EINVAL) /* inconsistent with lstat; retry */ goto stat_ref; else - return NULL; + break; } - buffer[len] = 0; -
[PATCH v2 08/11] resolve_gitlink_ref_recursive(): drop arbitrary refname length limit
This limit was added in 0ebde32 (Add 'resolve_gitlink_ref()' helper function - 2007-04-09) Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- Theoretically, somebody else might be relying on resolve_gitlink_ref_recursive() to fail for too-long reference names to prevent path.c's pitiful error handling from returning /bad-path/ and causing a nonsensical file lookup. I doubt it, but if somebody is worried about it we could leave out this patch and instead build the MAXREFLEN check into parse_ref(). Long-term, I think we should fix up path.c to remove its PATH_MAX limits. I've started working on that. refs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 9f2a0f8..e1aa6a4 100644 --- a/refs.c +++ b/refs.c @@ -1279,7 +1279,6 @@ static struct ref_dir *get_loose_refs(struct ref_cache *refs) /* We allow recursive symbolic refs. Only within reason, though */ #define MAXDEPTH 5 -#define MAXREFLEN (1024) /* * Called by resolve_gitlink_ref_recursive() after it failed to read @@ -1308,7 +1307,7 @@ static int resolve_gitlink_ref_recursive(struct ref_cache *refs, char buffer[128], *p; char *path; - if (recursion MAXDEPTH || strlen(refname) MAXREFLEN) + if (recursion MAXDEPTH) return -1; path = *refs-name ? git_path_submodule(refs-name, %s, refname) -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 09/11] refs.c: rewrite resolve_gitlink_ref() to use parse_ref()
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com resolve_gitlink_ref_recursive() did about the same thing as parse_ref(), but didn't know as many tricks. It also had another random limit of 128 bytes for symrefs. So base resolve_gitlink_ref() on parse_ref() instead. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 69 ++ 1 file changed, 27 insertions(+), 42 deletions(-) diff --git a/refs.c b/refs.c index e1aa6a4..a7c8abd 100644 --- a/refs.c +++ b/refs.c @@ -1299,48 +1299,11 @@ static int resolve_gitlink_packed_ref(struct ref_cache *refs, return 0; } -static int resolve_gitlink_ref_recursive(struct ref_cache *refs, -const char *refname, unsigned char *sha1, -int recursion) -{ - int fd, len; - char buffer[128], *p; - char *path; - - if (recursion MAXDEPTH) - return -1; - path = *refs-name - ? git_path_submodule(refs-name, %s, refname) - : git_path(%s, refname); - fd = open(path, O_RDONLY); - if (fd 0) - return resolve_gitlink_packed_ref(refs, refname, sha1); - - len = read(fd, buffer, sizeof(buffer)-1); - close(fd); - if (len 0) - return -1; - while (len isspace(buffer[len-1])) - len--; - buffer[len] = 0; - - /* Was it a detached head or an old-fashioned symlink? */ - if (!get_sha1_hex(buffer, sha1)) - return 0; - - /* Symref? 
*/ - if (strncmp(buffer, ref:, 4)) - return -1; - p = buffer + 4; - while (isspace(*p)) - p++; - - return resolve_gitlink_ref_recursive(refs, p, sha1, recursion+1); -} - int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sha1) { - int len = strlen(path), retval; + struct strbuf result = STRBUF_INIT; + int len = strlen(path), parseval, ret; + int depth = MAXDEPTH; char *submodule; struct ref_cache *refs; @@ -1352,8 +1315,30 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sh refs = get_ref_cache(submodule); free(submodule); - retval = resolve_gitlink_ref_recursive(refs, refname, sha1, 0); - return retval; + strbuf_addstr(result, refname); + do { + if (--depth 0) { + errno = ELOOP; + ret = -1; + goto out; + } + path = *refs-name + ? git_path_submodule(refs-name, %s, result.buf) + : git_path(%s, result.buf); + parseval = parse_ref(path, result, sha1, NULL); + } while (!parseval); + + if (parseval == 1) { + ret = 0; + } else if (parseval == -2) { + ret = resolve_gitlink_packed_ref(refs, result.buf, sha1) ? -1 : 0; + } else { + ret = -1; + } + +out: + strbuf_release(result); + return ret; } /* -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 01/11] strbuf_read_file(): preserve errno on failure
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com This will allow the function to be used in a later patch. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- strbuf.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/strbuf.c b/strbuf.c index 0346e74..f1fec58 100644 --- a/strbuf.c +++ b/strbuf.c @@ -482,15 +482,18 @@ int strbuf_getwholeline_fd(struct strbuf *sb, int fd, int term) int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint) { - int fd, len; + int fd, len, saved_errno; fd = open(path, O_RDONLY); if (fd 0) return -1; len = strbuf_read(sb, fd, hint); + saved_errno = errno; close(fd); - if (len 0) + if (len 0) { + errno = saved_errno; return -1; + } return len; } -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
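The saved_errno pattern this patch introduces is worth seeing in isolation: close() is allowed to overwrite errno, so the error from open()/read() has to be carried across it by hand. A minimal standalone sketch (not git code; the function name is invented for illustration):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read up to bufsize - 1 bytes of a file into buf, NUL-terminating it.
 * Returns the byte count on success.  On failure it returns -1 with
 * errno describing the open()/read() error -- saved across close(),
 * which may otherwise clobber it.
 */
ssize_t read_file_preserving_errno(const char *path, char *buf, size_t bufsize)
{
	int fd, saved_errno;
	ssize_t len;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	len = read(fd, buf, bufsize - 1);
	saved_errno = errno;	/* remember why read() failed, if it did */
	close(fd);		/* close() is allowed to change errno ... */
	if (len < 0) {
		errno = saved_errno;	/* ... so restore it for the caller */
		return -1;
	}
	buf[len] = '\0';
	return len;
}
```

This is exactly what lets the later parse_ref() patch trust errno after a failed strbuf_read_file().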
[PATCH v2 03/11] resolve_ref_unsafe(): reverse the logic of the symref conditional
It is clearer if the check whether a loose reference file is a symref is followed immediately by the code to handle the symref, rather than the current code, which has the if statement the other way around. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/refs.c b/refs.c index ceba23c..771941b 100644 --- a/refs.c +++ b/refs.c @@ -1497,35 +1497,35 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea len--; buffer[len] = '\0'; - /* -* Is it a symbolic ref? -*/ - if (!starts_with(buffer, ref:)) { - /* -* Please note that FETCH_HEAD has a second -* line containing other data. -*/ - if (get_sha1_hex(buffer, sha1) || - (buffer[40] != '\0' !isspace(buffer[40]))) { + if (starts_with(buffer, ref:)) { + /* It is a symbolic ref */ + if (flag) + *flag |= REF_ISSYMREF; + buf = buffer + 4; + while (isspace(*buf)) + buf++; + if (check_refname_format(buf, REFNAME_ALLOW_ONELEVEL)) { if (flag) *flag |= REF_ISBROKEN; errno = EINVAL; return NULL; } - return refname; + refname = strcpy(refname_buffer, buf); + continue; } - if (flag) - *flag |= REF_ISSYMREF; - buf = buffer + 4; - while (isspace(*buf)) - buf++; - if (check_refname_format(buf, REFNAME_ALLOW_ONELEVEL)) { + + /* +* It must be a normal ref. Please note that +* FETCH_HEAD has a second line containing other data. +*/ + if (get_sha1_hex(buffer, sha1) || + (buffer[40] != '\0' !isspace(buffer[40]))) { if (flag) *flag |= REF_ISBROKEN; errno = EINVAL; return NULL; } - refname = strcpy(refname_buffer, buf); + return refname; } } -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 02/11] handle_missing_loose_ref(): return an int
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com The return value of handle_missing_loose_ref() was either NULL to signify an error or the input parameter refname on success. So instead of returning a string, just return a 0 on success or -1 on error, so the reader doesn't have to wonder what string the return value points at. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 24 +--- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/refs.c b/refs.c index ffd45e9..ceba23c 100644 --- a/refs.c +++ b/refs.c @@ -1370,10 +1370,10 @@ static struct ref_entry *get_packed_ref(const char *refname) * A loose ref file doesn't exist; check for a packed ref. The * options are forwarded from resolve_safe_unsafe(). */ -static const char *handle_missing_loose_ref(const char *refname, - unsigned char *sha1, - int reading, - int *flag) +static int handle_missing_loose_ref(const char *refname, + unsigned char *sha1, + int reading, + int *flag) { struct ref_entry *entry; @@ -1386,14 +1386,14 @@ static const char *handle_missing_loose_ref(const char *refname, hashcpy(sha1, entry-u.value.sha1); if (flag) *flag |= REF_ISPACKED; - return refname; + return 0; } /* The reference is not a packed reference, either. */ if (reading) { - return NULL; + return -1; } else { hashclr(sha1); - return refname; + return 0; } } @@ -1437,10 +1437,12 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea */ stat_ref: if (lstat(path, st) 0) { - if (errno == ENOENT) - return handle_missing_loose_ref(refname, sha1, - reading, flag); - else + if (errno == ENOENT) { + if (handle_missing_loose_ref(refname, sha1, +reading, flag)) + return NULL; + return refname; + } else return NULL; } -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
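The new int-returning convention makes the fallback logic easy to state on its own: loose file missing, consult the packed store; not there either, then it is an error only when reading. A toy model of that contract (the function name and single hard-coded packed entry are invented for illustration):

```c
#include <string.h>

#define SHA_LEN 20

/* Toy packed-ref store holding a single entry. */
static const char *packed_name = "refs/heads/master";
static const unsigned char packed_sha1[SHA_LEN] = { 0xab };

/*
 * Same contract as the reworked handle_missing_loose_ref(): the loose
 * file was missing, so consult the packed store.  Returns 0 with sha1
 * filled in on success (zeroed when writing a ref that exists nowhere),
 * -1 when reading a ref that exists nowhere.
 */
int missing_loose_fallback(const char *refname, unsigned char *sha1, int reading)
{
	if (!strcmp(refname, packed_name)) {
		memcpy(sha1, packed_sha1, SHA_LEN);	/* REF_ISPACKED case */
		return 0;
	}
	if (reading)
		return -1;	/* nothing to read anywhere */
	memset(sha1, 0, SHA_LEN);	/* hashclr(): about to write it, so fine */
	return 0;
}
```

With a 0/-1 status plus out-parameters there is no pointer identity for the caller to puzzle over, which is the point of the patch.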
[PATCH v2 11/11] resolve_gitlink_ref(): remove redundant test
At this point we know that refs-name is a non-empty string, so we know we don't have to look up the reference in the main repository. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/refs.c b/refs.c index 8edcc3b..3f4b95a 100644 --- a/refs.c +++ b/refs.c @@ -1303,9 +1303,7 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sh ret = -1; goto out; } - path = *refs-name - ? git_path_submodule(refs-name, %s, result.buf) - : git_path(%s, result.buf); + path = git_path_submodule(refs-name, %s, result.buf); parseval = parse_ref(path, result, sha1, NULL); } while (!parseval); -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 07/11] handle_missing_loose_ref(): inline function
It only had one remaining caller. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 70 -- 1 file changed, 30 insertions(+), 40 deletions(-) diff --git a/refs.c b/refs.c index 3acd83e..9f2a0f8 100644 --- a/refs.c +++ b/refs.c @@ -1366,37 +1366,6 @@ static struct ref_entry *get_packed_ref(const char *refname) return find_ref(get_packed_refs(ref_cache), refname); } -/* - * A loose ref file doesn't exist; check for a packed ref. The - * options are forwarded from resolve_safe_unsafe(). - */ -static int handle_missing_loose_ref(const char *refname, - unsigned char *sha1, - int reading, - int *flag) -{ - struct ref_entry *entry; - - /* -* The loose reference file does not exist; check for a packed -* reference. -*/ - entry = get_packed_ref(refname); - if (entry) { - hashcpy(sha1, entry-u.value.sha1); - if (flag) - *flag |= REF_ISPACKED; - return 0; - } - /* The reference is not a packed reference, either. */ - if (reading) { - return -1; - } else { - hashclr(sha1); - return 0; - } -} - int parse_ref(const char *path, struct strbuf *refname, unsigned char *sha1, int *flag) { @@ -1505,7 +1474,7 @@ int resolve_ref(const char *refname, struct strbuf *result, unsigned char *sha1, int reading, int *flag) { int depth = MAXDEPTH; - int ret = 0; + int ret; if (flag) *flag = 0; @@ -1518,24 +1487,45 @@ int resolve_ref(const char *refname, struct strbuf *result, strbuf_reset(result); strbuf_addstr(result, refname); - while (!ret) { + do { char path[PATH_MAX]; if (--depth 0) { errno = ELOOP; - ret = -1; - break; + return -1; } git_snpath(path, sizeof(path), %s, result-buf); ret = parse_ref(path, result, sha1, flag); - if (ret == -2) { - ret = handle_missing_loose_ref(result-buf, sha1, - reading, flag); - ret = ret ? -1 : 1; + } while (!ret); + + if (ret == 1) { + return 0; + } else if (ret == -2) { + /* +* The loose reference file does not exist; check for a packed +* reference. 
+*/ + struct ref_entry *entry; + + entry = get_packed_ref(result-buf); + if (entry) { + hashcpy(sha1, entry-u.value.sha1); + if (flag) + *flag |= REF_ISPACKED; + return 0; + } + + /* The reference is not a packed reference, either. */ + if (reading) { + return -1; + } else { + hashclr(sha1); + return 0; } + } else { + return -1; } - return ret 0 ? 0 : -1; } const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int reading, int *flag) -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
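The do..while form this patch settles on is the classic bounded symref chase: keep re-resolving until the name stops being a symref, and bail out with ELOOP after MAXDEPTH hops. A self-contained toy version against an in-memory table (toy_resolve() and the table are invented for illustration; the real code reads loose files instead):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAXDEPTH 5

/* Toy symref table: name -> target; NULL target means "points at an object". */
struct toy_ref { const char *name; const char *target; };

static const struct toy_ref table[] = {
	{ "HEAD", "refs/heads/master" },
	{ "refs/heads/master", NULL },
	{ "LOOP_A", "LOOP_B" },
	{ "LOOP_B", "LOOP_A" },
};

/*
 * Chase symrefs the way resolve_ref() does: loop until the name stops
 * being a symref, giving up with ELOOP after MAXDEPTH hops.  Writes the
 * final name into out and returns 0 on success, -1 on failure.
 */
int toy_resolve(const char *refname, char *out, size_t outlen)
{
	int depth = MAXDEPTH;

	snprintf(out, outlen, "%s", refname);
	for (;;) {
		const struct toy_ref *r = NULL;
		size_t i;

		if (--depth < 0) {
			errno = ELOOP;
			return -1;
		}
		for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
			if (!strcmp(table[i].name, out))
				r = &table[i];
		if (!r) {
			errno = ENOENT;	/* name exists nowhere */
			return -1;
		}
		if (!r->target)
			return 0;	/* reached a non-symref: done */
		snprintf(out, outlen, "%s", r->target);
	}
}
```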
[PATCH v2 06/11] refs.c: move ref parsing code out of resolve_ref()
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com This function will soon have a second caller. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- I leave parse_ref() a public function for the reason explained by Duy [1]. [1] http://article.gmane.org/gmane.comp.version-control.git/254274 cache.h | 11 refs.c | 200 ++-- 2 files changed, 118 insertions(+), 93 deletions(-) diff --git a/cache.h b/cache.h index e36084d..d2b8399 100644 --- a/cache.h +++ b/cache.h @@ -1005,6 +1005,17 @@ extern int read_ref(const char *refname, unsigned char *sha1); extern const char *resolve_ref_unsafe(const char *ref, unsigned char *sha1, int reading, int *flag); extern char *resolve_refdup(const char *ref, unsigned char *sha1, int reading, int *flag); extern int resolve_ref(const char *refname, struct strbuf *result, unsigned char *sha1, int reading, int *flag); +/* + * Given a ref in ref and its path, returns + * + * -2 failed to open with ENOENT, the caller is responsible for + * checking missing loose ref (see resolve_ref for example) + * -1 if there's an error, refname can no longer be trusted, flag may + * be set. errno is valid. 
+ * 0 this is a symref, refname now contains the linked ref + * +1 normal ref, sha1 is valid + */ +extern int parse_ref(const char *path, struct strbuf *refname, unsigned char *sha1, int *flag); extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref); extern int dwim_log(const char *str, int len, unsigned char *sha1, char **ref); diff --git a/refs.c b/refs.c index 29ea7e0..3acd83e 100644 --- a/refs.c +++ b/refs.c @@ -1397,6 +1397,104 @@ static int handle_missing_loose_ref(const char *refname, } } +int parse_ref(const char *path, struct strbuf *refname, + unsigned char *sha1, int *flag) +{ + struct strbuf buffer = STRBUF_INIT; + struct stat st; + const char *buf; + int ret; + + /* +* We might have to loop back here to avoid a race condition: +* first we lstat() the file, then we try to read it as a link +* or as a file. But if somebody changes the type of the file +* (file - directory - symlink) between the lstat() and +* reading, then we don't want to report that as an error but +* rather try again starting with the lstat(). +*/ +stat_ref: + if (lstat(path, st) 0) + return errno == ENOENT ? -2 : -1; + + /* Follow normalized - ie refs/.. symlinks by hand */ + if (S_ISLNK(st.st_mode)) { + struct strbuf new_path = STRBUF_INIT; + if (strbuf_readlink(new_path, path, 256) 0) { + strbuf_release(new_path); + if (errno == ENOENT || errno == EINVAL) + /* inconsistent with lstat; retry */ + goto stat_ref; + else + return -1; + } + if (starts_with(new_path.buf, refs/) + !check_refname_format(new_path.buf, 0)) { + strbuf_reset(refname); + strbuf_addbuf(refname, new_path); + if (flag) + *flag |= REF_ISSYMREF; + strbuf_release(new_path); + return 0; + } + strbuf_release(new_path); + } + + /* Is it a directory? 
*/ + if (S_ISDIR(st.st_mode)) { + errno = EISDIR; + return -1; + } + + /* +* Anything else, just open it and try to use it as +* a ref +*/ + if (strbuf_read_file(buffer, path, 256) 0) { + strbuf_release(buffer); + if (errno == ENOENT) + /* inconsistent with lstat; retry */ + goto stat_ref; + else + return -1; + } + strbuf_rtrim(buffer); + + if (skip_prefix(buffer.buf, ref:, buf)) { + /* It is a symbolic ref */ + if (flag) + *flag |= REF_ISSYMREF; + while (isspace(*buf)) + buf++; + if (check_refname_format(buf, REFNAME_ALLOW_ONELEVEL)) { + if (flag) + *flag |= REF_ISBROKEN; + strbuf_release(buffer); + errno = EINVAL; + return -1; + } + strbuf_reset(refname); + strbuf_add(refname, buf, buffer.buf + buffer.len - buf); + strbuf_release(buffer); + return 0; + } + + /* +* Please note that FETCH_HEAD has a second line +* containing other data. +*/ + if
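The four-way contract documented above (-2 loose file missing, -1 error, 0 symref followed, +1 done) shapes every caller the same way. A standalone sketch of that caller loop, with a counting stub in place of parse_ref() since the real one needs a repository (both function names here are invented):

```c
#define MAXDEPTH 5

/*
 * Stub with parse_ref()'s documented return contract:
 *   -2  loose file missing (real caller falls back to packed refs)
 *   -1  hard error
 *    0  symref followed, caller must loop
 *   +1  normal ref resolved
 * The stub pretends the first 'hops' lookups hit symrefs.
 */
static int stub_parse_ref(int *hops)
{
	return (*hops)-- > 0 ? 0 : 1;
}

/* The caller loop in the shape resolve_ref() has after this series. */
int resolve_with_stub(int hops)
{
	int depth = MAXDEPTH;
	int ret;

	do {
		if (--depth < 0)
			return -1;	/* the real code sets errno = ELOOP */
		ret = stub_parse_ref(&hops);
	} while (!ret);

	if (ret == 1)
		return 0;	/* resolved */
	if (ret == -2)
		return 0;	/* real code: consult the packed-refs store */
	return -1;		/* hard error */
}
```

Both resolve_ref() and the rewritten resolve_gitlink_ref() reduce to this shape, differing only in how the -2 fallback is served.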
[PATCH v2 10/11] resolve_gitlink_packed_ref(): inline function
Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 31 +++ 1 file changed, 11 insertions(+), 20 deletions(-) diff --git a/refs.c b/refs.c index a7c8abd..8edcc3b 100644 --- a/refs.c +++ b/refs.c @@ -1280,25 +1280,6 @@ static struct ref_dir *get_loose_refs(struct ref_cache *refs) /* We allow recursive symbolic refs. Only within reason, though */ #define MAXDEPTH 5 -/* - * Called by resolve_gitlink_ref_recursive() after it failed to read - * from the loose refs in ref_cache refs. Find refname in the - * packed-refs file for the submodule. - */ -static int resolve_gitlink_packed_ref(struct ref_cache *refs, - const char *refname, unsigned char *sha1) -{ - struct ref_entry *ref; - struct ref_dir *dir = get_packed_refs(refs); - - ref = find_ref(dir, refname); - if (ref == NULL) - return -1; - - hashcpy(sha1, ref-u.value.sha1); - return 0; -} - int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sha1) { struct strbuf result = STRBUF_INIT; @@ -1331,7 +1312,17 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sh if (parseval == 1) { ret = 0; } else if (parseval == -2) { - ret = resolve_gitlink_packed_ref(refs, result.buf, sha1) ? -1 : 0; + /* Loose ref doesn't exist; check for a packed ref */ + struct ref_entry *entry; + struct ref_dir *dir = get_packed_refs(refs); + + entry = find_ref(dir, result.buf); + if (entry) { + hashcpy(sha1, entry-u.value.sha1); + ret = 0; + } else { + ret = -1; + } } else { ret = -1; } -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 00/11] Consolidate ref parsing code
This is a riff on Duy's oldish patch series [1]. I started reviewing his patch series, but found that some of his patches did multiple things, making it harder to review. I started pulling it apart into smaller changes to aid my review, and I guess I got carried away :-/ As far as I know, Duy isn't actively working on this, so I hope my reroll is not unwelcome. As far as I can tell, Duy's patch series was correct aside from a couple of minor cosmetic blemishes [2]. So if you want to accept Duy's original patch series, I hereby endorse it.

This version does the following things beyond Duy's original:

* Split commits up into smaller pieces.
* Get rid of the MAXREFLEN limitation.
* Rename the parse_ref() parameter from "ref" to "refname".
* Inline resolve_gitlink_packed_ref() and handle_missing_loose_ref().
* Invert the if statement for dealing with symbolic references in parse_ref() to make the logic flow more linear.
* Change a couple of while loops to do..while.
* Change resolve_refdup() to return strbuf_detach(&buf, NULL) instead of buf.buf directly (as suggested by Eric Sunshine).

I retained Duy as author on commits that are derived straightforwardly from his. I hope I haven't broken any of them. I am calling this patch series v2 because I propose it as a successor to Duy's version.

[1] http://thread.gmane.org/gmane.comp.version-control.git/254203/focus=254203
[2] Aside from the couple of things pointed out on the mailing list:
* The parse_ref() parameter should be named "refname" instead of "ref", for consistency with other refs code.
* The local variable ref = result->buf in resolve_ref() just obscures things and should be inlined.

Michael Haggerty (5):
  resolve_ref_unsafe(): reverse the logic of the symref conditional
  handle_missing_loose_ref(): inline function
  resolve_gitlink_ref_recursive(): drop arbitrary refname length limit
  resolve_gitlink_packed_ref(): inline function
  resolve_gitlink_ref(): remove redundant test

Nguyễn Thái Ngọc Duy (6):
  strbuf_read_file(): preserve errno on failure
  handle_missing_loose_ref(): return an int
  resolve_ref_unsafe(): use skip_prefix() to skip over "ref:"
  refs.c: refactor resolve_ref_unsafe() to use strbuf internally
  refs.c: move ref parsing code out of resolve_ref()
  refs.c: rewrite resolve_gitlink_ref() to use parse_ref()

 cache.h  |  12 +++
 refs.c   | 371 +++
 strbuf.c |   7 +-
 3 files changed, 200 insertions(+), 190 deletions(-)

-- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 04/11] resolve_ref_unsafe(): use skip_prefix() to skip over ref:
From: Nguyễn Thái Ngọc Duy pclo...@gmail.com This requires buf to be declared (const char *), which is how it is used anyway. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/refs.c b/refs.c index 771941b..020ee3f 100644 --- a/refs.c +++ b/refs.c @@ -1416,7 +1416,7 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea for (;;) { char path[PATH_MAX]; struct stat st; - char *buf; + const char *buf; int fd; if (--depth 0) { @@ -1497,11 +1497,10 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea len--; buffer[len] = '\0'; - if (starts_with(buffer, ref:)) { + if (skip_prefix(buffer, ref:, buf)) { /* It is a symbolic ref */ if (flag) *flag |= REF_ISSYMREF; - buf = buffer + 4; while (isspace(*buf)) buf++; if (check_refname_format(buf, REFNAME_ALLOW_ONELEVEL)) { -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
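skip_prefix() folds the starts_with() test and the `buffer + 4` pointer arithmetic into one step, which is the whole cleanup here. A minimal standalone version with the same shape as git's helper (the real one lives in git-compat-util.h):

```c
#include <string.h>

/*
 * If str starts with prefix, set *out to the text after it and return
 * nonzero; otherwise return 0 and leave *out alone.  Same shape as
 * git's skip_prefix().
 */
int skip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}
```

With this, the symref branch reads `if (skip_prefix(buffer, "ref:", &buf))` and the follow-up whitespace skip no longer needs a magic offset.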
Re: [PATCH 0/4] Multiple worktrees vs. submodules fixes
Duy Nguyen pclo...@gmail.com writes: On Wed, Oct 15, 2014 at 3:31 AM, Max Kirillov m...@max630.net wrote: On Tue, Oct 14, 2014 at 07:09:45PM +0200, Jens Lehmann wrote: Until that problem is solved it looks wrong to pass GIT_COMMON_DIR into submodule recursion, I believe GIT_COMMON_DIR should be added to the local_repo_env array (and even if it is passed on later, we might have to append /modules/submodule_name to make it point to the correct location). Actually, why there should be an _environment_ variable GIT_COMMON_DIR at all? I mean, gitdir is resolved to some directory (through link or environment), and it contains the shared data directly or referes to it with the commondir link. In which case anyone would want to override that location? It's how the implementation was built up. First I split the repo in two and I need something to point the new/shared part. $GIT_DIR/worktrees comes later to glue them up. Keeping it an environment variable gives us a possibility to glue things up in a different way than using $GIT_DIR/worktrees. Of course the odds of anybody doing that are probably small or even near zero. Still consuming the rest of this thread. Thanks all for working on the submodule support for this. Hmph. I was hoping that the multiple-work-trees topic was ready for 'next' by now, but we may want to wait to see how the interaction with submodule plays out to have another chance of a clean reroll before it happens. This is a topic with large impact and is quite intrusive code-wise, even though the intrusive parts are cleanly done. So we'd want to take more time to unleash it to the general public than more usual small scale topics, anyway. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
Marc Branchaud marcn...@xiplink.com writes: I think things would be more understandable if the option was --dissociate <repository> and was an explicit alternative to --reference: [[--reference | --dissociate] <repository>] I'm still not liking the name --dissociate though. The original suggestion of --borrow is better. Perhaps --library or --local-cache? I dunno...

I was not thinking when I originally started the topic with --borrow, until I realized that it would not make much sense, primarily because we allow multiple references. What should this command line do, and how would you implement such a behaviour?

    $ git clone \
          --reference=/local/pool/linux.git \
          --borrow=../my/neighbour/linux-hack.git \
          git://git.kernel.org/./linux.git

With a "do the usual --reference thing, but then dissociate the result from referents" option, there is no ambiguity, and that is why I did not go with the --borrow option suggested in the original thread.

So now I'm wondering if the implementation would be more efficient as an extension of the --local operation. That is, instead of a post-clone repack, do a --local clone first followed by a simple git fetch from the source repo.

The network overhead may be comparable to the --reference optimization, but if your clone --local ends up copying (instead of hard-linking), the initial cost to copy locally would be a pure extra price over clone --reference and then --dissociate. If the local clone uses hard-linking, it would be cheaper, but it still costs more than dropping an entry into .git/objects/info/alternates, I would imagine. You will pay with your scheme the same cost to run repack -a -d, which is paid by --dissociate at the end of clone, eventually at the first gc, so there is no efficiency advantage, either. The above is my knee-jerk assessment without any measuring, though.
-- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] Allow building Git with Asciidoctor
brian m. carlson sand...@crustytoothpaste.net writes: On Tue, Oct 14, 2014 at 10:08:19AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: On Sat, Oct 11, 2014 at 11:37:32PM +, brian m. carlson wrote: Specifically I'm not excited about getting into a state where we have to maintain both an asciidoc.conf file _and_ ruby extensions for asciidoctor. I don't mind if somebody wants to step up and keep the asciidoctor bits in sync with the asciidoc.conf, but I feel like one of them needs to be considered the master. I do not mind to have the machinery to run AsciiDoctor too much in my tree. It may make it easier for those who use it to spot places in *.txt that need (in)compatibility workarounds between the two formatters than keeping it outside. Alternately, I'm happy to be responsible for maintaining the extensions.rb file. Let's see how well the patches 3 and 4 work for other people with AsciiDoctor and then decide to go in that direction. I do not forsee that changes to allow our *.txt to be used with AsciiDoctor interfere with what GitHub folks do with their own documentation toolchain, but I am not sure how the AsciiDoctor specific alternative build infrastructure we would eventually ship would interact with them---maybe they are not affected at all, or maybe they can even take advantage of it. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] subtree: Add an install-html target
Also adjust ignore rules accordingly. Signed-off-by: Sebastian Schuberth sschube...@gmail.com --- contrib/subtree/.gitignore | 3 ++- contrib/subtree/Makefile | 9 +++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/contrib/subtree/.gitignore b/contrib/subtree/.gitignore index 91360a3..0b9381a 100644 --- a/contrib/subtree/.gitignore +++ b/contrib/subtree/.gitignore @@ -1,6 +1,7 @@ *~ git-subtree -git-subtree.xml git-subtree.1 +git-subtree.html +git-subtree.xml mainline subproj diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile index c2bd703..3071baf 100644 --- a/contrib/subtree/Makefile +++ b/contrib/subtree/Makefile @@ -5,9 +5,10 @@ all:: -include ../../config.mak prefix ?= /usr/local -mandir ?= $(prefix)/share/man gitexecdir ?= $(prefix)/libexec/git-core +mandir ?= $(prefix)/share/man man1dir ?= $(mandir)/man1 +htmldir ?= $(prefix)/share/doc/git-doc ../../GIT-VERSION-FILE: FORCE $(MAKE) -C ../../ GIT-VERSION-FILE @@ -49,12 +50,16 @@ install: $(GIT_SUBTREE) $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir) $(INSTALL) -m 755 $(GIT_SUBTREE) $(DESTDIR)$(gitexecdir) -install-doc: install-man +install-doc: install-man install-html install-man: $(GIT_SUBTREE_DOC) $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir) $(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir) +install-html: $(GIT_SUBTREE_HTML) + $(INSTALL) -d -m 755 $(DESTDIR)$(htmldir) + $(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir) + $(GIT_SUBTREE_DOC): $(GIT_SUBTREE_XML) $(XMLTO) -m $(MANPAGE_XSL) man $^ -- 1.9.4.msysgit.2 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] contacts: Add a Makefile to generate docs and install
Also add a gitignore file for generated files. Signed-off-by: Sebastian Schuberth sschube...@gmail.com --- contrib/contacts/.gitignore | 3 ++ contrib/contacts/Makefile | 71 + 2 files changed, 74 insertions(+) create mode 100644 contrib/contacts/.gitignore create mode 100644 contrib/contacts/Makefile diff --git a/contrib/contacts/.gitignore b/contrib/contacts/.gitignore new file mode 100644 index 000..bc9ae70 --- /dev/null +++ b/contrib/contacts/.gitignore @@ -0,0 +1,3 @@ +git-contacts.1 +git-contacts.html +git-contacts.xml diff --git a/contrib/contacts/Makefile b/contrib/contacts/Makefile new file mode 100644 index 000..a2990f0 --- /dev/null +++ b/contrib/contacts/Makefile @@ -0,0 +1,71 @@ +# The default target of this Makefile is... +all:: + +-include ../../config.mak.autogen +-include ../../config.mak + +prefix ?= /usr/local +gitexecdir ?= $(prefix)/libexec/git-core +mandir ?= $(prefix)/share/man +man1dir ?= $(mandir)/man1 +htmldir ?= $(prefix)/share/doc/git-doc + +../../GIT-VERSION-FILE: FORCE + $(MAKE) -C ../../ GIT-VERSION-FILE + +-include ../../GIT-VERSION-FILE + +# this should be set to a 'standard' bsd-type install program +INSTALL ?= install +RM ?= rm -f + +ASCIIDOC = asciidoc +XMLTO= xmlto + +ifndef SHELL_PATH + SHELL_PATH = /bin/sh +endif +SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH)) + +ASCIIDOC_CONF = ../../Documentation/asciidoc.conf +MANPAGE_XSL = ../../Documentation/manpage-normal.xsl + +GIT_CONTACTS := git-contacts + +GIT_CONTACTS_DOC := git-contacts.1 +GIT_CONTACTS_XML := git-contacts.xml +GIT_CONTACTS_TXT := git-contacts.txt +GIT_CONTACTS_HTML := git-contacts.html + +doc: $(GIT_CONTACTS_DOC) $(GIT_CONTACTS_HTML) + +install: $(GIT_CONTACTS) + $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir) + $(INSTALL) -m 755 $(GIT_CONTACTS) $(DESTDIR)$(gitexecdir) + +install-doc: install-man install-html + +install-man: $(GIT_CONTACTS_DOC) + $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir) + $(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir) + +install-html: $(GIT_CONTACTS_HTML) + 
$(INSTALL) -d -m 755 $(DESTDIR)$(htmldir) + $(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir) + +$(GIT_CONTACTS_DOC): $(GIT_CONTACTS_XML) + $(XMLTO) -m $(MANPAGE_XSL) man $^ + +$(GIT_CONTACTS_XML): $(GIT_CONTACTS_TXT) + $(ASCIIDOC) -b docbook -d manpage -f $(ASCIIDOC_CONF) \ + -agit_version=$(GIT_VERSION) $^ + +$(GIT_CONTACTS_HTML): $(GIT_CONTACTS_TXT) + $(ASCIIDOC) -b xhtml11 -d manpage -f $(ASCIIDOC_CONF) \ + -agit_version=$(GIT_VERSION) $^ + +clean: + $(RM) $(GIT_CONTACTS) + $(RM) *.xml *.html *.1 + +.PHONY: FORCE -- 1.9.4.msysgit.2 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
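The `SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))` line in the Makefile above is the usual single-quote-escaping idiom. A hedged plain-shell equivalent (the `sq_escape` name is illustrative, not from the Makefile):

```shell
# Reproduce the Makefile's $(subst ','\'',...) trick in shell: close the
# quote, emit an escaped quote, reopen -- so a value containing single
# quotes survives being pasted into a single-quoted command line.
sq_escape () {
	printf '%s\n' "$1" | sed "s/'/'\\\\''/g"
}
sq_escape "/opt/o'brien/sh"
```

The escaped result can then be embedded safely between single quotes in a generated script, which is exactly what SHELL_PATH_SQ is for.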
Re: [PATCH 0/4] Multiple worktrees vs. submodules fixes
Am 15.10.2014 um 00:15 schrieb Max Kirillov: On Tue, Oct 14, 2014 at 09:51:22PM +0200, Jens Lehmann wrote: Am 14.10.2014 um 20:34 schrieb Max Kirillov: But here are a lot of nuances. For example, it makes sense to have a superproject checkout without submodules being initialized (so that they don't waste space and machine time for working tree, which often is more than repository data). Hmm, I'm not sure if this is a problem. If the GIT_COMMON_DIR does have the submodule repo but it isn't initialized locally, we shouldn't have a problem (except for wasting some disk space if not a single checkout-to superproject initializes this submodule). If initially a repository is clone without submodules, it will not have anything in the GIT_COMMON_DIR. Ok. And if GIT_COMMON_DIR does not have the submodule repo yet, wouldn't it be cloned the moment we init the submodule in the checkout-to? Or would that need extra functionality? I cannot say I like this. Network operations should be caused only by clone and submodules. Sure (and please add fetch to the list ;-). Maybe I confused you by saying init when I meant the submodule update run after initializing the submodule? I think the logic can be simple: it a submodule is not checked-out in the repository checkout --to is called from, then it is not checked-out to the new one also. If it is, then checkout calls itself recursively in the submodule and works like being run in standalone repository. But when I later decide to populate the submodule in a checkout --to work tree, should it automagically also use the central storage, creating the modules/name directory there if it doesn't exist yet? I think that'd make sense to avoid having the work tree layout depend on the order commands were ran in. And imagine new submodules, they should not be handled differently from those already present. Then, a checkout copy of a submodule can be standalone (for example, git and git-html-docs are submodules of msysgit). 
Or, it can even belong to some other superproject. And in that cases they still should be able to be linked. Maybe such configurations would have to be handled manually to achieve maximum savings. At least I could live with that. To make manual handling of the cases, and to skip checking-out a module. I would think about the following interface: $ git checkout --to ... - does not checkout submodules, creates empty directory. This is what checkout should always do (at least until it learns --recurse-submodules, then it would populate the submodule directories). $ git checkout --recursive --to ... - if a submodule is checked-out in source repository, recursed there and run checkout --recursive again. If a submodule is not checked-out, does not checkout it, creates an empty directory. Hmm, I believe that when the user requests recursion explicitly it should always be checked out, no matter in what state the GIT_COMMON_DIR is in. Otherwise we'll see problems when a new submodule shows up and is populated depending on the point in time the checkout --to was done ... not good. By the way, I have found your branch recursive_submodule_checkout. Would you like to revive it? Then it can be done with the same option. No need to revive it, I'm currently working on that branch whenever I find some time (but I'll still need some time before I can post the next iteration). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mergetools/meld: do not rely on the output of `meld --help`
David Aguilar dav...@gmail.com writes: We cannot rely on the output of `meld --help` when determining whether or not meld understands the --output option. Newer versions of meld print a generic help message that does not mention --output even though it is supported. This obviously breaks those who have happily been using their installed version of meld that understands and shows --output in the help text. Is that a minority that is rapidly diminishing? I would understand it if the change were - a configuration tells us to use or not use --output; when it is set, then we do not try auto-detect by reading --help output - when that new configuration is not set, we keep the current code to read --help output, which may fail for recent meld but that is not a regression. When versions of meld that support --output but do not mention it in their --help text are overwhelming majority, we would want to flip the fallback codepath from read --help and decide to assume that --output can be used, but I do not know if now is the time to do so. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mergetools/meld: do not rely on the output of `meld --help`
Junio C Hamano gits...@pobox.com writes: This obviously breaks those who have happily been using their installed version of meld that understands and shows --output in the help text. Is that a minority that is rapidly diminishing? I would understand it if the change were - a configuration tells us to use or not use --output; when it is set, then we do not try auto-detect by reading --help output - when that new configuration is not set, we keep the current code to read --help output, which may fail for recent meld but that is not a regression. When versions of meld that support --output but do not mention it in their --help text are overwhelming majority, we would want to flip the fallback codepath from read --help and decide to assume that --output can be used, but I do not know if now is the time to do so. In other words, I am wondering if a milder fix would be along this line of change instead. Older versions seem to list --output explicitly, and we assume newer ones including the one reported by Andrey begin their output like so: $ meld --help Usage: meld [OPTION...] hence we catch either of these patterns. mergetools/meld | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mergetools/meld b/mergetools/meld index cb672a5..b6169c9 100644 --- a/mergetools/meld +++ b/mergetools/meld @@ -23,8 +23,12 @@ check_meld_for_output_version () { meld_path=$(git config mergetool.meld.path) meld_path=${meld_path:-meld} - if $meld_path --help 21 | grep -e --output /dev/null + if meld_has_output_option=$(git config --bool mergetool.meld.hasOutput) then + : use whatever is configured + elif $meld_path --help 21 | grep -e '--output=' -e '\[OPTION\.\.\.\]' /dev/null + then + : old ones explicitly listed --output and new ones just say OPTION... 
meld_has_output_option=true else meld_has_output_option=false -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
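The precedence Junio's patch establishes can be modelled as a standalone function (a hedged sketch; `meld_has_output` and `help_text` are stand-ins, not names from mergetools/meld):

```shell
# Order of precedence from the patch: an explicit mergetool.meld.hasOutput
# setting wins; otherwise grep the --help text for either the old explicit
# "--output=" listing or the new generic "[OPTION...]" usage line.
meld_has_output () {
	configured=$1	# "true", "false", or "" when unset
	help_text=$2
	if [ -n "$configured" ]
	then
		echo "$configured"
	elif echo "$help_text" | grep -e '--output=' -e '\[OPTION\.\.\.\]' >/dev/null
	then
		echo true
	else
		echo false
	fi
}
meld_has_output "" "Usage: meld [OPTION...]"
```

This keeps the auto-detection as a fallback while letting a user whose meld is misdetected override it with one config setting.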
Re: [PATCH] mergetool: add an option for writing to a temporary directory
----- Original Message -----
From: David Aguilar dav...@gmail.com
To: Junio C Hamano gits...@pobox.com
Cc: Robin Rosenberg robin.rosenb...@dewire.com, git@vger.kernel.org, Charles Bailey char...@hashpling.org
Sent: Wednesday, 15 Oct 2014 8:38:49
Subject: Re: [PATCH] mergetool: add an option for writing to a temporary directory

> On Mon, Oct 13, 2014 at 12:24:41PM -0700, Junio C Hamano wrote:
>> David Aguilar dav...@gmail.com writes:
>>> Teach mergetool to write files in a temporary directory when
>>> 'mergetool.writeToTemp' is true. This is helpful for tools such as
>>> Eclipse which cannot cope with multiple copies of the same file in
>>> the worktree.
>>
>> With this can we drop the "change the temporary file name" patch by
>> Robin Rosenberg?
>>
>> http://thread.gmane.org/gmane.comp.version-control.git/255457/focus=255599
>> Message-Id: 1408607240-11369-1-git-send-email-robin.rosenb...@dewire.com
>
> I would think so but I'm biased ;-)

The new patch solves my problem.

-- robin
Re: Custom hunk-header with ignore case setting
Thomas Braun thomas.br...@virtuell-zuhause.de writes: I've seen that the builtin diff patterns in userdiff.c can be specified ignoring case using the IPATTERN macro. One of the possible solutions would be to patch userdiff.c (patch courtesy of Johannes Schindelin): -- snip -- diff --git a/userdiff.c b/userdiff.c index fad52d6..f089e50 100644 --- a/userdiff.c +++ b/userdiff.c @@ -228,6 +228,9 @@ int userdiff_config(const char *k, const char *v) return parse_funcname(drv-funcname, k, v, 0); if (!strcmp(type, xfuncname)) return parse_funcname(drv-funcname, k, v, REG_EXTENDED); + if (!strcmp(type, ixfuncname)) + return parse_funcname(drv-funcname, k, v, + REG_EXTENDED | REG_ICASE); if (!strcmp(type, binary)) return parse_tristate(drv-binary, k, v); if (!strcmp(type, command)) I am not sure if we care deeply about supporting case insensitive payload in the first place, but the above change, unlike other possibilities, adds only small burden to the end users' cognitive load, and it looks like a sensible way to go forward. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
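The effect of the proposed REG_ICASE flag can be illustrated with grep as a hedged stand-in for regcomp (the pattern below is an example, not one of the builtin userdiff patterns):

```shell
# The same extended regex, matched case-sensitively and then with the
# grep equivalent of REG_ICASE (-i): only the latter accepts an
# upper-case keyword in a would-be hunk-header line.
pattern='^[[:space:]]*(procedure|function)[[:space:]]'
printf '%s\n' "PROCEDURE Foo;" | grep -Eq  "$pattern" && echo match || echo no-match
printf '%s\n' "PROCEDURE Foo;" | grep -Eqi "$pattern" && echo match || echo no-match
```

With the patch applied, users would presumably set such a pattern as `ixfuncname` alongside the existing `funcname`/`xfuncname` keys in a `diff.<driver>` section; the key name comes from the proposed patch, not from released git.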
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
Am 14.10.2014 um 21:57 schrieb Junio C Hamano: +static void dissociate_from_references(void) +{ + struct child_process cmd; + + memset(cmd, 0, sizeof(cmd)); We have CHILD_PROCESS_INIT these days. + argv_array_pushl(cmd.args, repack, -a, -d, NULL); + cmd.git_cmd = 1; + cmd.out = -1; This requests a pipe, but you don't use it nor need it. + cmd.no_stdin = 1; + if (run_command(cmd)) + die(_(cannot repack to clean up)); + if (unlink(git_path(objects/info/alternates)) errno != ENOENT) + die_errno(_(cannot unlink temporary alternates file)); +} Unless you have a secret plan, you can do it even shorter with our helpers: diff --git a/builtin/clone.c b/builtin/clone.c index 8649090..81efb07 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -743,14 +743,9 @@ static void write_refspec_config(const char *src_ref_prefix, static void dissociate_from_references(void) { - struct child_process cmd; - - memset(cmd, 0, sizeof(cmd)); - argv_array_pushl(cmd.args, repack, -a, -d, NULL); - cmd.git_cmd = 1; - cmd.out = -1; - cmd.no_stdin = 1; - if (run_command(cmd)) + static const char* argv[] = { repack, -a, -d, NULL }; + + if (run_command_v_opt(argv, RUN_GIT_CMD|RUN_COMMAND_NO_STDIN)) die(_(cannot repack to clean up)); if (unlink(git_path(objects/info/alternates)) errno != ENOENT) die_errno(_(cannot unlink temporary alternates file)); -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-prompt.sh: Option to hide prompt for ignored pwd
On 2014-10-15 00:06, Jess Austin wrote: @@ -501,6 +506,13 @@ __git_ps1 () local f=$w$i$s$u local gitstring=$c$b${f:+$z$f}$r$p + if [ -n ${GIT_PS1_HIDE_IF_PWD_IGNORED} ] +[ $(git config --bool bash.hideIfPwdIgnored) != false ] +git check-ignore -q . + then + printf_format= + fi + if [ $pcmode = yes ]; then if [ ${__git_printf_supports_v-} != yes ]; then gitstring=$(printf -- $printf_format $gitstring) This is broken in pcmode due to a Bash bug. The command: printf -v foo asdf is a no-op in Bash. The variable foo is never changed in any way -- it is neither unset nor set to the empty string. Also, I noticed that I get an error message if I cd into .git: fatal: This operation must be run in a work tree I think the following change will fix the above issues, and it has the advantage of avoiding unnecessary work if the directory is ignored: diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index 6a4ce53..68ac82a 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -374,6 +374,17 @@ __git_ps1 () local inside_gitdir=${repo_info##*$'\n'} local g=${repo_info%$'\n'*} + if [ true = $inside_worktree ] + [ -n ${GIT_PS1_HIDE_IF_PWD_IGNORED} ] + [ $(git config --bool bash.hideIfPwdIgnored) != false ] + git check-ignore -q . + then + if [ $pcmode = yes ]; then + PS1=$ps1pc_start$ps1pc_end + fi + return + fi + local r= local b= local step= @@ -506,13 +517,6 @@ __git_ps1 () local f=$w$i$s$u local gitstring=$c$b${f:+$z$f}$r$p - if [ -n ${GIT_PS1_HIDE_IF_PWD_IGNORED} ] - [ $(git config --bool bash.hideIfPwdIgnored) != false ] - git check-ignore -q . - then - printf_format= - fi - if [ $pcmode = yes ]; then if [ ${__git_printf_supports_v-} != yes ]; then gitstring=$(printf -- $printf_format $gitstring) It would be good to add additional test cases for pcmode (two or three arguments to __git_ps1) and 'cd .git' so that the above issues don't reappear. 
Thanks,
Richard
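The guard in Richard's early-return hunk boils down to four conditions. A hedged sketch with the git calls replaced by plain parameters (`hide_prompt_here` is an illustrative name, not from git-prompt.sh):

```shell
# Hide the prompt only when: we are inside the work tree ($1), the user
# opted in via GIT_PS1_HIDE_IF_PWD_IGNORED ($2 non-empty),
# bash.hideIfPwdIgnored is not "false" ($3), and `git check-ignore -q .`
# would succeed (modelled here as $4 = yes).
hide_prompt_here () {
	[ "true" = "$1" ] &&
	[ -n "$2" ] &&
	[ "$3" != "false" ] &&
	[ "$4" = "yes" ]
}
hide_prompt_here true 1 "" yes && echo hidden || echo shown
hide_prompt_here true "" "" yes && echo hidden || echo shown   # opt-in unset
```

Checking `inside_worktree` first is what avoids the "This operation must be run in a work tree" error when the current directory is inside .git.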
Re: [PATCH 3/4] Documentation: move some AsciiDoc parameters into variables
brian m. carlson sand...@crustytoothpaste.net writes: Asciidoctor takes slightly different arguments from AsciiDoc in some cases. It has a different name for the HTML backend and the docbook backend produces DocBook 5, not DocBook 4.5. Also, Asciidoctor does not accept the -f option. Move these values into variables so that they can be overridden by users wishing to use Asciidoctor instead of Asciidoc. Signed-off-by: brian m. carlson sand...@crustytoothpaste.net --- I think it makes sense to make these customizable, but I wonder if it makes the result easier to maintain if we make units of logical definitions larger, e.g. ASCIIDOC = asciidoc TXT_TO_MANHTML = $(ASCIIDOC) -b xhtml11 -d manpage $(ASCIIDOC_CONF) TXT_TO_ARTICLE = $(ASCIIDOC) -b docbook -d article ... ASCIIDOC_EXTRA may want to apply all of them, even though I see that we do not feed it to OBSOLETE_HTML right now. It may also be that $(ASCIIDOC_CONF) and -agit_version=$(GIT_VERSION) could be shared among the ones that currently do not have. Then the above would become something like: ASCIIDOC = asciidoc ASCIIDOC_COMMON = $(ASCIIDOC) \ $(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF) -agit_version=$(GIT_VERSION) TXT_TO_MANHTML = $(ASCIIDOC_COMMON) -b xhtml11 -d manpage ... and would further simplify this part $(MAN_HTML): %.html : %.txt asciidoc.conf $(QUIET_ASCIIDOC)$(RM) $@+ $@ \ - $(ASCIIDOC) -b xhtml11 -d manpage -f asciidoc.conf \ + $(ASCIIDOC) -b $(ASCIIDOC_HTML) -d manpage $(ASCIIDOC_CONF) \ $(ASCIIDOC_EXTRA) -agit_version=$(GIT_VERSION) -o $@+ $ \ into just $(TXT_TO_MANHTML) -o $@+ $ After all, our output formats are fairly limited, I would think. Are there too many different variants and exceptions to make such an approach infeasible? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
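The consolidation Junio sketches might look like this as a Makefile fragment (hedged: the variable names follow his message, and the recipe is illustrative rather than the final Documentation/Makefile rule):

```make
# One shared prefix carrying the bits common to every asciidoc
# invocation, with per-format commands layered on top.
ASCIIDOC        = asciidoc
ASCIIDOC_COMMON = $(ASCIIDOC) $(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF) \
                  -agit_version=$(GIT_VERSION)
TXT_TO_MANHTML  = $(ASCIIDOC_COMMON) -b xhtml11 -d manpage
TXT_TO_XML      = $(ASCIIDOC_COMMON) -b docbook -d manpage

$(MAN_HTML): %.html : %.txt asciidoc.conf
	$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
	$(TXT_TO_MANHTML) -o $@+ $< && \
	mv $@+ $@
```

An Asciidoctor user would then only need to override ASCIIDOC and the backend names, instead of editing every rule that invokes the tool.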
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
On 14-10-15 01:29 PM, Junio C Hamano wrote: Marc Branchaud marcn...@xiplink.com writes: I think things would be more understandable if the option was --dissociate repository and was an explicit alternative to --reference: [[--reference | --dissociate] repository] I'm still not liking the name --dissociate though. The original suggestion of --borrow is better. Perhaps --library or --local-cache? I dunno... I was not thinking when I originally started the topic with --borrow, until I realized that it would not make much sense, primarily because we allow multiple references. I hadn't realized that was possible. There's no indication in the man page that multiple --references are allowed (or forbidden, for that matter). What should this command line do, and how would you implement such a behaviour? $ git clone \ --reference=/local/pool/linux.git \ --borrow=../my/neighbour/linux-hack.git \ git://git.kernel.org/./linux.git With do the usual --reference thing, but then dissociate the result from referents option, there is no ambiguity and that is why I did not go with the --borrow option suggested in the original thread. I had not considered this case. My limited imagination has a hard time coming up with a scenario where more than one --reference (or --borrow/--dissociate) would make sense. In this example, the --borrow seems useless. How would clone decide that it even needed objects from the neighbour repo? None of the refs on gko need any of the neighbour's unique objects. (I get the feeling I don't understand how clone works...) So now I'm wondering if the implementation would be more efficient as an extension of the --local operation. That is, instead of a post-clone repack, do a --local clone first followed by a simple git fetch from the source repo. 
The network overhead may be comparable to the --reference optimization, but if your clone --local ends up copying (instead of hard-linking), the initial cost to copy locally would be a pure extra price over clone --reference and then --dissociate. If the local clone uses hard-linking, it would be cheaper, but it still costs more than dropping an entry into .git/objects/info/alternates, I would imagine. You will pay with your scheme the same cost to run repack -a -d, which is paid by --dissociate at the end of clone, eventually at the first gc, so there is no efficiency advantage, either. The above is my knee-jerk assessment without any measuring, though. That makes sense to me, at least. I agree with your assessment. M. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
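What --dissociate would automate can already be done by hand today: clone with --reference, repack everything locally, then drop the alternates file. A hedged end-to-end sketch using throwaway local repositories standing in for the upstream and the shared pool:

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.com
export GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.com

# stand-in for the remote we are cloning
git init -q "$tmp/upstream"
git -C "$tmp/upstream" commit -q --allow-empty -m one

# stand-in for the shared pool we borrow objects from
git clone -q --bare "$tmp/upstream" "$tmp/pool.git"

# borrow during the clone, then dissociate: repack all objects locally
# and remove the alternates entry pointing at the pool
git clone -q --reference "$tmp/pool.git" "$tmp/upstream" "$tmp/work"
git -C "$tmp/work" repack -a -d -q
rm -f "$tmp/work/.git/objects/info/alternates"

git -C "$tmp/work" fsck --full && echo dissociated-ok
```

After the repack, deleting the pool no longer harms the clone, which is the whole point of dissociating.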
Re: [PATCH 4/5] t7610-mergetool: prefer test_config over git config
David Aguilar dav...@gmail.com writes: Signed-off-by: David Aguilar dav...@gmail.com --- The reason why this change favours test_config over git config is not described here, but if we think about that, some of the changes in this may become questionable. diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh index 1a15e06..7eeb207 100755 --- a/t/t7610-mergetool.sh +++ b/t/t7610-mergetool.sh @@ -14,7 +14,7 @@ Testing basic merge tool invocation' # running mergetool test_expect_success 'setup' ' - git config rerere.enabled true + test_config rerere.enabled true echo master file1 echo master spaced spaced name echo master file11 file11 This one does not have corresponding git config to remove the configuration when the setup is done, so this changes the behaviour. The remainder of the tests will run without rerere.enabled set. If this change does not make a difference, perhaps we do not even need to set rerere.enabled here in the first place? But do later tests perform merges that can potentially be affected by the setting of rerere.enabled? If so, this change can break them. If this change does not break existing tests, I would say this is a good change that anticipates that we may add more tests in the future that work in subdir and that makes sure to leave subdir in a predictable state for them. @@ -112,7 +112,7 @@ test_expect_success 'custom mergetool' ' ' test_expect_success 'mergetool crlf' ' - git config core.autocrlf true + test_config core.autocrlf true git checkout -b test2 branch1 test_must_fail git merge master /dev/null 21 ( yes | git mergetool file1 /dev/null 21 ) @@ -129,7 +129,7 @@ test_expect_success 'mergetool crlf' ' git submodule update -N test $(cat submod/bar) = master submodule git commit -m branch1 resolved with mergetool - autocrlf - git config core.autocrlf false + test_config core.autocrlf false git reset --hard ' This one wants to set core.autocrlf to true while it runs, and then wants to reset to the original. 
When the test fails in the middle, however, we may abort this test and move on to the next one, while leaving core.autcrlf still set to true, which may break later tests, hence use of test_config to ensure that the setting is reverted at the end of the test is the right thing to do. So the hunk #112 is correct, but the hunk #129 is questionable and even misleading. @@ -176,7 +176,7 @@ test_expect_success 'mergetool skips autoresolved' ' test_expect_success 'mergetool merges all from subdir' ' ( cd subdir - git config rerere.enabled false + test_config rerere.enabled false test_must_fail git merge master ( yes r | git mergetool ../submod ) ( yes d d | git mergetool --no-prompt ) Same comment as the one for hunk #14 applies here. In principle, this change will make this test more isolated and is a good thing. @@ -190,7 +190,7 @@ test_expect_success 'mergetool merges all from subdir' ' ' test_expect_success 'mergetool skips resolved paths when rerere is active' ' - git config rerere.enabled true + test_config rerere.enabled true rm -rf .git/rr-cache git checkout -b test5 branch1 git submodule update -N Ditto. @@ -204,7 +204,7 @@ test_expect_success 'mergetool skips resolved paths when rerere is active' ' ' test_expect_success 'conflicted stash sets up rerere' ' - git config rerere.enabled true + test_config rerere.enabled true git checkout stash1 echo Conflicting stash content file11 git stash Ditto. @@ -232,7 +232,7 @@ test_expect_success 'conflicted stash sets up rerere' ' test_expect_success 'mergetool takes partial path' ' git reset --hard - git config rerere.enabled false + test_config rerere.enabled false git checkout -b test12 branch1 git submodule update -N test_must_fail git merge master Ditto. 
@@ -505,14 +505,12 @@ test_expect_success 'file with no base' ' test_expect_success 'custom commands override built-ins' ' git checkout -b test14 branch1 - git config mergetool.defaults.cmd cat \\$REMOTE\ \\$MERGED\ - git config mergetool.defaults.trustExitCode true + test_config mergetool.defaults.cmd cat \\$REMOTE\ \\$MERGED\ + test_config mergetool.defaults.trustExitCode true test_must_fail git merge master git mergetool --no-prompt --tool defaults -- both echo master both added expected test_cmp both expected - git config --unset mergetool.defaults.cmd - git config --unset mergetool.defaults.trustExitCode git reset --hard master /dev/null 21 ' This one is good; with test_config, you do not have to do the unsetting yourself. -- To unsubscribe from this list: send the line
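The property that makes test_config preferable — configuration is unset even when the test fails partway — can be sketched without the real test-lib (every helper below is a simplified, illustrative stand-in):

```shell
# test_when_finished queues cleanup commands; test_config sets a value
# and queues its own removal, so a mid-test failure cannot leak the
# setting into later tests.  Config is modelled as cfg_* variables.
config_cleanup=
test_when_finished () { config_cleanup="$*; $config_cleanup"; }
test_config () { test_when_finished "unset_cfg $1"; set_cfg "$1" "$2"; }
set_cfg () { eval "cfg_$1=\$2"; }
unset_cfg () { eval "unset cfg_$1"; }

test_config autocrlf true
echo "during: $cfg_autocrlf"
eval "$config_cleanup"      # the test harness runs this when the test ends
echo "after: ${cfg_autocrlf:-unset}"
```

A bare `git config core.autocrlf true` followed by a manual `git config core.autocrlf false` has no such guarantee: if the test aborts between the two, the setting survives.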
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
Marc Branchaud marcn...@xiplink.com writes: On 14-10-15 01:29 PM, Junio C Hamano wrote: $ git clone \ --reference=/local/pool/linux.git \ --borrow=../my/neighbour/linux-hack.git \ git://git.kernel.org/./linux.git With do the usual --reference thing, but then dissociate the result from referents option, there is no ambiguity and that is why I did not go with the --borrow option suggested in the original thread. I had not considered this case. My limited imagination has a hard time coming up with a scenario where more than one --reference (or In this example, the --borrow seems useless. How would clone decide that it even needed objects from the neighbour repo? None of the refs on gko need any of the neighbour's unique objects. A probable scenario might go like this. The company-wide pool is designed for everybody's use and will stay, even if it lags behind because it fetches every other day, so it is safe to keep referring to via alternates. My neighbour is following the linux-next repository and has changes that are meant to land in the future to the mainline, but it can disappear without notice so I cannot afford to depend on its presense forever. Under that particular scenario, what should happen is fairly clear; we want to dissociate from neibour's immediately after clone is done, while being still dependent on the shared pool. But there is the question of how would you implement such a behaviour (even if you know that is the single only behaviour you would want to see). Also I am not confident enough that it is the only plausible way any user may want to mix reference and borrow together. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
Johannes Sixt j...@kdbg.org writes:

> Unless you have a secret plan, you can do it even shorter with our
> helpers:

Thanks. No there isn't a secret plan. It was just me being rusty.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 8649090..81efb07 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -743,14 +743,9 @@ static void write_refspec_config(const char *src_ref_prefix,
>  static void dissociate_from_references(void)
>  {
> -	struct child_process cmd;
> -
> -	memset(&cmd, 0, sizeof(cmd));
> -	argv_array_pushl(&cmd.args, "repack", "-a", "-d", NULL);
> -	cmd.git_cmd = 1;
> -	cmd.out = -1;
> -	cmd.no_stdin = 1;
> -	if (run_command(&cmd))
> +	static const char* argv[] = { "repack", "-a", "-d", NULL };
> +
> +	if (run_command_v_opt(argv, RUN_GIT_CMD|RUN_COMMAND_NO_STDIN))
>  		die(_("cannot repack to clean up"));
>  	if (unlink(git_path("objects/info/alternates")) && errno != ENOENT)
>  		die_errno(_("cannot unlink temporary alternates file"));
Re: [PATCH v23 0/25] rs/ref-transaction (Use ref transactions, part 3)
Thanks; queued. Hopefully we can merge to 'next' and go incremental.
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
On 14-10-15 05:33 PM, Junio C Hamano wrote: Marc Branchaud marcn...@xiplink.com writes: On 14-10-15 01:29 PM, Junio C Hamano wrote: $ git clone \ --reference=/local/pool/linux.git \ --borrow=../my/neighbour/linux-hack.git \ git://git.kernel.org/./linux.git With do the usual --reference thing, but then dissociate the result from referents option, there is no ambiguity and that is why I did not go with the --borrow option suggested in the original thread. I had not considered this case. My limited imagination has a hard time coming up with a scenario where more than one --reference (or In this example, the --borrow seems useless. How would clone decide that it even needed objects from the neighbour repo? None of the refs on gko need any of the neighbour's unique objects. A probable scenario might go like this. The company-wide pool is designed for everybody's use and will stay, even if it lags behind because it fetches every other day, so it is safe to keep referring to via alternates. My neighbour is following the linux-next repository and has changes that are meant to land in the future to the mainline, but it can disappear without notice so I cannot afford to depend on its presense forever. Under that particular scenario, what should happen is fairly clear; we want to dissociate from neibour's immediately after clone is done, while being still dependent on the shared pool. Yes, but we're cloning gko, not the neighbour. Doesn't that mean that the clone operation won't know about any of the neighbour's refs? In order to get any of the neighbour's refs (and its unique objects) you have to either clone the neighbour directly or (post-clone) fetch from it, no? M. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clone: --dissociate option to mark that reference is only temporary
Marc Branchaud marcn...@xiplink.com writes:

> Yes, but we're cloning gko, not the neighbour.  Doesn't that mean that
> the clone operation won't know about any of the neighbour's refs?

No. --reference (and a natural implementation of --borrow, I would
imagine) peeks the refs of the repository we borrow from, and that is
how clone can say "I already have objects reachable from these refs, so
please send me the remainder" to the repository it is cloning from.
[PATCH v2 0/25] prune-safety
Here's a re-roll of the patch series I posted earlier to make git prune keep more contiguous chunks of the object graph. The cleanups to t5304 were spun off into their own series, and are dropped here. However, the other patches seem to have multiplied in number (I must have fed them after midnight). Here are the changes since the first round (thanks everybody for your comments): - fix bogus return values from freshen_file, foreach_alt_odb, and for_each_packed_object - make for_each_object_in_pack static - clarify commit message for keep objects reachable from recent objects patch (this was the one that confused Junio, and I elaborated based on our discussion) - clarify the definition of loose object dirs in the comment above for_each_loose_file_in_object_dir - in for_each_loose_file, traverse hashed loose object directories in numeric order, and pass the number to the subdir callback (this is used by prune-packed for its progress updates); as a side effect, this fixes the bugs Michael noticed with the subdir callback. - prune-packed now reuses the for_each_loose_file interface - use revs-ignore_missing_links so we don't barf on already-missing unreferenced objects - convert reachable.c to use traverse_commit_list instead of its own custom walk; this gives support for ignore_missing_links above, and saves us a fair bit of code. - while in the area, I noticed that reachable.c's reflog handling is the same as rev-list's --reflog option; it now builds on what's in revision.c. That takes us up to patch 17. While working in reachable.c, I noticed an oddity: we consider objects in the index to be reachable during prune (which is good), but we do not when dropping them during a repack that uses --unpack-unreachable=expiration. The remaining patches fix that, which needed a fair bit of preparatory cleanup. 
I'm really beginning to question whether the just drop objects that are about to be pruned optimization done in 7e52f56 (gc: do not explode objects which will be immediately pruned, 2012-04-07). It really complicates things as pack-objects and prune need to have the exact same rules (and implementing it naively, by having pack-objects run the same code as prune, is not desirable because pack-objects has _already_ done a complete expensive traversal to generate the packing list). And I fear it will get even worse if we implement some of the race-condition fixes that Michael suggested earlier. On the other hand, the loosening behavior without 7e52f56 has some severe pathological cases. A repository which has had a chunk of history deleted can easily increase in size several orders of magnitude due to loosening (since we lose the benefit of all deltas in the loosened objects). Finally, there are a few things that were discussed that I didn't address/fix. I don't think any of them is a critical blocker, but I did want to summarize the state: - when refreshing, we may update a pack's mtime multiple times. It probably wouldn't be too hard to cache this and only update once per program run, but I also don't think it's that big a deal in practice. - We will munge mtimes of objects found in alternates. If we don't have write access to the alternate, we'll create a local duplicate of the object. This is the safer thing, but I'm not sure if there are cases where we might try to write out a large number of objects which exist in an alternate (OTOH, we will eventually drop them at the next repack). - I didn't implement the sort by inode trick that fsck does when traversing the loose objects. It wouldn't be too hard, but I'm not convinced it's actually important. - I didn't convert fsck to the for_each_loose_file interface (mostly because I didn't do the inode-sorting trick, and while I don't think it matters, I didn't go to the work to show that it _doesn't_). 
Here are the patches:

  [01/25]: foreach_alt_odb: propagate return value from callback
  [02/25]: isxdigit: cast input to unsigned char
  [03/25]: object_array: factor out slopbuf-freeing logic
  [04/25]: object_array: add a clear function
  [05/25]: clean up name allocation in prepare_revision_walk
  [06/25]: reachable: use traverse_commit_list instead of custom walk
  [07/25]: reachable: reuse revision.c "add all reflogs" code
  [08/25]: prune: factor out loose-object directory traversal
  [09/25]: reachable: mark index blobs as SEEN
  [10/25]: prune-packed: use for_each_loose_file_in_objdir
  [11/25]: count-objects: do not use xsize_t when counting object size
  [12/25]: count-objects: use for_each_loose_file_in_objdir
  [13/25]: sha1_file: add for_each iterators for loose and packed objects
  [14/25]: prune: keep objects reachable from recent objects
  [15/25]: pack-objects: refactor unpack-unreachable expiration check
  [16/25]: pack-objects: match prune logic for discarding objects
  [17/25]: write_sha1_file: freshen
[PATCH v2 01/25] foreach_alt_odb: propagate return value from callback
We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions).

Signed-off-by: Jeff King p...@peff.net
---
 cache.h     |  2 +-
 sha1_file.c | 12 ++++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/cache.h b/cache.h
index 5b86065..13fadb6 100644
--- a/cache.h
+++ b/cache.h
@@ -1125,7 +1125,7 @@ extern void prepare_alt_odb(void);
 extern void read_info_alternates(const char * relative_base, int depth);
 extern void add_to_alternates_file(const char *reference);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
-extern void foreach_alt_odb(alt_odb_fn, void*);
+extern int foreach_alt_odb(alt_odb_fn, void*);
 
 struct pack_window {
 	struct pack_window *next;
diff --git a/sha1_file.c b/sha1_file.c
index 83f77f0..fa881bf 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -413,14 +413,18 @@ void add_to_alternates_file(const char *reference)
 	link_alt_odb_entries(alt, strlen(alt), '\n', NULL, 0);
 }
 
-void foreach_alt_odb(alt_odb_fn fn, void *cb)
+int foreach_alt_odb(alt_odb_fn fn, void *cb)
 {
 	struct alternate_object_database *ent;
+	int r = 0;
 
 	prepare_alt_odb();
-	for (ent = alt_odb_list; ent; ent = ent->next)
-		if (fn(ent, cb))
-			return;
+	for (ent = alt_odb_list; ent; ent = ent->next) {
+		r = fn(ent, cb);
+		if (r)
+			break;
+	}
+	return r;
 }
 
 void prepare_alt_odb(void)
-- 
2.1.2.596.g7379948
--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 02/25] isxdigit: cast input to unsigned char
Otherwise, callers must do so or risk triggering warnings with -Wchar-subscripts (and rightfully so; a signed char might cause us to use a bogus negative index into the hexval_table). While we are dropping the now-unnecessary casts from the caller in urlmatch.c, we can get rid of similar casts in actually parsing the hex by using the hexval() helper, which implicitly casts to unsigned (but note that we cannot implement isxdigit in terms of hexval(), as it also casts its return value to unsigned).

Signed-off-by: Jeff King p...@peff.net
---
The patch that added more calls to isxdigit later in the series actually got reworked. So this is purely a cleanup and can be dropped if need be, though I still think it is an improvement.

 git-compat-util.h | 2 +-
 urlmatch.c        | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index fb41118..44890d5 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -677,7 +677,7 @@ extern const unsigned char sane_ctype[256];
 #define iscntrl(x) (sane_istest(x,GIT_CNTRL))
 #define ispunct(x) sane_istest(x, GIT_PUNCT | GIT_REGEX_SPECIAL | \
 	GIT_GLOB_SPECIAL | GIT_PATHSPEC_MAGIC)
-#define isxdigit(x) (hexval_table[x] != -1)
+#define isxdigit(x) (hexval_table[(unsigned char)(x)] != -1)
 #define tolower(x) sane_case((unsigned char)(x), 0x20)
 #define toupper(x) sane_case((unsigned char)(x), 0)
 #define is_pathspec_magic(x) sane_istest(x,GIT_PATHSPEC_MAGIC)
diff --git a/urlmatch.c b/urlmatch.c
index 3d4c54b..618d216 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -43,11 +43,11 @@ static int append_normalized_escapes(struct strbuf *buf,
 			from_len--;
 			if (ch == '%') {
 				if (from_len < 2 ||
-				    !isxdigit((unsigned char)from[0]) ||
-				    !isxdigit((unsigned char)from[1]))
+				    !isxdigit(from[0]) ||
+				    !isxdigit(from[1]))
 					return 0;
-				ch = hexval_table[(unsigned char)*from++] << 4;
-				ch |= hexval_table[(unsigned char)*from++];
+				ch = hexval(*from++) << 4;
+				ch |= hexval(*from++);
 				from_len -= 2;
 				was_esc = 1;
 			}
-- 
2.1.2.596.g7379948
[PATCH v2 03/25] object_array: factor out slopbuf-freeing logic
This is not a lot of code, but it's a logical construct that should not need to be repeated (and we are about to add a third repetition). Signed-off-by: Jeff King p...@peff.net --- object.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/object.c b/object.c index ca9d790..60f4864 100644 --- a/object.c +++ b/object.c @@ -355,6 +355,16 @@ void add_object_array_with_context(struct object *obj, const char *name, struct add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); } +/* + * Free all memory associated with an entry; the result is + * in an unspecified state and should not be examined. + */ +static void object_array_release_entry(struct object_array_entry *ent) +{ + if (ent-name != object_array_slopbuf) + free(ent-name); +} + void object_array_filter(struct object_array *array, object_array_each_func_t want, void *cb_data) { @@ -367,8 +377,7 @@ void object_array_filter(struct object_array *array, objects[dst] = objects[src]; dst++; } else { - if (objects[src].name != object_array_slopbuf) - free(objects[src].name); + object_array_release_entry(objects[src]); } } array-nr = dst; @@ -400,8 +409,7 @@ void object_array_remove_duplicates(struct object_array *array) objects[array-nr] = objects[src]; array-nr++; } else { - if (objects[src].name != object_array_slopbuf) - free(objects[src].name); + object_array_release_entry(objects[src]); } } } -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 04/25] object_array: add a clear function
There's currently no easy way to free the memory associated with an object_array (and in most cases, we simply leak the memory in a rev_info's pending array). Let's provide a helper to make this easier to handle. We can make use of it in list-objects.c, which does the same thing by hand (but fails to free the name field of each entry, potentially leaking memory). Signed-off-by: Jeff King p...@peff.net --- list-objects.c | 7 +-- object.c | 10 ++ object.h | 6 ++ 3 files changed, 17 insertions(+), 6 deletions(-) diff --git a/list-objects.c b/list-objects.c index 3595ee7..fad6808 100644 --- a/list-objects.c +++ b/list-objects.c @@ -228,11 +228,6 @@ void traverse_commit_list(struct rev_info *revs, die(unknown pending object %s (%s), sha1_to_hex(obj-sha1), name); } - if (revs-pending.nr) { - free(revs-pending.objects); - revs-pending.nr = 0; - revs-pending.alloc = 0; - revs-pending.objects = NULL; - } + object_array_clear(revs-pending); strbuf_release(base); } diff --git a/object.c b/object.c index 60f4864..6aeb1bb 100644 --- a/object.c +++ b/object.c @@ -383,6 +383,16 @@ void object_array_filter(struct object_array *array, array-nr = dst; } +void object_array_clear(struct object_array *array) +{ + int i; + for (i = 0; i array-nr; i++) + object_array_release_entry(array-objects[i]); + free(array-objects); + array-objects = NULL; + array-nr = array-alloc = 0; +} + /* * Return true iff array already contains an entry with name. */ diff --git a/object.h b/object.h index e028ced..2a755a2 100644 --- a/object.h +++ b/object.h @@ -133,6 +133,12 @@ void object_array_filter(struct object_array *array, */ void object_array_remove_duplicates(struct object_array *array); +/* + * Remove any objects from the array, freeing all used memory; afterwards + * the array is ready to store more objects with add_object_array(). 
+ */ +void object_array_clear(struct object_array *array); + void clear_object_flags(unsigned flags); #endif /* OBJECT_H */ -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 05/25] clean up name allocation in prepare_revision_walk
When we enter prepare_revision_walk, we have zero or more entries in our pending array. We disconnect that array from the rev_info, and then process each entry: 1. If the entry is a commit and the --source option is in effect, we keep a pointer to the object name. 2. Otherwise, we re-add the item to the pending list with a blank name. We then throw away the old array by freeing the array itself, but do not touch the name field of each entry. For any items of type (2), we leak the memory associated with the name. This commit fixes that by calling object_array_clear, which handles the cleanup for us. That breaks (1), though, because it depends on the memory pointed to by the name to last forever. We can solve that by making a copy of the name. This is slightly less efficient, but it shouldn't matter in practice, as we do it only for the tip commits of the traversal. Signed-off-by: Jeff King p...@peff.net --- revision.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/revision.c b/revision.c index e498b7c..01cc276 100644 --- a/revision.c +++ b/revision.c @@ -300,7 +300,7 @@ static struct commit *handle_commit(struct rev_info *revs, revs-limited = 1; } if (revs-show_source !commit-util) - commit-util = (void *) name; + commit-util = xstrdup(name); return commit; } @@ -2656,15 +2656,16 @@ void reset_revision_walk(void) int prepare_revision_walk(struct rev_info *revs) { - int nr = revs-pending.nr; - struct object_array_entry *e, *list; + int i; + struct object_array old_pending; struct commit_list **next = revs-commits; - e = list = revs-pending.objects; + memcpy(old_pending, revs-pending, sizeof(old_pending)); revs-pending.nr = 0; revs-pending.alloc = 0; revs-pending.objects = NULL; - while (--nr = 0) { + for (i = 0; i old_pending.nr; i++) { + struct object_array_entry *e = old_pending.objects + i; struct commit *commit = handle_commit(revs, e-item, e-name); if (commit) { if (!(commit-object.flags SEEN)) { @@ -2672,10 +2673,9 @@ int 
prepare_revision_walk(struct rev_info *revs) next = commit_list_append(commit, next); } } - e++; } if (!revs-leak_pending) - free(list); + object_array_clear(old_pending); /* Signal whether we need per-parent treesame decoration */ if (revs-simplify_merges || -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 06/25] reachable: use traverse_commit_list instead of custom walk
To find the set of reachable objects, we add a bunch of possible sources to our rev_info, call prepare_revision_walk, and then launch into a custom walker that handles each object top. This is a subset of what traverse_commit_list does, so we can just reuse that code (it can also handle more complex cases like UNINTERESTING commits and pathspecs, but we don't use those features). Signed-off-by: Jeff King p...@peff.net --- I was concerned this would be slower because traverse_commit_list is more featureful. To my surprise, it was consistently about 3-4% faster! The major difference is that traverse_commit_list will hit all of the commits first, and then the trees. For reachability that doesn't matter either way, but I suspect the new way has slightly better cache locality, leading to the minor speedup. reachable.c | 130 1 file changed, 17 insertions(+), 113 deletions(-) diff --git a/reachable.c b/reachable.c index 6f6835b..02bf6c2 100644 --- a/reachable.c +++ b/reachable.c @@ -8,6 +8,7 @@ #include reachable.h #include cache-tree.h #include progress.h +#include list-objects.h struct connectivity_progress { struct progress *progress; @@ -21,118 +22,6 @@ static void update_progress(struct connectivity_progress *cp) display_progress(cp-progress, cp-count); } -static void process_blob(struct blob *blob, -struct object_array *p, -struct name_path *path, -const char *name, -struct connectivity_progress *cp) -{ - struct object *obj = blob-object; - - if (!blob) - die(bad blob object); - if (obj-flags SEEN) - return; - obj-flags |= SEEN; - update_progress(cp); - /* Nothing to do, really .. The blob lookup was the important part */ -} - -static void process_gitlink(const unsigned char *sha1, - struct object_array *p, - struct name_path *path, - const char *name) -{ - /* I don't think we want to recurse into this, really. 
*/ -} - -static void process_tree(struct tree *tree, -struct object_array *p, -struct name_path *path, -const char *name, -struct connectivity_progress *cp) -{ - struct object *obj = tree-object; - struct tree_desc desc; - struct name_entry entry; - struct name_path me; - - if (!tree) - die(bad tree object); - if (obj-flags SEEN) - return; - obj-flags |= SEEN; - update_progress(cp); - if (parse_tree(tree) 0) - die(bad tree object %s, sha1_to_hex(obj-sha1)); - add_object(obj, p, path, name); - me.up = path; - me.elem = name; - me.elem_len = strlen(name); - - init_tree_desc(desc, tree-buffer, tree-size); - - while (tree_entry(desc, entry)) { - if (S_ISDIR(entry.mode)) - process_tree(lookup_tree(entry.sha1), p, me, entry.path, cp); - else if (S_ISGITLINK(entry.mode)) - process_gitlink(entry.sha1, p, me, entry.path); - else - process_blob(lookup_blob(entry.sha1), p, me, entry.path, cp); - } - free_tree_buffer(tree); -} - -static void process_tag(struct tag *tag, struct object_array *p, - const char *name, struct connectivity_progress *cp) -{ - struct object *obj = tag-object; - - if (obj-flags SEEN) - return; - obj-flags |= SEEN; - update_progress(cp); - - if (parse_tag(tag) 0) - die(bad tag object %s, sha1_to_hex(obj-sha1)); - if (tag-tagged) - add_object(tag-tagged, p, NULL, name); -} - -static void walk_commit_list(struct rev_info *revs, -struct connectivity_progress *cp) -{ - int i; - struct commit *commit; - struct object_array objects = OBJECT_ARRAY_INIT; - - /* Walk all commits, process their trees */ - while ((commit = get_revision(revs)) != NULL) { - process_tree(commit-tree, objects, NULL, , cp); - update_progress(cp); - } - - /* Then walk all the pending objects, recursively processing them too */ - for (i = 0; i revs-pending.nr; i++) { - struct object_array_entry *pending = revs-pending.objects + i; - struct object *obj = pending-item; - const char *name = pending-name; - if (obj-type == OBJ_TAG) { - process_tag((struct tag *) obj, objects, name, cp); - 
continue; - } - if (obj-type == OBJ_TREE) { - process_tree((struct tree *)obj, objects, NULL, name, cp);
[PATCH v2 07/25] reachable: reuse revision.c add all reflogs code
We want to add all reflog entries as tips for finding reachable objects. The revision machinery can already do this (to support rev-list --reflog); we can reuse that code. Signed-off-by: Jeff King p...@peff.net --- This one is not strictly necessary, but it seems like a nice cleanup. Note that the big difference in the revision.c code is that it will print a warning for broken reflogs, but I think that's fine (and perhaps even desirable) here. reachable.c | 24 +--- revision.c | 4 ++-- revision.h | 1 + 3 files changed, 4 insertions(+), 25 deletions(-) diff --git a/reachable.c b/reachable.c index 02bf6c2..4e68cfa 100644 --- a/reachable.c +++ b/reachable.c @@ -22,22 +22,6 @@ static void update_progress(struct connectivity_progress *cp) display_progress(cp-progress, cp-count); } -static int add_one_reflog_ent(unsigned char *osha1, unsigned char *nsha1, - const char *email, unsigned long timestamp, int tz, - const char *message, void *cb_data) -{ - struct object *object; - struct rev_info *revs = (struct rev_info *)cb_data; - - object = parse_object(osha1); - if (object) - add_pending_object(revs, object, ); - object = parse_object(nsha1); - if (object) - add_pending_object(revs, object, ); - return 0; -} - static int add_one_ref(const char *path, const unsigned char *sha1, int flag, void *cb_data) { struct object *object = parse_object_or_die(sha1, path); @@ -48,12 +32,6 @@ static int add_one_ref(const char *path, const unsigned char *sha1, int flag, vo return 0; } -static int add_one_reflog(const char *path, const unsigned char *sha1, int flag, void *cb_data) -{ - for_each_reflog_ent(path, add_one_reflog_ent, cb_data); - return 0; -} - static void add_one_tree(const unsigned char *sha1, struct rev_info *revs) { struct tree *tree = lookup_tree(sha1); @@ -138,7 +116,7 @@ void mark_reachable_objects(struct rev_info *revs, int mark_reflog, /* Add all reflog info */ if (mark_reflog) - for_each_reflog(add_one_reflog, revs); + add_reflogs_to_pending(revs, 0); cp.progress = 
progress; cp.count = 0; diff --git a/revision.c b/revision.c index 01cc276..b8e02e2 100644 --- a/revision.c +++ b/revision.c @@ -1275,7 +1275,7 @@ static int handle_one_reflog(const char *path, const unsigned char *sha1, int fl return 0; } -static void handle_reflog(struct rev_info *revs, unsigned flags) +void add_reflogs_to_pending(struct rev_info *revs, unsigned flags) { struct all_refs_cb cb; cb.all_revs = revs; @@ -2061,7 +2061,7 @@ static int handle_revision_pseudo_opt(const char *submodule, for_each_glob_ref_in(handle_one_ref, arg + 10, refs/remotes/, cb); clear_ref_exclusion(revs-ref_excludes); } else if (!strcmp(arg, --reflog)) { - handle_reflog(revs, *flags); + add_reflogs_to_pending(revs, *flags); } else if (!strcmp(arg, --not)) { *flags ^= UNINTERESTING | BOTTOM; } else if (!strcmp(arg, --no-walk)) { diff --git a/revision.h b/revision.h index a620530..e644044 100644 --- a/revision.h +++ b/revision.h @@ -276,6 +276,7 @@ extern void add_pending_sha1(struct rev_info *revs, unsigned int flags); extern void add_head_to_pending(struct rev_info *); +extern void add_reflogs_to_pending(struct rev_info *, unsigned int flags); enum commit_action { commit_ignore, -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 08/25] prune: factor out loose-object directory traversal
Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form [0-9a-f]{2}/.{38} that did _not_ contain all hex digits. It executed a break from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King p...@peff.net --- builtin/prune.c | 87 + cache.h | 33 ++ sha1_file.c | 84 +++ 3 files changed, 143 insertions(+), 61 deletions(-) diff --git a/builtin/prune.c b/builtin/prune.c index 144a3bd..763f53e 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -31,11 +31,23 @@ static int prune_tmp_file(const char *fullpath) return 0; } -static int prune_object(const char *fullpath, const unsigned char *sha1) +static int prune_object(const unsigned char *sha1, const char *fullpath, + void *data) { struct stat st; - if (lstat(fullpath, st)) - return error(Could not stat '%s', fullpath); + + /* +* Do we know about this object? 
+* It must have been reachable +*/ + if (lookup_object(sha1)) + return 0; + + if (lstat(fullpath, st)) { + /* report errors, but do not stop pruning */ + error(Could not stat '%s', fullpath); + return 0; + } if (st.st_mtime expire) return 0; if (show_only || verbose) { @@ -48,68 +60,20 @@ static int prune_object(const char *fullpath, const unsigned char *sha1) return 0; } -static int prune_dir(int i, struct strbuf *path) +static int prune_cruft(const char *basename, const char *path, void *data) { - size_t baselen = path-len; - DIR *dir = opendir(path-buf); - struct dirent *de; - - if (!dir) - return 0; - - while ((de = readdir(dir)) != NULL) { - char name[100]; - unsigned char sha1[20]; - - if (is_dot_or_dotdot(de-d_name)) - continue; - if (strlen(de-d_name) == 38) { - sprintf(name, %02x, i); - memcpy(name+2, de-d_name, 39); - if (get_sha1_hex(name, sha1) 0) - break; - - /* -* Do we know about this object? -* It must have been reachable -*/ - if (lookup_object(sha1)) - continue; - - strbuf_addf(path, /%s, de-d_name); - prune_object(path-buf, sha1); - strbuf_setlen(path, baselen); - continue; - } - if (starts_with(de-d_name, tmp_obj_)) { - strbuf_addf(path, /%s, de-d_name); - prune_tmp_file(path-buf); - strbuf_setlen(path, baselen); - continue; - } - fprintf(stderr, bad sha1 file: %s/%s\n, path-buf, de-d_name); - } - closedir(dir); - if (!show_only) - rmdir(path-buf); + if (starts_with(basename, tmp_obj_)) + prune_tmp_file(path); + else + fprintf(stderr, bad sha1 file: %s\n, path); return 0; } -static void prune_object_dir(const char *path) +static int prune_subdir(int nr, const char *path, void *data) { - struct strbuf buf = STRBUF_INIT; - size_t baselen; - int i; - - strbuf_addstr(buf, path); - strbuf_addch(buf, '/'); - baselen = buf.len; - - for (i = 0; i 256; i++) { - strbuf_addf(buf, %02x, i); - prune_dir(i, buf); - strbuf_setlen(buf, baselen); - } + if (!show_only) + rmdir(path); + return 0; } /* @@ -173,7 +137,8 @@ int cmd_prune(int argc, const char **argv, 
const char *prefix) mark_reachable_objects(revs, 1, progress); stop_progress(progress); - prune_object_dir(get_object_directory()); + for_each_loose_file_in_objdir(get_object_directory(), prune_object, + prune_cruft, prune_subdir, NULL);
[PATCH v2 11/25] count-objects: do not use xsize_t when counting object size
The point of xsize_t is to safely cast an off_t into a size_t (because we are about to mmap). But in count-objects, we are summing the sizes in an off_t. Using xsize_t means that count-objects could fail on a 32-bit system with a 4G object (not likely, as other parts of git would fail, but we should at least be correct here).

Signed-off-by: Jeff King p...@peff.net
---
 builtin/count-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a7f70cb..316a805 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -53,7 +53,7 @@ static void count_objects(DIR *d, char *path, int len, int verbose,
 		if (lstat(path, &st) || !S_ISREG(st.st_mode))
 			bad = 1;
 		else
-			(*loose_size) += xsize_t(on_disk_bytes(st));
+			(*loose_size) += on_disk_bytes(st);
 	}
 	if (bad) {
 		if (verbose) {
-- 
2.1.2.596.g7379948
[PATCH v2 10/25] prune-packed: use for_each_loose_file_in_objdir
This saves us from manually traversing the directory structure ourselves. Signed-off-by: Jeff King p...@peff.net --- builtin/prune-packed.c | 69 +- 1 file changed, 23 insertions(+), 46 deletions(-) diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c index d430731..f24a2c2 100644 --- a/builtin/prune-packed.c +++ b/builtin/prune-packed.c @@ -10,65 +10,42 @@ static const char * const prune_packed_usage[] = { static struct progress *progress; -static void prune_dir(int i, DIR *dir, struct strbuf *pathname, int opts) +static int prune_subdir(int nr, const char *path, void *data) { - struct dirent *de; - char hex[40]; - int top_len = pathname-len; + int *opts = data; + display_progress(progress, nr + 1); + if (!(*opts PRUNE_PACKED_DRY_RUN)) + rmdir(path); + return 0; +} + +static int prune_object(const unsigned char *sha1, const char *path, +void *data) +{ + int *opts = data; - sprintf(hex, %02x, i); - while ((de = readdir(dir)) != NULL) { - unsigned char sha1[20]; - if (strlen(de-d_name) != 38) - continue; - memcpy(hex + 2, de-d_name, 38); - if (get_sha1_hex(hex, sha1)) - continue; - if (!has_sha1_pack(sha1)) - continue; + if (!has_sha1_pack(sha1)) + return 0; - strbuf_add(pathname, de-d_name, 38); - if (opts PRUNE_PACKED_DRY_RUN) - printf(rm -f %s\n, pathname-buf); - else - unlink_or_warn(pathname-buf); - display_progress(progress, i + 1); - strbuf_setlen(pathname, top_len); - } + if (*opts PRUNE_PACKED_DRY_RUN) + printf(rm -f %s\n, path); + else + unlink_or_warn(path); + return 0; } void prune_packed_objects(int opts) { - int i; - const char *dir = get_object_directory(); - struct strbuf pathname = STRBUF_INIT; - int top_len; - - strbuf_addstr(pathname, dir); if (opts PRUNE_PACKED_VERBOSE) progress = start_progress_delay(_(Removing duplicate objects), 256, 95, 2); - if (pathname.len pathname.buf[pathname.len - 1] != '/') - strbuf_addch(pathname, '/'); - - top_len = pathname.len; - for (i = 0; i 256; i++) { - DIR *d; + 
for_each_loose_file_in_objdir(get_object_directory(), + prune_object, NULL, prune_subdir, opts); - display_progress(progress, i + 1); - strbuf_setlen(pathname, top_len); - strbuf_addf(pathname, %02x/, i); - d = opendir(pathname.buf); - if (!d) - continue; - prune_dir(i, d, pathname, opts); - closedir(d); - strbuf_setlen(pathname, top_len + 2); - rmdir(pathname.buf); - } + /* Ensure we show 100% before finishing progress */ + display_progress(progress, 256); stop_progress(progress); - strbuf_release(pathname); } int cmd_prune_packed(int argc, const char **argv, const char *prefix) -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 09/25] reachable: mark index blobs as SEEN
When we mark all reachable objects for pruning, that includes blobs mentioned by the index. However, we do not mark these with the SEEN flag, as we do for objects that we find by traversing (we also do not add them to the pending list, but that is because there is nothing further to traverse with them). This doesn't cause any problems with prune, because it checks only that the object exists in the global object hash, and not its flags. However, let's mark these objects to be consistent and avoid any later surprises.

Signed-off-by: Jeff King p...@peff.net
---
This code actually gets dropped later on in the series (when we teach the revision machinery to look at index objects), so this could technically be omitted. But it also keeps the minor behavior change here by itself, where it is explained, instead of as a side effect of the movement.

 reachable.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/reachable.c b/reachable.c
index 4e68cfa..d03f829 100644
--- a/reachable.c
+++ b/reachable.c
@@ -55,6 +55,8 @@ static void add_cache_refs(struct rev_info *revs)
 	read_cache();
 	for (i = 0; i < active_nr; i++) {
+		struct blob *blob;
+
 		/*
 		 * The index can contain blobs and GITLINKs, GITLINKs are hashes
 		 * that don't actually point to objects in the repository, it's
@@ -65,7 +67,10 @@ static void add_cache_refs(struct rev_info *revs)
 		if (S_ISGITLINK(active_cache[i]->ce_mode))
 			continue;
 
-		lookup_blob(active_cache[i]->sha1);
+		blob = lookup_blob(active_cache[i]->sha1);
+		if (blob)
+			blob->object.flags |= SEEN;
+
 		/*
 		 * We could add the blobs to the pending list, but quite
 		 * frankly, we don't care. Once we've looked them up, and
-- 
2.1.2.596.g7379948
[PATCH v2 13/25] sha1_file: add for_each iterators for loose and packed objects
We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King p...@peff.net --- cache.h | 11 +++ sha1_file.c | 62 + 2 files changed, 73 insertions(+) diff --git a/cache.h b/cache.h index 8ffefaa..d24dd32 100644 --- a/cache.h +++ b/cache.h @@ -1254,6 +1254,17 @@ int for_each_loose_file_in_objdir(const char *path, each_loose_subdir_fn subdir_cb, void *data); +/* + * Iterate over loose and packed objects in both the local + * repository and any alternates repositories. + */ +typedef int each_packed_object_fn(const unsigned char *sha1, + struct packed_git *pack, + uint32_t pos, + void *data); +extern int for_each_loose_object(each_loose_object_fn, void *); +extern int for_each_packed_object(each_packed_object_fn, void *); + struct object_info { /* Request */ enum object_type *typep; diff --git a/sha1_file.c b/sha1_file.c index a20240b..eb410a2 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3349,3 +3349,65 @@ int for_each_loose_file_in_objdir(const char *path, strbuf_release(buf); return r; } + +struct loose_alt_odb_data { + each_loose_object_fn *cb; + void *data; +}; + +static int loose_from_alt_odb(struct alternate_object_database *alt, + void *vdata) +{ + struct loose_alt_odb_data *data = vdata; + return for_each_loose_file_in_objdir(alt-base, +data-cb, NULL, NULL, +data-data); +} + +int for_each_loose_object(each_loose_object_fn cb, void *data) +{ + struct loose_alt_odb_data alt; + int r; + + r = for_each_loose_file_in_objdir(get_object_directory(), + cb, NULL, NULL, data); + if (r) + return r; + + alt.cb = cb; + alt.data = data; + return foreach_alt_odb(loose_from_alt_odb, alt); +} + +static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data) +{ + uint32_t i; + int r = 0; + + for (i = 0; i p-num_objects; i++) { + const unsigned char *sha1 = 
nth_packed_object_sha1(p, i); + + if (!sha1) + return error(unable to get sha1 of object %u in %s, +i, p-pack_name); + + r = cb(sha1, p, i, data); + if (r) + break; + } + return r; +} + +int for_each_packed_object(each_packed_object_fn cb, void *data) +{ + struct packed_git *p; + int r = 0; + + prepare_packed_git(); + for (p = packed_git; p; p = p-next) { + r = for_each_object_in_pack(p, cb, data); + if (r) + break; + } + return r; +} -- 2.1.2.596.g7379948 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 12/25] count-objects: use for_each_loose_file_in_objdir
This drops our line count considerably, and should make things more readable by keeping the counting logic separate from the traversal. Signed-off-by: Jeff King p...@peff.net --- builtin/count-objects.c | 101 ++-- 1 file changed, 30 insertions(+), 71 deletions(-) diff --git a/builtin/count-objects.c b/builtin/count-objects.c index 316a805..e47ef0b 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -11,6 +11,9 @@ static unsigned long garbage; static off_t size_garbage; +static int verbose; +static unsigned long loose, packed, packed_loose; +static off_t loose_size; static void real_report_garbage(const char *desc, const char *path) { @@ -21,61 +24,31 @@ static void real_report_garbage(const char *desc, const char *path) garbage++; } -static void count_objects(DIR *d, char *path, int len, int verbose, - unsigned long *loose, - off_t *loose_size, - unsigned long *packed_loose) +static void loose_garbage(const char *path) { - struct dirent *ent; - while ((ent = readdir(d)) != NULL) { - char hex[41]; - unsigned char sha1[20]; - const char *cp; - int bad = 0; + if (verbose) + report_garbage(garbage found, path); +} - if (is_dot_or_dotdot(ent-d_name)) - continue; - for (cp = ent-d_name; *cp; cp++) { - int ch = *cp; - if (('0' = ch ch = '9') || - ('a' = ch ch = 'f')) - continue; - bad = 1; - break; - } - if (cp - ent-d_name != 38) - bad = 1; - else { - struct stat st; - memcpy(path + len + 3, ent-d_name, 38); - path[len + 2] = '/'; - path[len + 41] = 0; - if (lstat(path, st) || !S_ISREG(st.st_mode)) - bad = 1; - else - (*loose_size) += on_disk_bytes(st); - } - if (bad) { - if (verbose) { - struct strbuf sb = STRBUF_INIT; - strbuf_addf(sb, %.*s/%s, - len + 2, path, ent-d_name); - report_garbage(garbage found, sb.buf); - strbuf_release(sb); - } - continue; - } - (*loose)++; - if (!verbose) - continue; - memcpy(hex, path+len, 2); - memcpy(hex+2, ent-d_name, 38); - hex[40] = 0; - if (get_sha1_hex(hex, sha1)) - die(internal error); - if (has_sha1_pack(sha1)) 
-			(*packed_loose)++;
+static int count_loose(const unsigned char *sha1, const char *path, void *data)
+{
+	struct stat st;
+
+	if (lstat(path, &st) || !S_ISREG(st.st_mode))
+		loose_garbage(path);
+	else {
+		loose_size += on_disk_bytes(st);
+		loose++;
+		if (verbose && has_sha1_pack(sha1))
+			packed_loose++;
 	}
+	return 0;
+}
+
+static int count_cruft(const char *basename, const char *path, void *data)
+{
+	loose_garbage(path);
+	return 0;
 }
 
 static char const * const count_objects_usage[] = {
@@ -85,12 +58,7 @@ static char const * const count_objects_usage[] = {
 
 int cmd_count_objects(int argc, const char **argv, const char *prefix)
 {
-	int i, verbose = 0, human_readable = 0;
-	const char *objdir = get_object_directory();
-	int len = strlen(objdir);
-	char *path = xmalloc(len + 50);
-	unsigned long loose = 0, packed = 0, packed_loose = 0;
-	off_t loose_size = 0;
+	int human_readable = 0;
 	struct option opts[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT_BOOL('H', "human-readable", &human_readable,
@@ -104,19 +72,10 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix)
 		usage_with_options(count_objects_usage, opts);
 	if (verbose)
 		report_garbage = real_report_garbage;
-	memcpy(path, objdir, len);
-	if (len && objdir[len-1] != '/')
-		path[len++] = '/';
-	for (i = 0; i < 256; i++) {
-		DIR *d;
-		sprintf(path + len, "%02x", i);
-		d = opendir(path);
-		if (!d)
-			continue;
-		count_objects(d, path, len, verbose,
-			      &loose, &loose_size, &packed_loose);
-
[PATCH v2 14/25] prune: keep objects reachable from recent objects
Our current strategy with prune is that an object falls into one of three categories: 1. Reachable (from ref tips, reflogs, index, etc). 2. Not reachable, but recent (based on the --expire time). 3. Not reachable and not recent. We keep objects from (1) and (2), but prune objects in (3). The point of (2) is that these objects may be part of an in-progress operation that has not yet updated any refs. However, it is not always the case that objects for an in-progress operation will have a recent mtime. For example, the object database may have an old copy of a blob (from an abandoned operation, a branch that was deleted, etc). If we create a new tree that points to it, a simultaneous prune will leave our tree, but delete the blob. Referencing that tree with a commit will then work (we check that the tree is in the object database, but not that all of its referred objects are), as will mentioning the commit in a ref. But the resulting repo is corrupt; we are missing the blob reachable from a ref. One way to solve this is to be more thorough when referencing a sha1: make sure that not only do we have that sha1, but that we have objects it refers to, and so forth recursively. The problem is that this is very expensive. Creating a parent link would require traversing the entire object graph! Instead, this patch pushes the extra work onto prune, which runs less frequently (and has to look at the whole object graph anyway). It creates a new category of objects: objects which are not recent, but which are reachable from a recent object. We do not prune these objects, just like the reachable and recent ones. This lets us avoid the recursive check above, because if we have an object, even if it is unreachable, we should have its referent. We can make a simple inductive argument that with this patch, this property holds (that there are no objects with missing referents in the repository): 0. 
When we have no objects, we have nothing to refer or be referred to, so the property holds. 1. If we add objects to the repository, their direct referents must generally exist (e.g., if you create a tree, the blobs it references must exist; if you create a commit to point at the tree, the tree must exist). This is already the case before this patch. And it is not 100% foolproof (you can make bogus objects using `git hash-object`, for example), but it should be the case for normal usage. Therefore for any sequence of object additions, the property will continue to hold. 2. If we remove objects from the repository, then we will not remove a child object (like a blob) if an object that refers to it is being kept. That is the part implemented by this patch. Note, however, that our reachability check and the actual pruning are not atomic. So it _is_ still possible to violate the property (e.g., an object becomes referenced just as we are deleting it). This patch is shooting for eliminating problems where the mtimes of dependent objects differ by hours or days, and one is dropped without the other. It does nothing to help with short races. Naively, the simplest way to implement this would be to add all recent objects as tips to the reachability traversal. However, this does not perform well. In a recently-packed repository, all reachable objects will also be recent, and therefore we have to look at each object twice. This patch instead performs the reachability traversal, then follows up with a second traversal for recent objects, skipping any that have already been marked. 
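The keep/drop rule described above can be modeled with a toy object graph. This is a hypothetical sketch (invented structs and recursion; git's real code walks commits and trees through the revision machinery), showing the two-pass marking: reachable objects first, then recent ones, with the second pass skipping anything already seen:

```c
/* Toy model of prune's keep/drop decision after this patch.
 * Names and structs are illustrative, not git's actual code. */
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 4

struct obj {
	struct obj *refs[MAX_REFS];	/* objects this one points to */
	int nrefs;
	long mtime;
	int seen;			/* marked => kept */
};

/* Recursively mark an object and everything it refers to. */
static void mark(struct obj *o)
{
	int i;
	if (!o || o->seen)
		return;
	o->seen = 1;
	for (i = 0; i < o->nrefs; i++)
		mark(o->refs[i]);
}

/* Two-pass marking: first from the ref tips, then from any unmarked
 * object whose mtime is newer than the expiration cutoff.  The second
 * pass skips already-seen objects, which is the optimization described
 * above for recently-packed repositories. */
static void mark_kept(struct obj **all, int n,
		      struct obj **tips, int ntips, long expire)
{
	int i;
	for (i = 0; i < ntips; i++)
		mark(tips[i]);
	for (i = 0; i < n; i++)
		if (!all[i]->seen && all[i]->mtime > expire)
			mark(all[i]);
}
```

An old blob pointed at by a recent (unreachable) tree survives; an equally old but unreferenced blob does not.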
Signed-off-by: Jeff King p...@peff.net --- builtin/prune.c| 2 +- builtin/reflog.c | 2 +- reachable.c| 112 + reachable.h| 3 +- t/t6501-freshen-objects.sh | 88 +++ 5 files changed, 204 insertions(+), 3 deletions(-) create mode 100755 t/t6501-freshen-objects.sh diff --git a/builtin/prune.c b/builtin/prune.c index 763f53e..04d3b12 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -135,7 +135,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix) if (show_progress) progress = start_progress_delay(_(Checking connectivity), 0, 0, 2); - mark_reachable_objects(revs, 1, progress); + mark_reachable_objects(revs, 1, expire, progress); stop_progress(progress); for_each_loose_file_in_objdir(get_object_directory(), prune_object, prune_cruft, prune_subdir, NULL); diff --git a/builtin/reflog.c b/builtin/reflog.c index b6388f7..2d85d26 100644 --- a/builtin/reflog.c +++ b/builtin/reflog.c @@ -649,7 +649,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix) init_revisions(cb.revs, prefix); if (cb.verbose) printf(Marking
[PATCH v2 16/25] pack-objects: match prune logic for discarding objects
A recent commit taught git-prune to keep non-recent objects that are reachable from recent ones. However, pack-objects, when loosening unreachable objects, tries to optimize out the write in the case that the object will be immediately pruned. It now gets this wrong, since its rule does not reflect the new prune code (and this can be seen by running t6501 with a strategically placed repack). Let's teach pack-objects similar logic. Signed-off-by: Jeff King p...@peff.net --- builtin/pack-objects.c | 39 +++ reachable.c| 4 +- reachable.h| 2 + t/t6501-freshen-objects.sh | 93 +++--- 4 files changed, 98 insertions(+), 40 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 2fe2ab0..4df9499 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -20,6 +20,8 @@ #include streaming.h #include thread-utils.h #include pack-bitmap.h +#include reachable.h +#include sha1-array.h static const char *pack_usage[] = { N_(git pack-objects --stdout [options...] [ ref-list | object-list]), @@ -2407,6 +2409,15 @@ static int has_sha1_pack_kept_or_nonlocal(const unsigned char *sha1) return 0; } +/* + * Store a list of sha1s that are should not be discarded + * because they are either written too recently, or are + * reachable from another object that was. + * + * This is filled by get_object_list. 
+ */ +static struct sha1_array recent_objects; + static int loosened_object_can_be_discarded(const unsigned char *sha1, unsigned long mtime) { @@ -2414,6 +2425,8 @@ static int loosened_object_can_be_discarded(const unsigned char *sha1, return 0; if (mtime unpack_unreachable_expiration) return 0; + if (sha1_array_lookup(recent_objects, sha1) = 0) + return 0; return 1; } @@ -2470,6 +2483,19 @@ static int get_object_list_from_bitmap(struct rev_info *revs) return 0; } +static void record_recent_object(struct object *obj, +const struct name_path *path, +const char *last, +void *data) +{ + sha1_array_append(recent_objects, obj-sha1); +} + +static void record_recent_commit(struct commit *commit, void *data) +{ + sha1_array_append(recent_objects, commit-object.sha1); +} + static void get_object_list(int ac, const char **av) { struct rev_info revs; @@ -2517,10 +2543,23 @@ static void get_object_list(int ac, const char **av) mark_edges_uninteresting(revs, show_edge); traverse_commit_list(revs, show_commit, show_object, NULL); + if (unpack_unreachable_expiration) { + revs.ignore_missing_links = 1; + if (add_unseen_recent_objects_to_traversal(revs, + unpack_unreachable_expiration)) + die(unable to add recent objects); + if (prepare_revision_walk(revs)) + die(revision walk setup failed); + traverse_commit_list(revs, record_recent_commit, +record_recent_object, NULL); + } + if (keep_unreachable) add_objects_in_unpacked_packs(revs); if (unpack_unreachable) loosen_unused_packed_objects(revs); + + sha1_array_clear(recent_objects); } static int option_parse_index_version(const struct option *opt, diff --git a/reachable.c b/reachable.c index 55589a0..0176a88 100644 --- a/reachable.c +++ b/reachable.c @@ -183,8 +183,8 @@ static int add_recent_packed(const unsigned char *sha1, return 0; } -static int add_unseen_recent_objects_to_traversal(struct rev_info *revs, - unsigned long timestamp) +int add_unseen_recent_objects_to_traversal(struct rev_info *revs, + unsigned long timestamp) { 
struct recent_data data; int r; diff --git a/reachable.h b/reachable.h index 141fe30..d23efc3 100644 --- a/reachable.h +++ b/reachable.h @@ -2,6 +2,8 @@ #define REACHEABLE_H struct progress; +extern int add_unseen_recent_objects_to_traversal(struct rev_info *revs, + unsigned long timestamp); extern void mark_reachable_objects(struct rev_info *revs, int mark_reflog, unsigned long mark_recent, struct progress *); diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh index de941c2..e25c47d 100755 --- a/t/t6501-freshen-objects.sh +++ b/t/t6501-freshen-objects.sh @@ -39,50 +39,67 @@ commit () { git commit -m $1 } -test_expect_success 'disable reflogs' ' - git config core.logallrefupdates false - rm -rf .git/logs -' +maybe_repack () { + if test -n $repack; then +
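The `sha1_array_lookup(&recent_objects, sha1) >= 0` test above is a membership check against a sorted array of object names, where a non-negative return means "found". A rough standalone model (binary search over fixed 20-byte keys; this is a sketch, not git's actual sha1-array implementation):

```c
/* Hypothetical model of a sorted sha1 list lookup.  Entries are
 * 20-byte binary sha1s laid out back to back in sorted order. */
#include <assert.h>
#include <string.h>

/* Return the index of sha1 among nr sorted 20-byte entries,
 * or -1 if absent (git's version returns a negative insertion
 * hint instead, hence the ">= 0" test in the patch). */
static int sha1_lookup(const unsigned char *entries, int nr,
		       const unsigned char *sha1)
{
	int lo = 0, hi = nr;
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int c = memcmp(entries + 20 * mid, sha1, 20);
		if (!c)
			return mid;
		if (c < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

Keeping the array sorted is what makes the per-object check during loosening cheap (O(log n) rather than a scan).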
[PATCH v2 17/25] write_sha1_file: freshen existing objects
When we try to write a loose object file, we first check whether that object already exists. If so, we skip the write as an optimization. However, this can interfere with prune's strategy of using mtimes to mark files in progress. For example, if a branch contains a particular tree object and is deleted, that tree object may become unreachable, and have an old mtime. If a new operation then tries to write the same tree, this ends up as a noop; we notice we already have the object and do nothing. A prune running simultaneously with this operation will see the object as old, and may delete it. We can solve this by freshening objects that we avoid writing by updating their mtime. The algorithm for doing so is essentially the same as that of has_sha1_file. Therefore we provide a new (static) interface check_and_freshen, which finds and optionally freshens the object. It's trivial to implement freshening and simple checking by tweaking a single parameter. Signed-off-by: Jeff King p...@peff.net --- sha1_file.c| 51 +++--- t/t6501-freshen-objects.sh | 27 2 files changed, 71 insertions(+), 7 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index eb410a2..d7f1838 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -443,27 +443,53 @@ void prepare_alt_odb(void) read_info_alternates(get_object_directory(), 0); } -static int has_loose_object_local(const unsigned char *sha1) +static int freshen_file(const char *fn) { - return !access(sha1_file_name(sha1), F_OK); + struct utimbuf t; + t.actime = t.modtime = time(NULL); + return !utime(fn, t); } -int has_loose_object_nonlocal(const unsigned char *sha1) +static int check_and_freshen_file(const char *fn, int freshen) +{ + if (access(fn, F_OK)) + return 0; + if (freshen freshen_file(fn)) + return 0; + return 1; +} + +static int check_and_freshen_local(const unsigned char *sha1, int freshen) +{ + return check_and_freshen_file(sha1_file_name(sha1), freshen); +} + +static int check_and_freshen_nonlocal(const unsigned char *sha1, int 
freshen) { struct alternate_object_database *alt; prepare_alt_odb(); for (alt = alt_odb_list; alt; alt = alt-next) { fill_sha1_path(alt-name, sha1); - if (!access(alt-base, F_OK)) + if (check_and_freshen_file(alt-base, freshen)) return 1; } return 0; } +static int check_and_freshen(const unsigned char *sha1, int freshen) +{ + return check_and_freshen_local(sha1, freshen) || + check_and_freshen_nonlocal(sha1, freshen); +} + +int has_loose_object_nonlocal(const unsigned char *sha1) +{ + return check_and_freshen_nonlocal(sha1, 0); +} + static int has_loose_object(const unsigned char *sha1) { - return has_loose_object_local(sha1) || - has_loose_object_nonlocal(sha1); + return check_and_freshen(sha1, 0); } static unsigned int pack_used_ctr; @@ -2966,6 +2992,17 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, return move_temp_to_file(tmp_file, filename); } +static int freshen_loose_object(const unsigned char *sha1) +{ + return check_and_freshen(sha1, 1); +} + +static int freshen_packed_object(const unsigned char *sha1) +{ + struct pack_entry e; + return find_pack_entry(sha1, e) freshen_file(e.p-pack_name); +} + int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) { unsigned char sha1[20]; @@ -2978,7 +3015,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign write_sha1_file_prepare(buf, len, type, sha1, hdr, hdrlen); if (returnsha1) hashcpy(returnsha1, sha1); - if (has_sha1_file(sha1)) + if (freshen_loose_object(sha1) || freshen_packed_object(sha1)) return 0; return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); } diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh index e25c47d..157f3f9 100755 --- a/t/t6501-freshen-objects.sh +++ b/t/t6501-freshen-objects.sh @@ -100,6 +100,33 @@ for repack in '' true; do test_expect_success repository passes fsck ($title) ' git fsck ' + + test_expect_success abandon objects again ($title) ' + git reset 
--hard HEAD^ + find .git/objects -type f | + xargs test-chmtime -v -86400 + ' + + test_expect_success start writing new commit with same tree ($title) ' + tree=$( + GIT_INDEX_FILE=index.tmp + export GIT_INDEX_FILE + git read-tree HEAD + add abandon + add unrelated + git write-tree + ) +
[PATCH v2 15/25] pack-objects: refactor unpack-unreachable expiration check
When we are loosening unreachable packed objects, we do not bother to process objects that would simply be pruned immediately anyway. The "would be pruned" check is a simple comparison, but is about to get more complicated. Let's pull it out into a separate function.

Note that this is slightly less efficient than the original, which avoided even opening old packs, since no object in them could pass the current check, which cares only about the pack mtime. But the new rules will depend on the exact object, so we need to perform the check even for old packs.

Note also that we fix a minor buglet when the pack mtime is exactly the same as the expiration time. The prune code considers that worth pruning, whereas our check here considered it worth keeping. This wasn't a big deal. Besides being unlikely to happen, the result was simply that the object was loosened and then pruned, missing the optimization. Still, we can easily fix it while we are here.

Signed-off-by: Jeff King <p...@peff.net>
---
 builtin/pack-objects.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d391934..2fe2ab0 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2407,6 +2407,16 @@ static int has_sha1_pack_kept_or_nonlocal(const unsigned char *sha1)
 	return 0;
 }
 
+static int loosened_object_can_be_discarded(const unsigned char *sha1,
+					    unsigned long mtime)
+{
+	if (!unpack_unreachable_expiration)
+		return 0;
+	if (mtime > unpack_unreachable_expiration)
+		return 0;
+	return 1;
+}
+
 static void loosen_unused_packed_objects(struct rev_info *revs)
 {
 	struct packed_git *p;
@@ -2417,17 +2427,14 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
 		if (!p->pack_local || p->pack_keep)
 			continue;
 
-		if (unpack_unreachable_expiration &&
-		    p->mtime < unpack_unreachable_expiration)
-			continue;
-
 		if (open_pack_index(p))
 			die("cannot open pack index");
 
 		for (i = 0; i < p->num_objects; i++) {
 			sha1 = nth_packed_object_sha1(p, i);
 			if (!packlist_find(&to_pack, sha1, NULL) &&
-			    !has_sha1_pack_kept_or_nonlocal(sha1))
+			    !has_sha1_pack_kept_or_nonlocal(sha1) &&
+			    !loosened_object_can_be_discarded(sha1, p->mtime))
 				if (force_object_loose(sha1, p->mtime))
 					die("unable to force loose object");
 		}
-- 
2.1.2.596.g7379948
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
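The boundary fix mentioned in the commit message is easy to exercise standalone. A rendering of the refactored check (the sha1 parameter is dropped here, since it is unused until the later recent-objects patch):

```c
/* Standalone model of the refactored "would be pruned anyway" check.
 * Returns 1 when a loosened object would be discardable immediately. */
#include <assert.h>

static unsigned long unpack_unreachable_expiration;

static int loosened_object_can_be_discarded(unsigned long mtime)
{
	if (!unpack_unreachable_expiration)
		return 0;	/* no expiration configured: keep everything */
	if (mtime > unpack_unreachable_expiration)
		return 0;	/* newer than the cutoff: keep */
	return 1;		/* at or older than the cutoff: discardable */
}
```

Note the boundary: an mtime exactly equal to the expiration time now counts as discardable, matching the prune code, which is the minor buglet the patch fixes.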
[PATCH v2 18/25] make add_object_array_with_context interface more sane
When you resolve a sha1, you can optionally keep any context found during the resolution, including the path and mode of a tree entry (e.g., when looking up HEAD:subdir/file.c). The add_object_array_with_context function lets you then attach that context to an entry in a list. Unfortunately, the interface for doing so is horrible. The object_context structure is large and most object_array users do not use it. Therefore we keep a pointer to the structure to avoid burdening other users too much. But that means when we do use it that we must allocate the struct ourselves. And the struct contains a fixed PATH_MAX-sized buffer, which makes this wholly unsuitable for any large arrays. We can observe that there is only a single user of the with_context variant: builtin/grep.c. And in that use case, the only element we care about is the path. We can therefore store only the path as a pointer (the context's mode field was redundant with the object_array_entry itself, and nobody actually cared about the surrounding tree). This still requires a strdup of the pathname, but at least we are only consuming the minimum amount of memory for each string. We can also handle the copying ourselves in add_object_array_*, and free it as appropriate in object_array_release_entry. Signed-off-by: Jeff King p...@peff.net --- builtin/grep.c | 8 object.c | 23 +-- object.h | 4 ++-- 3 files changed, 15 insertions(+), 20 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index c86a142..4063882 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -456,10 +456,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, } static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, - struct object *obj, const char *name, struct object_context *oc) + struct object *obj, const char *name, const char *path) { if (obj-type == OBJ_BLOB) - return grep_sha1(opt, obj-sha1, name, 0, oc ? 
oc-path : NULL); + return grep_sha1(opt, obj-sha1, name, 0, path); if (obj-type == OBJ_COMMIT || obj-type == OBJ_TREE) { struct tree_desc tree; void *data; @@ -501,7 +501,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i nr; i++) { struct object *real_obj; real_obj = deref_tag(list-objects[i].item, NULL, 0); - if (grep_object(opt, pathspec, real_obj, list-objects[i].name, list-objects[i].context)) { + if (grep_object(opt, pathspec, real_obj, list-objects[i].name, list-objects[i].path)) { hit = 1; if (opt-status_only) break; @@ -821,7 +821,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) struct object *object = parse_object_or_die(sha1, arg); if (!seen_dashdash) verify_non_filename(prefix, arg); - add_object_array_with_context(object, arg, list, xmemdupz(oc, sizeof(struct object_context))); + add_object_array_with_path(object, arg, list, oc.mode, oc.path); continue; } if (!strcmp(arg, --)) { diff --git a/object.c b/object.c index 6aeb1bb..df86bdd 100644 --- a/object.c +++ b/object.c @@ -307,10 +307,9 @@ int object_list_contains(struct object_list *list, struct object *obj) */ static char object_array_slopbuf[1]; -static void add_object_array_with_mode_context(struct object *obj, const char *name, - struct object_array *array, - unsigned mode, - struct object_context *context) +void add_object_array_with_path(struct object *obj, const char *name, + struct object_array *array, + unsigned mode, const char *path) { unsigned nr = array-nr; unsigned alloc = array-alloc; @@ -333,7 +332,10 @@ static void add_object_array_with_mode_context(struct object *obj, const char *n else entry-name = xstrdup(name); entry-mode = mode; - entry-context = context; + if (path) + entry-path = xstrdup(path); + else + entry-path = NULL; array-nr = ++nr; } @@ -344,15 +346,7 @@ void add_object_array(struct object *obj, const char *name, struct object_array void add_object_array_with_mode(struct object *obj, const char *name, 
struct object_array *array, unsigned mode) { - add_object_array_with_mode_context(obj, name, array, mode, NULL); -} - -void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context
[PATCH v2 19/25] traverse_commit_list: support pending blobs/trees with paths
When we call traverse_commit_list, we may have trees and blobs in the pending array. As we process these, we pass the name field from the pending entry as the path of the object within the tree (which then becomes the root path if we recurse into subtrees).

When we set up the traversal in prepare_revision_walk, though, the name field of any pending trees and blobs is likely to be the ref at which we found the object. We would not want to make this part of the path (e.g., doing so would make "git rev-list --objects v2.6.11-tree" in linux.git show paths like "v2.6.11-tree/Makefile", which is nonsensical). Therefore prepare_revision_walk sets the name field of each pending tree and blob to the empty string.

However, this leaves no room for a caller who does know the correct path of a pending object to propagate that information to the revision walker. We can fix this by making two related changes:

  1. Use the "path" field as the path instead of the "name" field in
     traverse_commit_list. If the path is not set, default to "" (which
     is what we always ended up with in the current code, because of
     prepare_revision_walk).

  2. In prepare_revision_walk, make a complete copy of the entry. This
     makes the path field available to the walker (if there is one),
     solving our problem. Leaving the name field intact is now OK, as we
     do not use it as a path due to point (1) above (and we can use it
     to make more meaningful error messages if we want). We also make
     the original mode field available to the walker, though it does not
     actually use it.

Note that we still re-add the pending objects and free the old ones (so we may strdup the path and name only to free the old ones). This could be made more efficient by simply copying the object_array entries that we are keeping. However, that would require more restructuring of the code, and is not done here.
Signed-off-by: Jeff King p...@peff.net --- list-objects.c | 7 +-- revision.c | 30 +++--- 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/list-objects.c b/list-objects.c index fad6808..2910bec 100644 --- a/list-objects.c +++ b/list-objects.c @@ -208,6 +208,7 @@ void traverse_commit_list(struct rev_info *revs, struct object_array_entry *pending = revs-pending.objects + i; struct object *obj = pending-item; const char *name = pending-name; + const char *path = pending-path; if (obj-flags (UNINTERESTING | SEEN)) continue; if (obj-type == OBJ_TAG) { @@ -215,14 +216,16 @@ void traverse_commit_list(struct rev_info *revs, show_object(obj, NULL, name, data); continue; } + if (!path) + path = ; if (obj-type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, -NULL, base, name, data); +NULL, base, path, data); continue; } if (obj-type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, -NULL, name, data); +NULL, path, data); continue; } die(unknown pending object %s (%s), diff --git a/revision.c b/revision.c index b8e02e2..9a0f99a 100644 --- a/revision.c +++ b/revision.c @@ -198,9 +198,10 @@ void mark_parents_uninteresting(struct commit *commit) } } -static void add_pending_object_with_mode(struct rev_info *revs, +static void add_pending_object_with_path(struct rev_info *revs, struct object *obj, -const char *name, unsigned mode) +const char *name, unsigned mode, +const char *path) { if (!obj) return; @@ -220,7 +221,20 @@ static void add_pending_object_with_mode(struct rev_info *revs, if (st) return; } - add_object_array_with_mode(obj, name, revs-pending, mode); + add_object_array_with_path(obj, name, revs-pending, mode, path); +} + +static void add_pending_object_with_mode(struct rev_info *revs, +struct object *obj, +const char *name, unsigned mode) +{ + add_pending_object_with_path(revs, obj, name, mode, NULL); +} + +static void add_pending_object_from_entry(struct rev_info *revs, + struct object_array_entry *e) +{ + 
add_pending_object_with_path(revs, e-item, e-name, e-mode, e-path); } void add_pending_object(struct rev_info *revs, @@ -265,8
[PATCH v2 20/25] rev-list: document --reflog option
This is mostly used internally, but it does not hurt to explain it.

Signed-off-by: Jeff King <p...@peff.net>
---
 Documentation/rev-list-options.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 5d311b8..4cf94c6 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -168,6 +168,10 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--reflog::
+	Pretend as if all objects mentioned by reflogs are listed on the
+	command line as `<commit>`.
+
 --ignore-missing::
 	Upon seeing an invalid object name in the input, pretend as if
 	the bad input was not given.
-- 
2.1.2.596.g7379948
[PATCH v2 21/25] rev-list: add --index-objects option
There is currently no easy way to ask the revision traversal machinery to include objects reachable from the index (e.g., blobs and trees that have not yet been committed). This patch adds an option to do so.

Signed-off-by: Jeff King <p...@peff.net>
---
I was tempted to call this just "--index", because I could not think of what else "--index" would mean in the context of rev-list. But I also worried about weird interactions with other commands that take revision arguments. And since this is mostly for internal use anyway, I figured the more verbose name is not too bad. I could be convinced otherwise, though.

 Documentation/rev-list-options.txt |  7 +++++
 revision.c                         | 51 ++++++++++++++++++++++++++++++
 revision.h                         |  1 +
 t/t6000-rev-list-misc.sh           | 23 ++++++++++++++
 4 files changed, 82 insertions(+)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 4cf94c6..03ab343 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -172,6 +172,13 @@ explicitly.
 	Pretend as if all objects mentioned by reflogs are listed on the
 	command line as `<commit>`.
 
+--index-objects::
+	Pretend as if all objects used by the index (any blobs, and any
+	trees which are mentioned by the index's cache-tree extension)
+	are listed on the command line. Note that you probably want to
+	use `--objects`, too, as there are by definition no commits in
+	the index.
+
 --ignore-missing::
 	Upon seeing an invalid object name in the input, pretend as if
 	the bad input was not given.
diff --git a/revision.c b/revision.c index 9a0f99a..caabaf1 100644 --- a/revision.c +++ b/revision.c @@ -17,6 +17,7 @@ #include mailmap.h #include commit-slab.h #include dir.h +#include cache-tree.h volatile show_early_output_fn_t show_early_output; @@ -1299,6 +1300,53 @@ void add_reflogs_to_pending(struct rev_info *revs, unsigned flags) for_each_reflog(handle_one_reflog, cb); } +static void add_cache_tree(struct cache_tree *it, struct rev_info *revs, + struct strbuf *path) +{ + size_t baselen = path-len; + int i; + + if (it-entry_count = 0) { + struct tree *tree = lookup_tree(it-sha1); + add_pending_object_with_path(revs, tree-object, , +04, path-buf); + } + + for (i = 0; i it-subtree_nr; i++) { + struct cache_tree_sub *sub = it-down[i]; + strbuf_addf(path, %s%s, baselen ? / : , sub-name); + add_cache_tree(sub-cache_tree, revs, path); + strbuf_setlen(path, baselen); + } + +} + +void add_index_objects_to_pending(struct rev_info *revs, unsigned flags) +{ + int i; + + read_cache(); + for (i = 0; i active_nr; i++) { + struct cache_entry *ce = active_cache[i]; + struct blob *blob; + + if (S_ISGITLINK(ce-ce_mode)) + continue; + + blob = lookup_blob(ce-sha1); + if (!blob) + die(unable to add index blob to traversal); + add_pending_object_with_path(revs, blob-object, , +ce-ce_mode, ce-name); + } + + if (active_cache_tree) { + struct strbuf path = STRBUF_INIT; + add_cache_tree(active_cache_tree, revs, path); + strbuf_release(path); + } +} + static int add_parents_only(struct rev_info *revs, const char *arg_, int flags) { unsigned char sha1[20]; @@ -1649,6 +1697,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg !strcmp(arg, --reflog) || !strcmp(arg, --not) || !strcmp(arg, --no-walk) || !strcmp(arg, --do-walk) || !strcmp(arg, --bisect) || starts_with(arg, --glob=) || + !strcmp(arg, --index-objects) || starts_with(arg, --exclude=) || starts_with(arg, --branches=) || starts_with(arg, --tags=) || starts_with(arg, --remotes=) || 
starts_with(arg, --no-walk=)) @@ -2078,6 +2127,8 @@ static int handle_revision_pseudo_opt(const char *submodule, clear_ref_exclusion(revs-ref_excludes); } else if (!strcmp(arg, --reflog)) { add_reflogs_to_pending(revs, *flags); + } else if (!strcmp(arg, --index-objects)) { + add_index_objects_to_pending(revs, *flags); } else if (!strcmp(arg, --not)) { *flags ^= UNINTERESTING | BOTTOM; } else if (!strcmp(arg, --no-walk)) { diff --git a/revision.h b/revision.h index e644044..e6dcd5d 100644 --- a/revision.h +++ b/revision.h @@ -277,6 +277,7 @@ extern void add_pending_sha1(struct rev_info *revs, extern void add_head_to_pending(struct rev_info *); extern void
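The recursive cache-tree walk in add_cache_tree() above builds each subtree's path against the parent's prefix (inserting a '/' only when there is a parent), visits it, and then truncates the buffer back to the parent's length. A toy model of that pattern, with a fixed char buffer standing in for git's strbuf (invented struct layout, illustration only):

```c
/* Hypothetical sketch of the add_cache_tree() path-building pattern. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct cache_tree {
	const char *name;
	struct cache_tree *down[4];	/* subtrees */
	int subtree_nr;
};

/* Collect every subtree's full path into out[], mimicking the
 * strbuf_addf()/strbuf_setlen() dance in the patch. */
static int walk(struct cache_tree *it, char *path, size_t baselen,
		char out[][64], int n)
{
	int i;
	for (i = 0; i < it->subtree_nr; i++) {
		struct cache_tree *sub = it->down[i];
		snprintf(path + baselen, 64 - baselen, "%s%s",
			 baselen ? "/" : "", sub->name);
		strcpy(out[n++], path);
		n = walk(sub, path, strlen(path), out, n);
		path[baselen] = '\0';	/* strbuf_setlen(path, baselen) */
	}
	return n;
}
```

Truncating back after each child is what keeps sibling subtrees from inheriting each other's path components.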
[PATCH v2 22/25] reachable: use revision machinery's --index-objects code
This does the same thing as our custom code, so let's not repeat
ourselves.

Signed-off-by: Jeff King <p...@peff.net>
---
 reachable.c | 52 +---
 1 file changed, 1 insertion(+), 51 deletions(-)

diff --git a/reachable.c b/reachable.c
index 0176a88..a647267 100644
--- a/reachable.c
+++ b/reachable.c
@@ -32,56 +32,6 @@ static int add_one_ref(const char *path, const unsigned char *sha1, int flag, vo
 	return 0;
 }
 
-static void add_one_tree(const unsigned char *sha1, struct rev_info *revs)
-{
-	struct tree *tree = lookup_tree(sha1);
-	if (tree)
-		add_pending_object(revs, &tree->object, "");
-}
-
-static void add_cache_tree(struct cache_tree *it, struct rev_info *revs)
-{
-	int i;
-
-	if (it->entry_count >= 0)
-		add_one_tree(it->sha1, revs);
-	for (i = 0; i < it->subtree_nr; i++)
-		add_cache_tree(it->down[i]->cache_tree, revs);
-}
-
-static void add_cache_refs(struct rev_info *revs)
-{
-	int i;
-
-	read_cache();
-	for (i = 0; i < active_nr; i++) {
-		struct blob *blob;
-
-		/*
-		 * The index can contain blobs and GITLINKs, GITLINKs are hashes
-		 * that don't actually point to objects in the repository, it's
-		 * almost guaranteed that they are NOT blobs, so we don't call
-		 * lookup_blob() on them, to avoid populating the hash table
-		 * with invalid information
-		 */
-		if (S_ISGITLINK(active_cache[i]->ce_mode))
-			continue;
-
-		blob = lookup_blob(active_cache[i]->sha1);
-		if (blob)
-			blob->object.flags |= SEEN;
-
-		/*
-		 * We could add the blobs to the pending list, but quite
-		 * frankly, we don't care. Once we've looked them up, and
-		 * added them as objects, we've really done everything
-		 * there is to do for a blob
-		 */
-	}
-	if (active_cache_tree)
-		add_cache_tree(active_cache_tree, revs);
-}
-
 /*
  * The traversal will have already marked us as SEEN, so we
  * only need to handle any progress reporting here.
  */
@@ -213,7 +163,7 @@ void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
 	revs->tree_objects = 1;
 
 	/* Add all refs from the index file */
-	add_cache_refs(revs);
+	add_index_objects_to_pending(revs, 0);
 
 	/* Add all external refs */
 	for_each_ref(add_one_ref, revs);
-- 
2.1.2.596.g7379948
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 23/25] pack-objects: use argv_array
This saves us from having to bump the rp_av count when we add new
traversal options.

Signed-off-by: Jeff King <p...@peff.net>
---
 builtin/pack-objects.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4df9499..b26276b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -22,6 +22,7 @@
 #include "pack-bitmap.h"
 #include "reachable.h"
 #include "sha1-array.h"
+#include "argv-array.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -2614,8 +2615,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	int use_internal_rev_list = 0;
 	int thin = 0;
 	int all_progress_implied = 0;
-	const char *rp_av[6];
-	int rp_ac = 0;
+	struct argv_array rp = ARGV_ARRAY_INIT;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
@@ -2705,24 +2705,24 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (pack_to_stdout != !base_name || argc)
 		usage_with_options(pack_usage, pack_objects_options);
 
-	rp_av[rp_ac++] = "pack-objects";
+	argv_array_push(&rp, "pack-objects");
 	if (thin) {
 		use_internal_rev_list = 1;
-		rp_av[rp_ac++] = "--objects-edge";
+		argv_array_push(&rp, "--objects-edge");
 	} else
-		rp_av[rp_ac++] = "--objects";
+		argv_array_push(&rp, "--objects");
 
 	if (rev_list_all) {
 		use_internal_rev_list = 1;
-		rp_av[rp_ac++] = "--all";
+		argv_array_push(&rp, "--all");
 	}
 	if (rev_list_reflog) {
 		use_internal_rev_list = 1;
-		rp_av[rp_ac++] = "--reflog";
+		argv_array_push(&rp, "--reflog");
 	}
 	if (rev_list_unpacked) {
 		use_internal_rev_list = 1;
-		rp_av[rp_ac++] = "--unpacked";
+		argv_array_push(&rp, "--unpacked");
 	}
 
 	if (!reuse_object)
@@ -2766,8 +2766,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (!use_internal_rev_list)
 		read_object_list_from_stdin();
 	else {
-		rp_av[rp_ac] = NULL;
-		get_object_list(rp_ac, rp_av);
+		get_object_list(rp.argc, rp.argv);
+		argv_array_clear(&rp);
 	}
 	cleanup_preferred_base();
 	if (include_tag && nr_result)
-- 
2.1.2.596.g7379948
[PATCH v2 24/25] repack: pack objects mentioned by the index
When we pack all objects, we use only the objects reachable from
references and reflogs. This misses any objects which are reachable
from the index, but not yet referenced.

By itself this isn't a big deal; the objects can remain loose until
they are actually used in a commit. However, it does create a problem
when we drop packed but unreachable objects. We try to optimize out
the writing of objects that we will immediately prune, which means we
must follow the same rules as prune in determining what is reachable.
And prune uses the index for this purpose.

This is rather uncommon in practice, as objects in the index would not
usually have been packed in the first place. But it could happen in a
sequence like:

  1. You make a commit on a branch that references blob X.

  2. You repack, moving X into the pack.

  3. You delete the branch (and its reflog), so that X is
     unreferenced.

  4. You "git add" blob X so that it is now referenced only by the
     index.

  5. You repack again with git-gc.

The pack-objects we invoke will see that X is neither referenced nor
recent, and not bother loosening it.

Signed-off-by: Jeff King <p...@peff.net>
---
In addition to fixing the safety, this is actually _packing_ those
index objects. I don't see anything wrong with that, but if we would
prefer not to, it should be possible to do a separate traversal over
the index objects (and mark them as "to keep", but not "to pack").
 builtin/pack-objects.c               |  8 ++++++++
 builtin/repack.c                     |  1 +
 t/t7701-repack-unpack-unreachable.sh | 13 +++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b26276b..3dc9108 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2617,6 +2617,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	int all_progress_implied = 0;
 	struct argv_array rp = ARGV_ARRAY_INIT;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
+	int rev_list_index = 0;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
 			    N_("do not show progress meter"), 0),
@@ -2663,6 +2664,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		{ OPTION_SET_INT, 0, "reflog", &rev_list_reflog, NULL,
 		  N_("include objects referred by reflog entries"),
 		  PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, 1 },
+		{ OPTION_SET_INT, 0, "index-objects", &rev_list_index, NULL,
+		  N_("include objects referred to by the index"),
+		  PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, 1 },
 		OPT_BOOL(0, "stdout", &pack_to_stdout,
 			 N_("output pack to stdout")),
 		OPT_BOOL(0, "include-tag", &include_tag,
@@ -2720,6 +2724,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		use_internal_rev_list = 1;
 		argv_array_push(&rp, "--reflog");
 	}
+	if (rev_list_index) {
+		use_internal_rev_list = 1;
+		argv_array_push(&rp, "--index-objects");
+	}
 	if (rev_list_unpacked) {
 		use_internal_rev_list = 1;
 		argv_array_push(&rp, "--unpacked");
diff --git a/builtin/repack.c b/builtin/repack.c
index 2aae05d..80b6021 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -209,6 +209,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		argv_array_push(&cmd_args, "--non-empty");
 		argv_array_push(&cmd_args, "--all");
 		argv_array_push(&cmd_args, "--reflog");
+		argv_array_push(&cmd_args, "--index-objects");
 		if (window)
 			argv_array_pushf(&cmd_args, "--window=%s", window);
 		if (window_memory)
diff --git a/t/t7701-repack-unpack-unreachable.sh b/t/t7701-repack-unpack-unreachable.sh
index b8d4cde..aad8a9c 100755
--- a/t/t7701-repack-unpack-unreachable.sh
+++ b/t/t7701-repack-unpack-unreachable.sh
@@ -109,4 +109,17 @@ test_expect_success 'do not bother loosening old objects' '
 	test_must_fail git cat-file -p $obj2
 '
 
+test_expect_success 'keep packed objects found only in index' '
+	echo my-unique-content >file &&
+	git add file &&
+	git commit -m "make it reachable" &&
+	git gc &&
+	git reset HEAD^ &&
+	git reflog expire --expire=now --all &&
+	git add file &&
+	test-chmtime =-86400 .git/objects/pack/* &&
+	git gc --prune=1.hour.ago &&
+	git cat-file blob :file
+'
+
 test_done
-- 
2.1.2.596.g7379948
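The five-step sequence from the commit message above can be walked through by hand in a throwaway repository. The sketch below mirrors the new t7701 test; the repository location, identity, and blob content are arbitrary, and it assumes a git with the `--index-objects` behavior wired into repack, so the index-only blob survives the aggressive prune:

```shell
# Reproduce the scenario: a blob that ends up packed but referenced
# only by the index (all names here are arbitrary).
cd "$(mktemp -d)"
git init -q
git config user.email author@example.com
git config user.name Author
git commit -q --allow-empty -m base

echo my-unique-content >file
git add file
git commit -q -m "reference blob X"        # step 1: commit references X
git repack -a -d -q                        # step 2: X moves into a pack
git reset -q HEAD^                         # step 3: drop the commit...
git reflog expire --expire=now --all       #         ...and its reflog entries
git add file                               # step 4: X referenced only by the index
git gc -q --prune=now                      # step 5: repack again via gc

# If repack tells pack-objects about index objects, the blob survives:
git cat-file blob :file
```

The final `cat-file` reads the blob through the index entry (`:file`), which is exactly the reachability source that the old code path missed.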
[PATCH v2 25/25] pack-objects: double-check options before discarding objects
When we are given an expiration time like
--unpack-unreachable=2.weeks.ago, we avoid writing out old,
unreachable loose objects entirely, under the assumption that running
prune would simply delete them immediately anyway.

However, this is only valid if we computed the same set of reachable
objects as prune would. In practice this is the case, because only
git-repack uses the --unpack-unreachable option with an expiration,
and it always feeds as many objects into the pack as possible. But we
can double-check at runtime just to be sure.

Signed-off-by: Jeff King <p...@peff.net>
---
No test here, because the potential breakage cannot be seen by running
"git repack" (because it gives sane options), nor by just running "git
pack-objects" (you can convince it not to explode a loose object, but
then you would have to reimplement the part of "git repack" where it
deletes all the packs except the one we just created). So really this
would only be protecting in practice against somebody who tried to
reimplement git-repack themselves (and I do not know of anybody who
has done that).

 builtin/pack-objects.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3dc9108..72589ed 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2757,6 +2757,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 
 	if (keep_unreachable && unpack_unreachable)
 		die("--keep-unreachable and --unpack-unreachable are incompatible.");
+	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
+		unpack_unreachable_expiration = 0;
 
 	if (!use_internal_rev_list || !pack_to_stdout || is_repository_shallow())
 		use_bitmap_index = 0;
-- 
2.1.2.596.g7379948
Re: [PATCH 0/4] Allow building Git with Asciidoctor
On Wed, Oct 15, 2014 at 01:24:22PM +0200, Thomas Braun wrote:
On 14.10.2014 at 11:51, Jeff King wrote:

What's the status on AsciiDoc versus AsciiDoctor? The latter seems
more actively developed these days, but perhaps that is just my
perception. The incompatibilities seem fairly minimal (if those first
two patches are the extent of it, I have no problem at all trying to
remain compatible with both).

Would it ever make sense to switch to AsciiDoctor as our official
command-line build program? I know it is supposed to be much faster
(though a lot of the slowness in our build chain is due to docbook,
not asciidoc itself).

I don't think there's a lot of benefit for us to switch, and I say
that being a contributor to Asciidoctor. It's useful to be able to
build Git with both simply to find incompatibilities that we're going
to need to fix anyway, due to the fact that Asciidoctor is used for
the website. And yes, those first two patches are it, as far as I'm
aware.

Just recently we added the AsciiDoc toolchain to our
git-for-windows/sdk (formerly known as msysgit). So I'm not really
fond of switching now to something different again. Remaining
compatible with both would therefore be my choice.

That's my goal. I simply wanted the ability to support both AsciiDoc
and Asciidoctor without making major changes to the codebase. Hence,
moving the calls into variables.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
signature.asc
Description: Digital signature
fsck option to remove corrupt objects - why/why not?
On 14/10/2014 19:21, Jeff King wrote:
On Mon, Oct 13, 2014 at 09:37:27AM +1100, Ben Aveling wrote:

A question about fsck - is there a reason it doesn't have an option
to delete bad objects?

If the objects are reachable, then deleting them would create other
big problems (i.e., we would be breaking the object graph!).

The man page for fsck advises:

Any corrupt objects you will have to find in backups or other
archives (i.e., you can just remove them and do an rsync with some
other site in the hopes that somebody else has the object you have
corrupted).

And that seems sensible to me - the object is corrupt, it is unusable,
the object graph is already broken, we already have big problems,
removing the corrupt object(s) doesn't create any new problems, and it
allows the possibility that the damaged objects can be restored.

I ask because I have a corrupt repository, and every time I run fsck,
it reports one corrupt object, then stops. I could write a script to
repeatedly call fsck and then remove the next corrupt object, but it
raises the question for me: could it make sense to extend fsck with an
option to do the removes? Or even better, do the removes and then do
the necessary [r]sync, assuming the user has another repository that
has a good copy of the bad objects, which in this case I do.

Regards, Ben
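The remove-in-a-loop script the author describes could look roughly like the sketch below. The function names are invented, the sed parsing is a guess at fsck's wording (which varies by version), and — as the rest of the thread stresses — deleting objects is only sensible when good copies exist elsewhere, so back the repository up first:

```shell
# Map a loose object name to its path under .git/objects
# (helper name is made up for this sketch).
loose_object_path () {
	echo ".git/objects/$(echo "$1" | cut -c1-2)/$(echo "$1" | cut -c3-)"
}

# Repeatedly run fsck and delete whichever corrupt loose object it
# reports first; stop once fsck stops complaining.  The output parsing
# is an assumption about fsck's message format -- adjust for your
# version, and keep a backup: corrupt pieces have been used for
# recovery before.
remove_corrupt_objects () {
	while sha1=$(git fsck 2>&1 |
			sed -n 's/.*corrupt.*\([0-9a-f]\{40\}\).*/\1/p' |
			head -n 1) &&
		test -n "$sha1"
	do
		echo "removing corrupt object $sha1"
		rm -f "$(loose_object_path "$sha1")"
	done
}
```

The loop terminates because each pass either removes an object or finds nothing left to parse out of fsck's output.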
Re: [RFC/PATCH] fsck: do not canonicalize modes in trees we are checking
On Thu, Oct 16, 2014 at 10:46:19AM +1100, Ben Aveling wrote:
On 14/10/2014 19:21, Jeff King wrote:
On Mon, Oct 13, 2014 at 09:37:27AM +1100, Ben Aveling wrote:

A question about fsck - is there a reason it doesn't have an option
to delete bad objects?

If the objects are reachable, then deleting them would create other
big problems (i.e., we would be breaking the object graph!).

The man page for fsck advises: Any corrupt objects you will have to
find in backups or other archives (i.e., you can just remove them and
do an rsync with some other site in the hopes that somebody else has
the object you have corrupted).

I think there are really two different types of corruption worth
considering here.

One is where the data bytes of the object were correct at one point,
but now aren't anymore. The sha1 does not check out, and the object is
broken. You need to go find another copy of the object somewhere else.

The other is where the object was malformed in the first place,
generally due to a software bug (e.g., syntactically bogus email
addresses on committer lines, trees that are not sorted properly,
etc). There is no point in finding another object, as any you find
will be byte-for-byte identical.

For this second type (which is what I thought we were talking about in
this thread), deleting the object would be Very Bad.

For the first type, deleting the object _could_ be a path forward. But
it's not the usual method for fixing corruption. If you copy a good
version of the object into the repository, then a subsequent repack
should throw away the broken copy in favor of the good one.

Note that you need to copy the objects in manually; you can't use the
git protocol to do so. But that is true whether you have the broken
object or not (because git does not communicate with the other side
about _every_ object you have; it talks about commits, and assumes
that you have the rest of the object graph that a commit refers to).
So I don't think deleting really helps in that case.
And it may hurt if it later turns out you don't have another copy of
the object and need to recover what you can from the corrupted pieces
(which has actually happened before).

I ask because I have a corrupt repository, and every time I run fsck,
it reports one corrupt object, then stops.

Corrupt how? Bit-corruption, or a malformed object?

I could write a script to repeatedly call fsck and then remove the
next corrupt object, but it raises the question for me; could it make
sense to extend fsck with the option to do the removes? Or even
better, do the removes and then do the necessary [r]sync, assuming
the user has another repository that has a good copy of the bad
objects, which in this case I do.

If you have a non-corrupted version of the repository, the simplest
thing is to just pack it, copy the resulting packfile to the broken
repository (again, using cp or rsync and not git), and then repack
there. That won't transfer the minimum amount of data, but it's simple
and only requires one round-trip (bearing in mind that git doesn't
know what out-of-band method you're using, nor whether we can even
initiate a request from the broken repo to the good one, so each round
trip generally does need to be done by hand).

-Peff
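The pack-copy-repack recovery described above can be sketched as a small script. The helper name and paths are hypothetical; the key point is that the packfiles (with their .idx companions) are moved with plain cp or rsync, not the git protocol:

```shell
# Sketch of the suggested recovery: pack everything in the good
# repository, copy the packs over by hand, then repack the broken
# repository so the good copies win and fsck can verify the result.
# (Helper name and paths are made up for this sketch.)
recover_from_good_copy () {
	good=$1 broken=$2
	git -C "$good" repack -a -d -q &&
	cp "$good"/.git/objects/pack/pack-* "$broken"/.git/objects/pack/ &&
	git -C "$broken" repack -a -d -q &&
	git -C "$broken" fsck
}
```

Usage would be something like `recover_from_good_copy /path/to/good /path/to/broken`, run once, after backing up the broken repository in case some of its objects turn out to be the only surviving copies.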
Re: [PATCH 3/4] Documentation: move some AsciiDoc parameters into variables
On Wed, Oct 15, 2014 at 01:43:49PM -0700, Junio C Hamano wrote:

I think it makes sense to make these customizable, but I wonder if it
makes the result easier to maintain if we make units of logical
definitions larger, e.g.

	ASCIIDOC = asciidoc
	TXT_TO_MANHTML = $(ASCIIDOC) -b xhtml11 -d manpage $(ASCIIDOC_CONF)
	TXT_TO_ARTICLE = $(ASCIIDOC) -b docbook -d article

Looking at the code, it seems the most reusable unit is something like
the following:

	TXT_TO_HTML = $(ASCIIDOC) -b xhtml11 $(ASCIIDOC_CONF)
	TXT_TO_XML = $(ASCIIDOC) -b docbook $(ASCIIDOC_CONF)

that is, omitting the -d argument from the variable. Using -d would
mean we'd have to have a variable for each of the seven locations we
call $(ASCIIDOC).

Then the above would become something like:

	ASCIIDOC = asciidoc
	ASCIIDOC_COMMON = $(ASCIIDOC) \
		$(ASCIIDOC_EXTRA) $(ASCIIDOC_CONF) -agit_version=$(GIT_VERSION)
	TXT_TO_MANHTML = $(ASCIIDOC_COMMON) -b xhtml11 -d manpage
	...

and would further simplify this part

 $(MAN_HTML): %.html : %.txt asciidoc.conf
 	$(QUIET_ASCIIDOC)$(RM) $@+ $@ \
-	$(ASCIIDOC) -b xhtml11 -d manpage $(ASCIIDOC_CONF) \
+	$(ASCIIDOC) -b $(ASCIIDOC_HTML) -d manpage $(ASCIIDOC_CONF) \
 	$(ASCIIDOC_EXTRA) -agit_version=$(GIT_VERSION) -o $@+ $< \

into just

	$(TXT_TO_MANHTML) -o $@+ $<

After all, our output formats are fairly limited, I would think. Are
there too many different variants and exceptions to make such an
approach infeasible?

I'm on board with the $(ASCIIDOC_COMMON) idea, but to minimize the
number of variables, I think we should leave the -d out of the macro
itself. I'll re-roll over the next couple of days.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
signature.asc
Description: Digital signature
FEATURE-REQUEST: upon git difftool, files which cannot be modified (persistently) should be made read-only
Hi.

Apparently for a while now, git difftool has had the awesome feature
that it just symlinks files to the working copy when a diff against it
is requested, e.g. something like:

	git difftool -d

will have /tmp/git-difftool.TWYTA created with subdirs:

left:
-rw-r--r-- 1 user user 45k Oct 16 04:33 file

right:
lrwxrwxrwx 1 calestyo calestyo 52 Oct 16 04:33 file -> /home/user/foo/file

When one edits and saves this in one's favourite difftool, it actually
gets into the working copy. But of course saving the left side (in
this example) has no effect, so it would be nice if all these files
were marked read-only, so that one already sees in the difftool that
changes here are useless.

Cheers, Chris.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
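As a rough illustration of the request, difftool could drop the write bit on the static copies after populating its temporary tree. The helper name below is invented; it relies on find's `-type f` matching only regular files, so the symlinks that point back into the working copy keep their permissions:

```shell
# Make the non-symlinked ("static") side of a difftool temporary tree
# read-only, so that saving an edit there fails loudly instead of
# silently going nowhere.  -type f never matches symlinks, which is
# exactly what we want here.  (Helper name is made up.)
make_static_side_readonly () {
	find "$1" -type f -exec chmod a-w {} +
}
```

In the example from the report, this would run as `make_static_side_readonly /tmp/git-difftool.TWYTA/left` right after the temporary tree is created.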
[PATCH v2] mergetools/meld: make usage of `--output` configurable and more robust
Older versions of meld listed --output in `meld --help`. Newer
versions only mention `meld [OPTIONS...]`.

Improve the checks to catch these newer versions.

Add a `mergetool.meld.hasOutput` configuration to allow overriding the
heuristic.

Reported-by: Andrey Novoseltsev <novos...@gmail.com>
Helped-by: Junio C Hamano <gits...@pobox.com>
Signed-off-by: David Aguilar <dav...@gmail.com>
---
Changes since v1:

This uses Junio's improved approach of checking for both --help styles
and uses a more focused name for the configuration variable. The
documentation was reworded accordingly.

 Documentation/config.txt | 9 +++++++++
 mergetools/meld          | 9 +++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 04a1e2f..0f823eb 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1755,6 +1755,15 @@ mergetool.<tool>.trustExitCode::
 	if the file has been updated, otherwise the user is prompted to
 	indicate the success of the merge.
 
+mergetool.meld.hasOutput::
+	Older versions of `meld` do not support the `--output` option.
+	Git will attempt to detect whether `meld` supports `--output`
+	by inspecting the output of `meld --help`.  Configuring
+	`mergetool.meld.hasOutput` will make Git skip these checks and
+	use the configured value instead.  Setting `mergetool.meld.hasOutput`
+	to `true` tells Git to unconditionally use the `--output` option,
+	and `false` avoids using `--output`.
+
 mergetool.keepBackup::
 	After performing a merge, the original file with conflict markers
 	can be saved as a file with a `.orig` extension.
 	If this variable

diff --git a/mergetools/meld b/mergetools/meld
index cb672a5..83ebdfb 100644
--- a/mergetools/meld
+++ b/mergetools/meld
@@ -18,13 +18,18 @@ merge_cmd () {
 	check_unchanged
 }
 
-# Check whether 'meld --output <file>' is supported
+# Check whether we should use 'meld --output <file>'
 check_meld_for_output_version () {
 	meld_path="$(git config mergetool.meld.path)"
 	meld_path="${meld_path:-meld}"
 
-	if "$meld_path" --help 2>&1 | grep -e --output >/dev/null
+	if meld_has_output_option=$(git config --bool mergetool.meld.hasOutput)
 	then
+		: use configured value
+	elif "$meld_path" --help 2>&1 |
+		grep -e '--output=' -e '\[OPTION\.\.\.\]' >/dev/null
+	then
+		: old ones mention --output and new ones just say OPTION...
 		meld_has_output_option=true
 	else
 		meld_has_output_option=false
-- 
2.1.2.444.g0cfad43
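The two-pattern heuristic at the heart of this patch can be exercised on its own. This standalone sketch (the function name is invented) applies the same grep patterns to a help text passed in as a string instead of running a real meld binary:

```shell
# Standalone sketch of the detection heuristic: old meld mentions
# --output= in its --help output, while new meld only prints a generic
# "[OPTION...]" usage line.  Returns success (0) if either pattern
# matches.  (Function name is made up for this sketch.)
meld_help_suggests_output () {
	printf '%s\n' "$1" |
	grep -e '--output=' -e '\[OPTION\.\.\.\]' >/dev/null
}
```

Note the escaping: in a basic regular expression, `\[`, `\.`, and `\]` match the literal characters, so `\[OPTION\.\.\.\]` matches exactly the string "[OPTION...]".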
[PATCH v2 4/5] t7610-mergetool: use test_config to isolate tests
Signed-off-by: David Aguilar <dav...@gmail.com>
---
This is a replacement patch for "t7610-mergetool: prefer test_config
over git config" in da/mergetool-tests.

Changes since v1:

The changes are more surgical and the commit message mentions why we
are using test_config.

 t/t7610-mergetool.sh | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index 1a15e06..4fec633 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -112,7 +112,7 @@ test_expect_success 'custom mergetool' '
 '
 
 test_expect_success 'mergetool crlf' '
-	git config core.autocrlf true &&
+	test_config core.autocrlf true &&
 	git checkout -b test2 branch1 &&
 	test_must_fail git merge master >/dev/null 2>&1 &&
 	( yes "" | git mergetool file1 >/dev/null 2>&1 ) &&
@@ -505,14 +505,12 @@ test_expect_success 'file with no base' '
 
 test_expect_success 'custom commands override built-ins' '
 	git checkout -b test14 branch1 &&
-	git config mergetool.defaults.cmd "cat \"\$REMOTE\" >\"\$MERGED\"" &&
-	git config mergetool.defaults.trustExitCode true &&
+	test_config mergetool.defaults.cmd "cat \"\$REMOTE\" >\"\$MERGED\"" &&
+	test_config mergetool.defaults.trustExitCode true &&
 	test_must_fail git merge master &&
 	git mergetool --no-prompt --tool defaults -- both &&
 	echo master both added >expected &&
 	test_cmp both expected &&
-	git config --unset mergetool.defaults.cmd &&
-	git config --unset mergetool.defaults.trustExitCode &&
 	git reset --hard master >/dev/null 2>&1
 '
-- 
2.1.2.453.g1b015e3
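For context, test_config (from git's t/test-lib-functions.sh) sets a config value and arranges for it to be unset when the test ends, which is why the explicit `git config --unset` calls above become unnecessary. A minimal self-contained approximation is sketched below; the real helper uses test-lib's test_when_finished, whereas this version keeps an explicit cleanup list (all names here are invented):

```shell
# Toy approximation of test-lib's test_config: record an undo action
# for every key we set, so a later cleanup restores a pristine config
# and settings cannot leak into subsequent tests.
TEST_CONFIG_CLEANUP=
test_config_sketch () {
	key=$1; shift
	TEST_CONFIG_CLEANUP="$TEST_CONFIG_CLEANUP git config --unset-all $key;"
	git config "$key" "$@"
}
test_config_cleanup () {
	eval "$TEST_CONFIG_CLEANUP"
	TEST_CONFIG_CLEANUP=
}
```

The design point is the same as in the patch: each test leaves the repository configuration exactly as it found it, so tests stay isolated regardless of what earlier tests configured.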