Re: [PATCH v3 1/4] git-credential-store: support multiple credential files
Hi all, thanks for providing your feedback. On Sun, Mar 15, 2015 at 6:14 AM, Junio C Hamano wrote: > I am not sure if this is not a premature over-engineering---I am not > convinced that such a future need will be fulfilled by passing just > a single default_fn this version already passes, or it needs even > more parameters that this version does not pass yet, and the > interface to the function needs to be updated at that point when you > need it _anyways_. One thing that we all agree is that we don't need > the extra parameter within the context of what the current code does. After considering everyone's responses, I've decided to remove the argument in the v4 patch. As Junio says, when there is a policy change the code can be modified anyway. Regards, Paul -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sha1fd_check: die when we cannot open the file
Right now we return a NULL "struct sha1file" if we encounter an error. However, the sole caller (write_idx_file) does not check the return value, and will segfault if we hit this case. One option would be to handle the error in the caller. However, there's really nothing for it to do but die. This code path is hit during "git index-pack --verify"; after we verify the packfile, we check that the ".idx" we would generate from it is byte-wise identical to what is on disk. We hit the error (and segfault) if we can't open the .idx file (a likely cause of this is that somebody else ran "git repack -ad" while we were verifying). Since we can't complete the requested verification, we really have no choice but to die. Furthermore, the rest of the sha1fd_* functions simply die on errors. So if were to open the file successfully, for example, and then hit a read error, sha1write would call die() for us. So pushing the die() down into sha1fd_check keeps the interface consistent. Signed-off-by: Jeff King --- csum-file.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/csum-file.c b/csum-file.c index b00b215..a172199 100644 --- a/csum-file.c +++ b/csum-file.c @@ -130,14 +130,10 @@ struct sha1file *sha1fd_check(const char *name) sink = open("/dev/null", O_WRONLY); if (sink < 0) - return NULL; + die_errno("unable to open /dev/null"); check = open(name, O_RDONLY); - if (check < 0) { - int saved_errno = errno; - close(sink); - errno = saved_errno; - return NULL; - } + if (check < 0) + die_errno("unable to open '%s'", name); f = sha1fd(sink, name); f->check_fd = check; return f; -- 2.3.3.520.g3cfbb5d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5/GSoC/MICRO] revision: forbid combining --graph and --no-walk
Because "--graph" is about connected history while --no-walk is about discrete points, it does not make sense to allow giving these two options at the same time. [1] This change makes a few calls to "show --graph" fail in t4052, but asking to show one commit with graph is a nonsensical thing to do. Thus, tests on "show --graph" in t4052 have been removed. [2,3] Same tests on "show" without --graph option have already been tested in 4052. 3 testcases have been added to test this patch. [1]: http://article.gmane.org/gmane.comp.version-control.git/216083 [2]: http://article.gmane.org/gmane.comp.version-control.git/264950 [3]: http://article.gmane.org/gmane.comp.version-control.git/265107 Helped-By: Eric Sunshine Helped-By: René Scharfe Helped-By: Junio C Hamano Signed-off-by: Dongcan Jiang --- Documentation/rev-list-options.txt | 2 ++ revision.c | 2 ++ t/t4052-stat-output.sh | 14 +++--- t/t4202-log.sh | 4 t/t6014-rev-list-all.sh| 4 t/t7007-show.sh| 4 6 files changed, 23 insertions(+), 7 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 4ed8587..136abdf 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -679,6 +679,7 @@ endif::git-rev-list[] given on the command line. Otherwise (if `sorted` or no argument was given), the commits are shown in reverse chronological order by commit time. + Cannot be combined with `--graph`. --do-walk:: Overrides a previous `--no-walk`. @@ -781,6 +782,7 @@ you would get an output like this: on the left hand side of the output. This may cause extra lines to be printed in between commits, in order for the graph history to be drawn properly. + Cannot be combined with `--no-walk`. + This enables parent rewriting, see 'History Simplification' below. + diff --git a/revision.c b/revision.c index 66520c6..6cd91dd 100644 --- a/revision.c +++ b/revision.c @@ -2339,6 +2339,8 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s if (revs->reflog_info && revs->graph) die("cannot combine --walk-reflogs with --graph"); + if (revs->no_walk && revs->graph) + die("cannot combine --no-walk with --graph"); if (!revs->reflog_info && revs->grep_filter.use_reflog_filter) die("cannot use --grep-reflog without --walk-reflogs"); diff --git a/t/t4052-stat-output.sh b/t/t4052-stat-output.sh index b68afef..54f10cf 100755 --- a/t/t4052-stat-output.sh +++ b/t/t4052-stat-output.sh @@ -99,7 +99,7 @@ do test_cmp "$expect" actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --graph $verb COLUMNS (big change)" ' COLUMNS=200 git $cmd $args --graph >output @@ -127,7 +127,7 @@ do test_cmp "$expect" actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --graph $verb not enough COLUMNS (big change)" ' COLUMNS=40 git $cmd $args --graph >output @@ -155,7 +155,7 @@ do test_cmp "$expect" actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --graph $verb statGraphWidth config" ' git -c diff.statGraphWidth=26 $cmd $args --graph >output @@ -196,7 +196,7 @@ do test_cmp expect actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --stat-width=width --graph with big change" ' git $cmd $args --stat-width=40 --graph >output @@ -236,7 +236,7 @@ do test_cmp expect actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --stat=width --graph with big change is balanced" ' git $cmd $args --stat-width=60 --graph >output && @@ -270,7 +270,7 @@ do test_cmp "$expect" actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success "$cmd --graph $verb COLUMNS (long filename)" ' COLUMNS=200 git $cmd $args --graph >output @@ -299,7 +299,7 @@ do test_cmp "$expect" actual ' - test "$cmd" != diff || continue + test "$cmd" != diff && test "$cmd" != show || continue test_expect_success COLUMNS_CAN_BE_1 \ "$cmd --graph $verb prefix greater than COLUMNS (big change)" ' diff --git a/t/t4202-log.sh b/t/t42
Git with Lader logic
Hi I am trying to find a way of using version control on PLC programmers like Allen Bradley PLC. I can't find a way of this. Could you please give me an idea if it will work with Plc programs. Which are basically Ladder logic. Sent from my iPad-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
What's cooking in git.git (Mar 2015, #06; Tue, 17)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ak/git-done-help-cleanup (2015-03-06) 1 commit (merged to 'next' on 2015-03-10 at 971382b) + git: make was_alias and done_help non-static Code simplification. * es/rebase-i-count-todo (2015-03-06) 2 commits (merged to 'next' on 2015-03-10 at fff95d5) + rebase-interactive: re-word "item count" comment + rebase-interactive: suppress whitespace preceding item count "git rebase -i" recently started to include the number of commits in the insn sheet to be processed, but on a platform that prepends leading whitespaces to "wc -l" output, the numbers are shown with extra whitespaces that aren't necessary. * mg/doc-status-color-slot (2015-03-10) 1 commit (merged to 'next' on 2015-03-12 at e53910a) + config,completion: add color.status.unmerged Documentation fixes. * mg/sequencer-commit-messages-always-verbatim (2015-03-06) 1 commit (merged to 'next' on 2015-03-10 at 6a09295) + sequencer: preserve commit messages "git cherry-pick" used to clean-up the log message even when it is merely replaying an existing commit. It now replays the message verbatim unless you are editing the message of resulting commits. * mg/status-v-v (2015-03-06) 3 commits (merged to 'next' on 2015-03-10 at 4fa5af0) + commit/status: show the index-worktree diff with -v -v + t7508: test git status -v + t7508: .gitignore 'expect' and 'output' files "git status" now allows the "-v" to be given twice to show the differences that are left in the working tree not to be committed. * rs/deflate-init-cleanup (2015-03-05) 1 commit (merged to 'next' on 2015-03-10 at 053157f) + zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw} Code simplification. * rs/zip-text (2015-03-05) 1 commit (merged to 'next' on 2015-03-10 at 2f3fa92) + archive-zip: mark text files in archives "git archive" can now be told to set the 'text' attribute in the resulting zip archive. * sg/completion-remote (2015-03-06) 2 commits (merged to 'next' on 2015-03-10 at e1cd42b) + completion: simplify __git_remotes() + completion: add a test for __git_remotes() helper function Code simplification. -- [New Topics] * js/completion-ctags-pattern-substitution-fix (2015-03-14) 1 commit (merged to 'next' on 2015-03-17 at 4a68861) + contrib/completion: escape the forward slash in __git_match_ctag The code that reads from the ctags file in the completion script (in contrib/) did not spell ${param/pattern/string} substitution correctly, which happened to work with bash but not with zsh. Will merge to 'master'. * jc/a-lone-dash-stands-for-previous-branch (2015-03-16) 1 commit - "-" and "@{-1}" on various programs Lose special case code to make a lone dash "-" mean the previous branch aka "@{-1}" from a handful subcommands, and instead support the notation throughout the system by reimplementing it at the revisions layer. Needs tests, documentation updates, etc. * jc/submitting-patches-mention-send-email (2015-03-15) 1 commit - SubmittingPatches: encourage users to use format-patch and send-email Recommend format-patch and send-email for those who want to submit patches to this project. -- [Stalled] * nd/untracked-cache (2015-03-12) 24 commits - git-status.txt: advertisement for untracked cache - untracked cache: guard and disable on system changes - mingw32: add uname() - t7063: tests for untracked cache - update-index: test the system before enabling untracked cache - update-index: manually enable or disable untracked cache - status: enable untracked cache - untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE - untracked cache: mark index dirty if untracked cache is updated - untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS - untracked cache: avoid racy timestamps - read-cache.c: split racy stat test to a separate function - untracked cache: invalidate at index addition or removal - untracked cache: load from UNTR index extension - untracked cache: save to an index extension - ewah: add convenient wrapper ewah_serialize_strbuf() - untracked cache: don't open non-existent .gitignore - untracked cache: mark what dirs should be recursed/saved - untracked cache: record/validate dir mtime and reuse cached output - untracked cache: make a wrapper around {open,read,close}dir() - untracked cache: invalidate dirs recursively if .gitignore changes - untracked cache: initial untracked cache validation - untracked cache: record .gitignore information and dir hierarchy -
Re: [PATCH v4/GSoC/MICRO] revision: forbid combining --graph and --no-walk
On Tue, Mar 17, 2015 at 7:18 PM, Junio C Hamano wrote: > Dongcan Jiang writes: >> diff --git a/t/t4052-stat-output.sh b/t/t4052-stat-output.sh >> index b68afef..a989e8f 100755 >> --- a/t/t4052-stat-output.sh >> +++ b/t/t4052-stat-output.sh >> @@ -99,7 +99,7 @@ do >> test_cmp "$expect" actual >> ' >> >> - test "$cmd" != diff || continue >> + test "$cmd" != diff -a "$cmd" != show || continue > > I think we avoid -a and -o used with test (don't we have a write-up > on this somewhere?). The very last item in the shell script section of CodingGuidelines discusses it. > Write it like this > > test "$cmd" != diff && > test "$cmd" != show || continue > > or perhaps like this > > case "$cmd" in diff|show) continue ;; esac -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4/GSoC/MICRO] revision: forbid combining --graph and --no-walk
Dongcan Jiang writes: > Because "--graph" is about connected history while --no-walk > is about discrete points, it does not make sense to allow > giving these two options at the same time. [1] > > This change makes a few calls to "show --graph" fail in > t4052, but asking to show one commit with graph is a > nonsensical thing to do. Thus, tests on "show --graph" in > t4052 have been removed. [2,3] Same tests on "show" without > --graph option have already been tested in 4052. This looks almost perfect. > > diff --git a/t/t4052-stat-output.sh b/t/t4052-stat-output.sh > index b68afef..a989e8f 100755 > --- a/t/t4052-stat-output.sh > +++ b/t/t4052-stat-output.sh > @@ -99,7 +99,7 @@ do > test_cmp "$expect" actual > ' > > - test "$cmd" != diff || continue > + test "$cmd" != diff -a "$cmd" != show || continue I think we avoid -a and -o used with test (don't we have a write-up on this somewhere?). Write it like this test "$cmd" != diff && test "$cmd" != show || continue or perhaps like this case "$cmd" in diff|show) continue ;; esac Other than that I do not see anything objectionable. Thanks, good job. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
On Tue, Mar 17, 2015 at 11:00 PM, Michael Haggerty wrote: > Michael Haggerty (14): > numparse: new module for parsing integral numbers > cacheinfo_callback(): use convert_ui() when handling "--cacheinfo" > write_subdirectory(): use convert_ui() for parsing mode > handle_revision_opt(): use skip_prefix() in many places > handle_revision_opt(): use convert_i() when handling "-" > strtoul_ui(), strtol_i(): remove functions > handle_revision_opt(): use convert_ui() when handling "--abbrev=" > builtin_diff(): detect errors when parsing --unified argument > opt_arg(): val is always non-NULL > opt_arg(): use convert_i() in implementation > opt_arg(): report errors parsing option values > opt_arg(): simplify pointer handling > diff_opt_parse(): use convert_i() when handling "-l" > diff_opt_parse(): use convert_i() when handling --abbrev= Thank you for doing it. I was about to write another number parser and you did it :D Maybe you can add another patch to convert the only strtol in upload-pack.c to parse_ui. This place should accept positive number in base 10, plus sign is not accepted. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] not making corruption worse
Jeff King writes: > But it strikes me as weird that we consider the _tips_ of history to be > special for ignoring breakage. If the tip of "bar" is broken, we omit > it. But if the tip is fine, and there's breakage three commits down in > the history, then doing a clone is going to fail horribly, as > pack-objects realizes it can't generate the pack. So in practice, I'm > not sure how much you're buying with the "don't mention broken refs" > code. I think this is a trade-off between strictness and convenience. Is it preferrable that every time you try to clone a repository you get reminded that one of its refs point at a bogus object and you instead have to do "git fetch $there" with a refspec that excludes the broken one, or is it OK to allow clones and fetches silently succeed as if nothing is broken? If the breakage in the reachability chain is somewhere that affects a branch that is actively in use by the project, with or without hiding a broken tip, you will be hit by object transfer failure and you need to really go in there and fix things anyway. If it is just a no-longer-used experimental branch that lost necessary objects, it may be more convenient if the system automatically ignored it. In some parts of the system there is a movement to make this trade off tweakable (hint: what happened to the knobs to fsck that allow certain kinds of broken objects in the object store? did the topic go anywhere?). This one so far lacked such a knob to tweak, and I view your paranoia bit as such a knob. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Retry if fdopen() fails due to ENOMEM
Michael Haggerty writes: > On 03/06/2015 06:08 AM, Torsten Bögershausen wrote: >> On 03/05/2015 05:07 PM, Michael Haggerty wrote: >>> One likely reason for fdopen() to fail is the lack of memory for >>> allocating a FILE structure. When that happens, try freeing some >>> memory and calling fdopen() again in the hope that it will work the >>> second time. >>> >>> This change was suggested by Jonathan Nieder [1] >>> >>> In the first patch it is unsatisfying that try_to_free_routine() is >>> called with a magic number (1000) rather than sizeof(FILE). But the C >>> standard doesn't guarantee that FILE is a complete type, so I can't >>> think of a better approach. Suggestions, anybody? >> >> it's not the sizeof(FILE) which is critical, it is the size of the buffer >> associated with a FILE >> >> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/stdio.h.html >> >> BUFSIZ may be your friend, and if it is not defined, 4096 may be a >> useful default. > > Good point. If this patch series is not dropped as being useless, I will > make this change. OK, it has been a week since anybody mentioned this series. What's the verdict? Taking what you said in $gmane/265228 into account, I am taking the lack of reroll or follow-up as a sign of lost interest, and if that is the case I'd drop the series before it hits 'next'. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Add revision range support on "-" and "@{-1}"
Kenny Lee Sin Cheong writes: > On Tue, Mar 17 2015 at 02:49:48 AM, Junio C Hamano wrote: >> Junio C Hamano writes: >> >> if (try to see if it is a revision or regvision range) { >> /* if failed ... */ >> if (starts with '-') { >> do the option thing; >> continue; >> } >> /* args must be pathspecs from here on */ >> check the '--' disambiguation; >> add pathspec to prune-data; >> } else { >> got_rev_arg = 1; >> } >> >> but I didn't trace the logic myself to see if that would work. > > You're right. I was actually going to try and check all possible > suffixes of "-" but your solution saves us from doing that, and it > didn't break any tests. "It didn't break any tests" does not tell us much, though. I also notice that handle_revision_arg() would die() by calling it directly or indirectly via verify_non_filename(), etc., but the caller actually is expecting it to silently return non-zero when it finds an argument that cannot be interpreted as a revision or as a revision range. If we feed the function a string that has ".." in it, with cant_be_filename unset, and if that string _can_ be parsed as a valid range (e.g. "master..next"), we would check if a file whose name is that string and die, e.g. $ >master..next ; git log master..next fatal: ambigous argument 'master..next': both revision and filename If we swap the order to do the "revision" first before "option", however, we would end up getting the same for a name that begins with "-" and has ".." in it. I see no guarantee that future possible option name cannot be misinterpreted as a range to trigger this check. But "git cmd -$option" for any value of $option does not have to be disambiguated when there is a file whose name is "-$option". The existing die()'s in the handle_revision_arg() function _will_ break that promise. Currently, because we check the options first, handle_revision_arg() does not cause us any problem, but swapping the order will have fallouts. If we want to really do the swapping (and I think that is the only sensible way if we wanted to allow "-" and any extended SHA-1 that begins with "-" as "the previous branch"), I think the "OK, it looks like a revision (or revision range); as we didn't see dashdash, it must not be a filename" check has to be moved to the caller, perhaps like this: if (try to see if it is a revision or a revision range) { /* failed */ ... } else { /* it can be read as a revision or a revision range */ if (!seen_dashdash) verify_non_filename(arg); got_rev_arg = 1; } The "missing" cases should also silently return failure and have the caller deal with that. > On a similar note, would it be relevant to add similar changes to > rev-parse? If the goal is "to allow '-' everywhere '@{-1}' is allowed, and used as such", then yes, of course, such an update is needed. But I am not sure if that is a worthwhile goal to aim for in the first place, though. You would need to accept -@{two.days.ago} as a "short-hand" for @{-1}@{two.days.ago}, etc., which does not look very readable way in the first place. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Need help deciding between subtree and submodule
At my workplace, the team is using Atlassian Stash + git We have a "Core" library that is our common code between various projects. To avoid a single monolithic repository and to allow our apps and tools to be modularized into their own repos, I have considered moving Core to a subtree or submodule. I tried subtree and this is definitely far more transparent and simple to the team (simplicity is very important), however I notice it has problems with unnecessary conflicts when you do not do `git subtree push` for each `git subtree pull`. This is unnecessary overhead and complicates the log graph which I don't like. Submodule functionally works but it is complicated. We make heavy use of pull requests for code reviews (they are required due to company policy). Instead of a pull request being atomic and containing any app changes + accompanying Core changes, we now need to create two pull requests and manage them in proper order. Things also become more difficult when branching. All around it just feels like submodule would interfere and add more administration overhead on a day to day basis, affecting productivity. Is there a third option here I'm missing? If only that issue with subtree could be addressed (the conflicts), it would be perfect enough for us I think. I have done all the stackoverflow reading and research I can manage at this point. I would really love some feedback from the actual git community on what would be a practical solution and structure here from a company perspective. Thanks in advance! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Add revision range support on "-" and "@{-1}"
On Tue, Mar 17 2015 at 02:49:48 AM, Junio C Hamano wrote: > Junio C Hamano writes: > > if (try to see if it is a revision or regvision range) { > /* if failed ... */ > if (starts with '-') { > do the option thing; > continue; > } > /* args must be pathspecs from here on */ > check the '--' disambiguation; > add pathspec to prune-data; > } else { > got_rev_arg = 1; > } > > but I didn't trace the logic myself to see if that would work. You're right. I was actually going to try and check all possible suffixes of "-" but your solution saves us from doing that, and it didn't break any tests. On a similar note, would it be relevant to add similar changes to rev-parse? While trying to write some test, I noticed that rev-parse doesn't support "-". If I'm not mistaking it assumes everything that starts with "-" must be an option. But since it is a plumbing tool I don't know if it would be worth it or even an improvement. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
kara, On Tue, Mar 17, 2015, at 17:05, Junio C Hamano wrote: > If you check the output from > > git diff 30a52c1d^ 30a52c1d > > and find it appropriately address the problem you originally had, > that would be wonderful, and if you can suggest further improvement, > that is equally good. Indeed, the new version of the docs looks much better. I'm particularly happy about the change to the format to make it easier to visually scan for the possible update modes. Cheers -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
Ryan Lortie writes: > On Tue, Mar 17, 2015, at 16:49, Junio C Hamano wrote: >> With more recent versions of Git, namely, the versions after >> 30a52c1d (Merge branch 'ms/submodule-update-config-doc' into maint, >> 2015-03-13), the documentation pages already have updated >> descriptions around this area. > > sigh. > > That's what I get for forgetting to type 'git pull' before writing a > patch. > > Sorry for the noise! Nothing to apologise or sigh about. You re-confirmed that the old documentation was lacking, which led to an earlier discussion which in turn led to Michal to update the documentation. If you check the output from git diff 30a52c1d^ 30a52c1d and find it appropriately address the problem you originally had, that would be wonderful, and if you can suggest further improvement, that is equally good. Thanks for participating in our effort to collectively make Git better. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
On Tue, Mar 17, 2015, at 16:49, Junio C Hamano wrote: > With more recent versions of Git, namely, the versions after > 30a52c1d (Merge branch 'ms/submodule-update-config-doc' into maint, > 2015-03-13), the documentation pages already have updated > descriptions around this area. sigh. That's what I get for forgetting to type 'git pull' before writing a patch. Sorry for the noise! Cheers -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
Ryan Lortie writes: > 'man git-submodule' contains mention (in one place) that: > > Setting the key submodule.$name.update to !command > will cause command to be run. > > This is not documented in 'man gitmodules' (which documents the other > possible values for the 'update' key) Yes, that is deliberate, because you cannot use !command in .gitmodules that is tracked for security reasons. With more recent versions of Git, namely, the versions after 30a52c1d (Merge branch 'ms/submodule-update-config-doc' into maint, 2015-03-13), the documentation pages already have updated descriptions around this area. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
On Tue, Mar 17, 2015, at 15:50, Jeff King wrote: > Yeah, spelling out the security model more explicitly would be good. Please see the attached patch. Cheers 0001-docs-clarify-command-submodule-update-policy.patch Description: Binary data
Re: [PATCH v4] rev-list: refuse --first-parent combined with --bisect
Christian Couder writes: > On Tue, Mar 17, 2015 at 7:33 PM, Junio C Hamano wrote: >> Christian Couder writes: >> >>> Yes, but the user is supposed to not change the "bad" pointer for no >>> good reason. >> >> That is irrelevant, no? Nobody is questioning that the user is >> supposed to judge if a commit is "good" or "bad" correctly. > > So if there is already a bad commit and the user gives another > bad commit, that means that the user knows that it will replace the > existing bad commit with the new one and that it's done for this > purpose. ECANNOTQUITEPARSE. The user may say "git bisect bad $that" and we do not question $that is bad. Git does not know better than the user. But that does not mean Git does not know better than the user how the current bad commit and $that commit are related. The user is not interested in "replacing" at all. The user is telling just one single fact, that is, "$that is bad". >> I am not quite sure if I am correctly getting what you meant to say, >> but if you meant "only when --alternate is given, we should do the >> merge-base thing; we should keep losing the current 'bad' and >> replace it with the new one without the --alternate option", I would >> see that as an exercise of a bad taste. > > What I wanted to say is that if we change "git bisect bad ", > so that now it means "add a new bad commit" instead of the previous > "replace the current bad commit, if any, with this one", then experienced > users might see that change as a regression in the user interface and > it might even break scripts. Huh? Step back a bit. The place you need to start from is to admit the fact that what "git bisect bad " currently does is broken. Try creating this history yourself a---b---c---d---e---f and start bisection this way: $ git bisect start f c $ git bisect bad a Immediately after the second command, "git bisect" moans Some good revs are not ancestor of the bad rev. git bisect cannot work properly in this case. Maybe you mistake good and bad revs? when it notices that the good rev (i.e. 'c') is no longer an ancestor of the 'bad', which now points at 'a'. But that is because "git bisect bad" _blindly_ moved 'bad' that used to point at 'f' to 'a', making a good rev (i.e. 'c') an ancestor of the bad rev, without even bothering to check. Now, if we fixed this bug and made the bisect_state function more careful (namely, when accepting "bad", make sure it is not beyond any existing "good", or barf like the above, _without_ moving the bad pointer), the user interface and behaviour would be changed. Is that a regression? No, it is a usability fix and a progress. Simply put, bisect_state function can become more careful and intelligent to help users. I view this "user goes out of way to tell us a commit that is known to be bad as bad, even though it is not what we offered to test and is not an ancestor of the commit that currently marked as bad" case the same way. We by now hopefully understand that blindly replacing the current 'bad' is suboptimal. By teaching bisect_state to do the "merge-base thing", we would be fixing that. Why is it a regression? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Promoting Git developers
Christian Couder writes: > On Mon, Mar 16, 2015 at 6:06 PM, Stefan Beller wrote: >> On Mon, Mar 16, 2015 at 2:20 AM, David Kastrup wrote: >>> >>> "Git Annotate"? >> >> "Git Praise" as opposed to blame? >> "Git Who" as a pun on the subcommand structure which doesn't always >> follows grammar? > > Yeah these suggestions above are nice, thanks for them, but "Git Rev News" > also look a bit like "git rev-list" and "git rev-parse" which are plumbing Git > commands, so it gives a somewhat "hardcore" look to the news letter which > I like. Call that "Git Rev List" then to be more direct? I myself liked the "Review" (spelled in full word) as its non-nerdy sound, as a suitable name for a publication that bridges between hard-core developers and slightly-more-serious-than-casual observers, but its not my call (and as I often say, I am not good at naming things) ;-). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Promoting Git developers
On Sun, Mar 15, 2015 at 11:18 PM, Junio C Hamano wrote: > Christian Couder writes: > >> I wrote something about a potential Git Rev News news letter: > > I read it. Sounds promising. Thanks! [...] > I obviously do not know how the actual contents would look like at > this point, but depending on the quality of the publication I might > be able to steal some descriptions when keeping the notes on topics > in flight that appear in my "What's cooking" report. And it can go > the other way around, too. The publication may want to peek my > "What's cooking" report for hints on how to characterize each topic > and assess its impact to the evolution of Git. Yeah, it would be a very nice thing if we could steal these kind of things from each other's work! Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 2/2] sha1_file: refactor sha1_file.c to support 'cat-file --literally'
Karthik Nayak writes: > Subject: Re: [PATCH v4 2/2] sha1_file: refactor sha1_file.c to support > 'cat-file --literally' > Modify sha1_loose_object_info() to support 'cat-file --literally' > by accepting flags and also make changes to copy the type to > object_info::typename. It would be more descriptive to mention the central point on the title regarding what it takes to "support cat-file --literally". For example: sha1_file.c: support reading from a loose object of a bogus type Update sha1_loose_object_info() to optionally allow it to read from a loose object file of unknown/bogus type; as the function usually returns the type of the object it read in the form of enum for known types, add an optional "typename" field to receive the name of the type in textual form. By the way, I think your 1/2 would not even compile unless you have this change; the patches in these two patch series must be swapped, no? > diff --git a/sha1_file.c b/sha1_file.c > index 69a60ec..e31e9e2 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1564,6 +1564,36 @@ int unpack_sha1_header(git_zstream *stream, unsigned > char *map, unsigned long ma > return git_inflate(stream, 0); > } > > +static int unpack_sha1_header_literally(git_zstream *stream, unsigned char > *map, > + unsigned long mapsize, > + struct strbuf *header) > +{ > + unsigned char buffer[32], *cp; > + unsigned long bufsiz = sizeof(buffer); > + int status; > + > + /* Get the data stream */ > + memset(stream, 0, sizeof(*stream)); > + stream->next_in = map; > + stream->avail_in = mapsize; > + stream->next_out = buffer; > + stream->avail_out = bufsiz; > + > + git_inflate_init(stream); > + > + do { > + status = git_inflate(stream, 0); > + strbuf_add(header, buffer, stream->next_out - buffer); > + for (cp = buffer; cp < stream->next_out; cp++) > + if (!*cp) > + /* Found the NUL at the end of the header */ > + return 0; > + stream->next_out = buffer; > + stream->avail_out = bufsiz; > + } while (status == Z_OK); > + return -1; > +} > + There is nothing "literal" about this function. The only difference between the original and this one is that this uses a strbuf, instead of a buffer of a fixed size allocated by the caller, to return the early part of the inflated data. I wonder if it incurs too much overhead to the existing callers if you reimplementated unpack_sha1_header() as a thin wrapper around this function, something like int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz) { int status; struct strbuf header = STRBUF_INIT; status = unpack_sha1_header_to_strbuf(stream, map, mapsize, &header); if (bufsiz < header.len) die("object header too long"); memcpy(buffer, header.buf, header.len); return status; } Note that I am *not* suggesting to do this blindly. If there is no downside from performance point of view, however, the above would be a way to avoid code duplication. Another way to avoid code duplication may be to implement unpack_sha1_header_to_strbuf() in terms of unpack_sha1_header(), perhaps like this: static int unpack_sha1_header_to_strbuf(...) { unsigned char buffer[32], *cp; unsigned long bufsiz = sizeof(buffer); int status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz); if (status) return status; do { strbuf_add(header, buffer, stream->next_out - buffer); for (cp = buffer; cp < stream->next_out; cp++) if (!*cp) /* Found the NUL at the end of the header */ return 0; stream->next_out = buffer; stream->avail_out = bufsiz; } while (status == Z_OK); return -1; } which may be more in line with how we read data from loose objects. > +int parse_sha1_header_extended(const char *hdr, struct object_info *oi, > +int flags) Unless we are taking advantage of the fact that MSB is special in signed integral types, I would prefer to see us use an unsigned type to store these bits in a variable of an integral type. That would let the readers assume that they have fewer tricky things possibly going on in the code (also see the footnote to $gmane/263751). > @@ -1652,12 +1674,4
Re: Promoting Git developers
On Mon, Mar 16, 2015 at 6:06 PM, Stefan Beller wrote: > On Mon, Mar 16, 2015 at 2:20 AM, David Kastrup wrote: >> >> "Git Annotate"? > > "Git Praise" as opposed to blame? > "Git Who" as a pun on the subcommand structure which doesn't always > follows grammar? Yeah these suggestions above are nice, thanks for them, but "Git Rev News" also look a bit like "git rev-list" and "git rev-parse" which are plumbing Git commands, so it gives a somewhat "hardcore" look to the news letter which I like. Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Promoting Git developers
On Tue, Mar 17, 2015 at 10:43 AM, Thomas Ferris Nicolaisen wrote: > On Sun, Mar 15, 2015 at 9:46 AM, Christian Couder > wrote: >> >> I wrote something about a potential Git Rev News news letter: >> >> https://github.com/git/git.github.io/pull/15 >> > > I would love to have/use something like this in the GitMinutes > podcast. Perhaps in addition to the very random interview format that > I have now, I could do a more regular episodes about Git news, where I > incorporate this. Yeah, no problem. > I also volunteer to help with the production, if you'll allow list > lurkers like myself to contribute ;) Yes of course, you are very welcome! Peff, could you also give the rights on the repo to Thomas? Thanks both, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git submodule: update=!command
On Tue, Mar 17, 2015 at 03:28:57PM -0400, Ryan Lortie wrote: > The first is a question about git's basic policy with respect to things > like this. I hope that it's safe to assume that running 'git' commands > on repositories downloaded from potentially-hostile places will never > result in the authors of those repositories being able to run code on my > machine. Definitely, our policy is that downloading a git repository should not result in arbitrary code being run. If there is a case of that, it would be a serious security bug. I am not an expert on submodules, but I think the security module there is: 1. You can do whatever you like in submodule.*.update entries in .git/config, including arbitrary code. Nobody but the user can write to it. 2. The submodule code may migrate entries from .gitmodules into .git/config, but does so with an allow-known-good whitelist (see git-submodule.sh lines 622-637). So AFAICT there's no bug here, and the system is working as designed. It might be worth mentioning that restriction in the submodule documentation, if only to prevent non-malicious people from wondering why adding "!foo" does not work in .gitmodules. > If that is true then, the second request would be to spell this out more > explicitly in the relevant documentation. I'm happy to write a patch to > do that, if it is deemed appropriate. Yeah, spelling out the security model more explicitly would be good. There is also some subtlety around hooks. Doing: git clone user@host:/path/to/repo.git local should never run code controlled by "repo.git" as "user@host". But doing: ssh user@host 'cd /path/to/repo.git && git log' will respect the .git/config in repo.git, which may include arbitrary commands. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] rev-list: refuse --first-parent combined with --bisect
On Tue, Mar 17, 2015 at 7:33 PM, Junio C Hamano wrote: > Christian Couder writes: > >> On Mon, Mar 16, 2015 at 10:05 PM, Junio C Hamano wrote: >> >>> However, you can say "git bisect bad " (and "git bisect good >>> " for that matter) on a rev that is unrelated to what the >>> current bisection state is. E.g. after you mark the child of 8 as >>> "bad", the bisected graph would become >>> >>>G...1---2---3---4---6---8---B >>> >>> and you would be offered to test somewhere in the middle, say, 4. >>> But it is perfectly OK for you to respond with "git bisect bad 7", >>> if you know 7 is bad. >>> >>> I _think_ the current code blindly overwrites the "bad" pointer, >>> making the bisection state into this graph if you do so. >>> >>>G...1---2---3---4 >>> \ >>> 5---B >> >> Yes, we keep only one "bad" pointer. >> >>> This is very suboptimal. The side branch 4-to-7 could be much >>> longer than the original trunk 4-to-the-tip, in which case we would >>> have made the suspect space _larger_, not smaller. >> >> Yes, but the user is supposed to not change the "bad" pointer for no >> good reason. > > That is irrelevant, no? Nobody is questioning that the user is > supposed to judge if a commit is "good" or "bad" correctly. So if there is already a bad commit and the user gives another bad commit, that means that the user knows that it will replace the existing bad commit with the new one and that it's done for this purpose. > And nobody sane is dreaming that "Git could do better and detect > user's mistakes when the user says 'bad' for a commit that is > actually 'good'"; if Git can do that, then it should be able to do > the bisect without any user input (including "bisect run") at all > ;-). > >>> We certainly should be able to take advantage of the fact that the >>> current "bad" commit (i.e. the child of 8) and the newly given "bad" >>> commit (i.e. 7) are both known to be bad and mark 4 as "bad" instead >>> when that happens, instead of doing the suboptimal thing the code >>> currently does. >> >> Yeah, we could do that, but we would have to allow it only if a >> special option is passed on the command line, for example: >> git bisect bad --alternate > > I am not quite sure if I am correctly getting what you meant to say, > but if you meant "only when --alternate is given, we should do the > merge-base thing; we should keep losing the current 'bad' and > replace it with the new one without the --alternate option", I would > see that as an exercise of a bad taste. What I wanted to say is that if we change "git bisect bad ", so that now it means "add a new bad commit" instead of the previous "replace the current bad commit, if any, with this one", then experienced users might see that change as a regression in the user interface and it might even break scripts. That's why I suggested to use a new option to mean "add a new bad commit", though --alternate might not be the best name for this option. > Because the merge-base thing is using both the current and the new > one, such a use is not "alternate" in the first place. > > If the proposal were "with a new option, the user can say 'oh, I > made a mistake earlier and said that a commit that is not bad as > 'bad'. Let me replace the commit currently marked as 'bad' with > this one.", I would find it very sensible, actually. What I find sensible is to not break the semantics of the current interface. > I can see that such an operation can be called "alternate", but > "--fix" might be shorter-and-sweeter-and-to-the-point. > > In the "normal" case, the commit we offer the user to check (and > respond with "git bisect bad" without any commit parameter) is > always an ancestor of the current 'bad', so the merge-base with > 'bad' and the commit that was just checked would always be the > current commit. Using the merge-base thing will be transparent to > the end users in the normal case, and when the user has off-line > knowledge that some other commit that is not an ancestor of the > current 'bad' commit is bad, the merge-base thing will give a better > behaviour than the current implementation that blindly replaces. Yes, I agree that it could be an improvement to make it possible for the user to specify another bad commit. I just think it should be done with a new option if there is already a bad commit... >> and/or we could make "git bisect bad" accept any number of bad >> commitishs. ... or by allowing any number of bad commits after "git bisect bad". > Yes, that is exactly what I meant. > > The way I understand the Philip's point is that the user may have > a-priori knowledge that a breakage from the same cause appears in > both tips of these branches. In such a case, we can start bisection > after marking the merge-base of two 'bad' commits, e.g. 4 in the > illustration in the message you are responding to, instead of > including 5, 6, and 8 in the suspect set. Yeah, I agree that we can do better in
Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
On 03/17/2015 07:48 PM, Junio C Hamano wrote: > Michael Haggerty writes: > >> When I looked around, I found scores of sites that call atoi(), >> strtoul(), and strtol() carelessly. And it's understandable. Calling >> any of these functions safely is so much work as to be completely >> impractical in day-to-day code. >> >> git-compat-util.h has two functions, strtoul_ui() and strtol_i(), that >> try to make parsing integers a little bit easier. > > Exactly. They were introduced to prevent sloppy callers from > passing NULL to the &end parameter to underlying strtoul(3). The > first thing that came to my mind while reading your message up to > this point was "why not use them more, tighten them, adding > different variants if necessary, instead of introducing an entirely > new set of functions in a new namespace?" Here were my thoughts: * I wanted to change the interface to look less like strtol()/strtoul(), so it seemed appropriate to make the names more dissimilar. * The functions seemed long enough that they shouldn't be inline, and I wanted to put them in their own module rather than put them in git-compat-util.h. * It wasn't obvious to me how to generalize the names, strtoul_ui() and strtol_i(), to the eight functions that I wanted. That being said, I'm not married to the names. Suggestions are welcome--but we would need names for 8 functions, not 4 [1]. Michael > For example: > >> * Am I making callers too strict? In many cases where a positive >> integer is expected (e.g., "--abbrev="), I have been replacing >> code like >> >> result = strtoul(s, NULL, 10); >> >> with >> >> if (convert_i(s, 10, &result)) >> die("..."); > > I think strictness would be good and those who did "--abbrev=' 20'" > deserve what they get from the additional strictness, but > > if (strtol_i(s, 10, &result)) > > would have been more familiar. [1] It could be that when we're done, it will turn out that some of the eight variants are not needed. If so, we can delete them then. -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git submodule: update=!command
karaj, 'man git-submodule' contains mention (in one place) that: Setting the key submodule.$name.update to !command will cause command to be run. This is not documented in 'man gitmodules' (which documents the other possible values for the 'update' key) nor in 'man git-config' which also mentions the 'update' key (but refers readers to the two other pages). This feature is scary. The idea that arbitrary code could be executed on my machine when I run innocent-looking git commands, based on the content of the .gitmodules file is enough to give pause to anybody. Fortunately, it seems that (for now?) this is not really the case. 'git submodule init' will copy the values of the 'update' key from .gitmodules to your local git config, but only if they are one of "none", "checkout", "merge" or "rebase". So, I guess I'm asking two things. The first is a question about git's basic policy with respect to things like this. I hope that it's safe to assume that running 'git' commands on repositories downloaded from potentially-hostile places will never result in the authors of those repositories being able to run code on my machine. If that is true then, the second request would be to spell this out more explicitly in the relevant documentation. I'm happy to write a patch to do that, if it is deemed appropriate. Thanks in advance. Cheers -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] t5312: test object deletion code paths in a corrupted repository
On Tue, Mar 17, 2015 at 07:34:02PM +0100, Johannes Sixt wrote: > Am 17.03.2015 um 08:28 schrieb Jeff King: > >+test_expect_success 'create history reachable only from a bogus-named ref' ' > >+test_tick && git commit --allow-empty -m master && > >+base=$(git rev-parse HEAD) && > >+test_tick && git commit --allow-empty -m bogus && > >+bogus=$(git rev-parse HEAD) && > >+git cat-file commit $bogus >saved && > >+echo $bogus >.git/refs/heads/bogus:name && > > This causes headaches on Windows: It creates an empty file, named "bogus", > with all the data diverted to the alternate data stream named "name". > Needless to say that this... Ah, yes. Windows. Our usual workaround would be to put it straight into packed-refs, but in this case, the test really does need the badly named ref in the file system. But... > >+test_expect_success 'clean up bogus ref' ' > >+rm .git/refs/heads/bogus:name > >+' > > does not remove the file "bogus", but only the alternate data stream (if at > all---I forgot to check). How about .git/refs/heads/bogus..nam.e? Yes, that works. The colon is what originally brought my attention to this case, but anything that fails git-check-ref-format is fine. I've squashed this in: diff --git a/t/t5312-prune-corruption.sh b/t/t5312-prune-corruption.sh index 167031e..1001a69 100755 --- a/t/t5312-prune-corruption.sh +++ b/t/t5312-prune-corruption.sh @@ -21,7 +21,7 @@ test_expect_success 'create history reachable only from a bogus-named ref' ' test_tick && git commit --allow-empty -m bogus && bogus=$(git rev-parse HEAD) && git cat-file commit $bogus >saved && - echo $bogus >.git/refs/heads/bogus:name && + echo $bogus >.git/refs/heads/bogus..name && git reset --hard HEAD^ ' @@ -47,7 +47,7 @@ test_expect_failure 'destructive repack keeps packed object' ' # subsequent tests will have different corruptions test_expect_success 'clean up bogus ref' ' - rm .git/refs/heads/bogus:name + rm .git/refs/heads/bogus..name ' test_expect_success 'create history with missing tip commit' ' I assumed the final "." in your example wasn't significant (it is not to git), but let me know if I've run afoul of another weird restriction. :) Thanks. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
Michael Haggerty writes: > When I looked around, I found scores of sites that call atoi(), > strtoul(), and strtol() carelessly. And it's understandable. Calling > any of these functions safely is so much work as to be completely > impractical in day-to-day code. > > git-compat-util.h has two functions, strtoul_ui() and strtol_i(), that > try to make parsing integers a little bit easier. Exactly. They were introduced to prevent sloppy callers from passing NULL to the &end parameter to underlying strtoul(3). The first thing that came to my mind while reading your message up to this point was "why not use them more, tighten them, adding different variants if necessary, instead of introducing an entirely new set of functions in a new namespace?" For example: > * Am I making callers too strict? In many cases where a positive > integer is expected (e.g., "--abbrev="), I have been replacing > code like > > result = strtoul(s, NULL, 10); > > with > > if (convert_i(s, 10, &result)) > die("..."); I think strictness would be good and those who did "--abbrev=' 20'" deserve what they get from the additional strictness, but if (strtol_i(s, 10, &result)) would have been more familiar. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GSoC] Applying for conversion scripts to builtins
Johannes Schindelin writes: > Therefore, I would wager a bet that just the mere conversion of a > shell script into even a primitive `run_command()`-based builtin would > help performance on Windows in a noticeable manner. As you correctly allege, if a patch rewrote a shell-scripted porcelain by using series of run_command() and doing nothing else, I would have asked "is that an improvement?", without knowing that. > Of course, it would be *even nicer* to avoid the spawning altogether. Yeah, that, too ;-) > The biggest benefit of avoiding needless parsing, however, is not > performance. It is avoiding quoting issues. This is particularly so on > Windows, where Git is sometimes called from outside a shell > environment, where we have to deal with inconsistent quoting because > it is every Windows program's own job to parse the command-line, > including the quoting. > > Concrete example: on Windows, we have file locking issues because > files that are in use cannot be deleted. For that reason, we have > Windows-specific code that is "nice" by trying harder to delete files, > giving programs a little time to let their locks go. This locking > issue happens also when a virus scanner "uses"... These are definitely good advices from the area expert. Thanks for a bunch of good input. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] rev-list: refuse --first-parent combined with --bisect
Christian Couder writes: > On Mon, Mar 16, 2015 at 10:05 PM, Junio C Hamano wrote: > >> However, you can say "git bisect bad " (and "git bisect good >> " for that matter) on a rev that is unrelated to what the >> current bisection state is. E.g. after you mark the child of 8 as >> "bad", the bisected graph would become >> >>G...1---2---3---4---6---8---B >> >> and you would be offered to test somewhere in the middle, say, 4. >> But it is perfectly OK for you to respond with "git bisect bad 7", >> if you know 7 is bad. >> >> I _think_ the current code blindly overwrites the "bad" pointer, >> making the bisection state into this graph if you do so. >> >>G...1---2---3---4 >> \ >> 5---B > > Yes, we keep only one "bad" pointer. > >> This is very suboptimal. The side branch 4-to-7 could be much >> longer than the original trunk 4-to-the-tip, in which case we would >> have made the suspect space _larger_, not smaller. > > Yes, but the user is supposed to not change the "bad" pointer for no > good reason. That is irrelevant, no? Nobody is questioning that the user is supposed to judge if a commit is "good" or "bad" correctly. And nobody sane is dreaming that "Git could do better and detect user's mistakes when the user says 'bad' for a commit that is actually 'good'"; if Git can do that, then it should be able to do the bisect without any user input (including "bisect run") at all ;-). >> We certainly should be able to take advantage of the fact that the >> current "bad" commit (i.e. the child of 8) and the newly given "bad" >> commit (i.e. 7) are both known to be bad and mark 4 as "bad" instead >> when that happens, instead of doing the suboptimal thing the code >> currently does. > > Yeah, we could do that, but we would have to allow it only if a > special option is passed on the command line, for example: > git bisect bad --alternate I am not quite sure if I am correctly getting what you meant to say, but if you meant "only when --alternate is given, we should do the merge-base thing; we should keep losing the current 'bad' and replace it with the new one without the --alternate option", I would see that as an exercise of a bad taste. Because the merge-base thing is using both the current and the new one, such a use is not "alternate" in the first place. If the proposal were "with a new option, the user can say 'oh, I made a mistake earlier and said that a commit that is not bad as 'bad'. Let me replace the commit currently marked as 'bad' with this one.", I would find it very sensible, actually. I can see that such an operation can be called "alternate", but "--fix" might be shorter-and-sweeter-and-to-the-point. In the "normal" case, the commit we offer the user to check (and respond with "git bisect bad" without any commit parameter) is always an ancestor of the current 'bad', so the merge-base with 'bad' and the commit that was just checked would always be the current commit. Using the merge-base thing will be transparent to the end users in the normal case, and when the user has off-line knowledge that some other commit that is not an ancestor of the current 'bad' commit is bad, the merge-base thing will give a better behaviour than the current implementation that blindly replaces. > and/or we could make "git bisect bad" accept any number of bad > commitishs. Yes, that is exactly what I meant. The way I understand the Philip's point is that the user may have a-priori knowledge that a breakage from the same cause appears in both tips of these branches. In such a case, we can start bisection after marking the merge-base of two 'bad' commits, e.g. 4 in the illustration in the message you are responding to, instead of including 5, 6, and 8 in the suspect set. You need to be careful, though. An obvious pitfall is what you should do when there is a criss-cross merge. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] t5312: test object deletion code paths in a corrupted repository
Am 17.03.2015 um 08:28 schrieb Jeff King: +test_expect_success 'create history reachable only from a bogus-named ref' ' + test_tick && git commit --allow-empty -m master && + base=$(git rev-parse HEAD) && + test_tick && git commit --allow-empty -m bogus && + bogus=$(git rev-parse HEAD) && + git cat-file commit $bogus >saved && + echo $bogus >.git/refs/heads/bogus:name && This causes headaches on Windows: It creates an empty file, named "bogus", with all the data diverted to the alternate data stream named "name". Needless to say that this... +test_expect_success 'clean up bogus ref' ' + rm .git/refs/heads/bogus:name +' does not remove the file "bogus", but only the alternate data stream (if at all---I forgot to check). How about .git/refs/heads/bogus..nam.e? -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: FW: Time Date on file
On di, 2015-03-17 at 13:19 -0400, Patrice Monette wrote: > > I did not find any config, but is there one configuration somewhere to > preserve the real date creation by author somewhere? No, there is no such configuration as git does not track timestamps of files. -- Dennis Kaarsemaker www.kaarsemaker.net -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff
Duy Nguyen writes: > On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote: >> The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a >> paths after generating trees, 2012-12-16), which was a fix to an >> earlier bug where a cache-tree written out of an index with i-t-a >> entries had incorrect information and still claimed it is fully >> valid after write-tree rebuilt it. The test probably should add >> another path without i-t-a bit, run the same "diff --cached" with >> updated expectation before write-tre, and run the "diff --cached" >> again to make sure it produces a result that match the updated >> expectation. > > Would adding another non-i-t-a entry help? Before this patch > "diff --cached" after write-tree shows the i-t-a entry only when > eec3e7e4 is applied. But with this patch we don't show i-t-a entry any > more, before or after write-tree, eec3e7e4 makes no visible difference. > > We could even revert eec3e7e4 and the outcome of "diff --cached" would > be the same because we just sort of move the "invalidation" part from > cache-tree to do_oneway_diff(). Not invalidating would speed up "diff > --cached" when i-t-a entries are present. Still it may be a good idea > to invalidate i-t-a paths to be on the safe side. Perhaps a patch like > this to resurrect the test? My unerstanding of what eec3e7e4 (cache-tree: invalidate i-t-a paths after generating trees, 2012-12-16) fixed was that in this sequence: - You prepare an index. - You write-tree out of the index, which involves: - updating the cache-tree to match the shape of the resulting from writing the index out. - create tree objects matching all levels of the cache-tree as needed on disk. - report the top-level tree object name - run "diff-index --cached", which can and will take advantage of the fact that everything in a subtree below a known-to-be-valid cache-tree entry does not have to be checked one-by-one. If a cache-tree says "everything under D/ in the index would hash to tree object T" and the HEAD has tree object T at D/, then the diff machinery will bypass the entire section in the index under D/, which is a valid optimization. However, when there is an i-t-a entry, we excluded that entry from the tree object computation, its presence did not contribute to the tree object name, but still marked the cache-tree entries that contain it as valid by mistake. This old bug was what the commit fixed, so an invocation of "diff --cached" after a write-tree, even if the index contains an i-t-a entry, will not see cache-tree entries that are marked valid when they are not. Instead, "diff --cached" will bypass the optimization and makes comparison one-by-one for the index entries. So reverting the fix obviously is not the right thing to do. If the tests show different results from two invocations of "diff --cached" with your patch applied, there is something that is broken by your patch, because the index and the HEAD does not change across write-tree in that test. If on the other hand the tests show the same result from these two "diff --cached" and the result is different from what the test expects, that means your patch changed the world order, i.e. an i-t-a entry used to be treated as if it were adding an empty blob to the index but it is now treated as non-existent, then that is a good thing and the only thing we need to update is what the test expects. I am guessing that instead of expecting dir/bar to be shown, it now should expect no output? Does adding an non-i-t-a entry help? It does not hurt, and it makes the test uses a non-empty output, making its effect more visible, which may or may not count as helping. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
FW: Time Date on file
I have address the following question to github support and the refer me to git documentation but I`m not too familiar with Git right now. Here`s my question: I did not find any config, but is there one configuration somewhere to preserve the real date creation by author somewhere? Thanks in advance, Patrice Monette IT / Programmer ; TI / Programmeur E-mail: patr...@steltec.ca Phone: 450-971-5995Ext: 2221 Fax: 450-971-1865 -Original Message- From: James Dennes (GitHub Staff) [mailto:supp...@github.com] Sent: February-16-15 4:46 PM To: Patrice Monette Subject: Re: Time Date on file Hi there, > I did not find any config, but is there one configuration somewhere to > preserve the real date creation by author somewhere? Not that I'm aware of, however this would be something specific to Git, not just GitHub. You'd need to check the Git documentation. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] rev-list: refuse --first-parent combined with --bisect
On Mon, Mar 16, 2015 at 10:05 PM, Junio C Hamano wrote: > "Philip Oakley" writes: > >> From: "Junio C Hamano" >> >>> Hence, if you have a history that looks like this: >>> >>> >>> G...1---2---3---4---6---8---B >>>\ >>> 5---7---B >>> >>> it follows that 4 must also be "bad". It used to be good long time >>> ago somewhere before 1, and somewhere along way on the history, >>> there was a single breakage event that we are hunting for. That >>> single event cannot be 5, 6, 7 or 8 because breakage at say 5 would >>> not explain why the tip of the upper branch is broken---its breakage >>> has no way to propagate there. The breakage must have happened at 4 >>> or before that commit. >> >> Is it not worth at least confirming the assertion that 4 is bad before >> proceding, or at least an option to confirm that in complex scenarios >> where the fault may be devious. > > That raises a somewhat interesting tangent. > > Christian seems to be forever interested in bisect, so I'll add him > to the Cc list ;-) > > There is no way to give multiple "bad" from the command line. You > can say "git bisect start rev rev rev..." but that gives only one > bad and everything else is good. And once you specify one of the > above two bad ones (say, the child of 8), then we will not even > offer the other one (i.e. the child of 7) as a candidate to be > tested. So in that sense, "confirm that 4 is bad before proceeding" > is a moot point. > > However, you can say "git bisect bad " (and "git bisect good > " for that matter) on a rev that is unrelated to what the > current bisection state is. E.g. after you mark the child of 8 as > "bad", the bisected graph would become > >G...1---2---3---4---6---8---B > > and you would be offered to test somewhere in the middle, say, 4. > But it is perfectly OK for you to respond with "git bisect bad 7", > if you know 7 is bad. > > I _think_ the current code blindly overwrites the "bad" pointer, > making the bisection state into this graph if you do so. > >G...1---2---3---4 > \ > 5---B Yes, we keep only one "bad" pointer. > This is very suboptimal. The side branch 4-to-7 could be much > longer than the original trunk 4-to-the-tip, in which case we would > have made the suspect space _larger_, not smaller. Yes, but the user is supposed to not change the "bad" pointer for no good reason. For example maybe a mistake was made and the first commit marked as "bad" was not actually bad. > We certainly should be able to take advantage of the fact that the > current "bad" commit (i.e. the child of 8) and the newly given "bad" > commit (i.e. 7) are both known to be bad and mark 4 as "bad" instead > when that happens, instead of doing the suboptimal thing the code > currently does. Yeah, we could do that, but we would have to allow it only if a special option is passed on the command line, for example: git bisect bad --alternate and/or we could make "git bisect bad" accept any number of bad commitishs. That could give additional bonus points to the GSoC student who would implement it :-) Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/14] opt_arg(): report errors parsing option values
If an argument is there, but it can't be parsed as a non-positive number, then die() rather than returning 0. Signed-off-by: Michael Haggerty --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 77c7acb..03cdabf 100644 --- a/diff.c +++ b/diff.c @@ -3368,7 +3368,7 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va if (!c) return 1; /* optional argument was missing */ if (convert_i(arg, 10, val)) - return 0; + die("The value for -%c must be a non-negative integer", arg_short); return 1; } if (c != '-') @@ -3381,7 +3381,7 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va if (!*eq) return 1; /* '=' and optional argument were missing */ if (convert_i(eq + 1, 10, val)) - return 0; + die("The value for --%s must be a non-negative integer", arg_long); return 1; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/14] handle_revision_opt(): use convert_ui() when handling "--abbrev="
This adds error checking, where previously there was none. It also disallows '+' and '-', leading whitespace, and trailing junk. Signed-off-by: Michael Haggerty --- revision.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/revision.c b/revision.c index 4908e66..a6f7c2e 100644 --- a/revision.c +++ b/revision.c @@ -1963,7 +1963,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } else if (!strcmp(arg, "--abbrev")) { revs->abbrev = DEFAULT_ABBREV; } else if (skip_prefix(arg, "--abbrev=", &arg)) { - revs->abbrev = strtoul(arg, NULL, 10); + if (convert_ui(arg, 10, &revs->abbrev)) + die("--abbrev requires a non-negative integer argument"); if (revs->abbrev < MINIMUM_ABBREV) revs->abbrev = MINIMUM_ABBREV; else if (revs->abbrev > 40) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/14] diff_opt_parse(): use convert_i() when handling "-l"
die() with an error message if the argument is not a non-negative integer. This change tightens up parsing: '+' and '-', leading whitespace, and trailing junk are all disallowed now. Signed-off-by: Michael Haggerty --- diff.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/diff.c b/diff.c index bc1e3c3..be389ae 100644 --- a/diff.c +++ b/diff.c @@ -3799,7 +3799,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac) else if (!strcmp(arg, "-z")) options->line_termination = 0; else if ((argcount = short_opt('l', av, &optarg))) { - options->rename_limit = strtoul(optarg, NULL, 10); + if (convert_i(optarg, 10, &options->rename_limit)) + die("-l requires a non-negative integer argument"); return argcount; } else if ((argcount = short_opt('S', av, &optarg))) { -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/14] opt_arg(): val is always non-NULL
opt_arg() is never called with val set to NULL, so remove the code for handling that eventuality (which anyway wasn't handled consistently in the function). Signed-off-by: Michael Haggerty --- diff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/diff.c b/diff.c index a350677..6e3f498 100644 --- a/diff.c +++ b/diff.c @@ -3367,7 +3367,7 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va c = *++arg; if (!c) return 1; - if (val && isdigit(c)) { + if (isdigit(c)) { char *end; int n = strtoul(arg, &end, 10); if (*end) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/14] handle_revision_opt(): use convert_i() when handling "-"
This tightens up the parsing a bit: * Leading whitespace is no longer allowed * '+' and '-' are no longer allowed It also removes the need to check separately that max_count is non-negative. Signed-off-by: Michael Haggerty --- revision.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/revision.c b/revision.c index 25838fc..4908e66 100644 --- a/revision.c +++ b/revision.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "numparse.h" #include "tag.h" #include "blob.h" #include "tree.h" @@ -1709,8 +1710,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg return argcount; } else if ((*arg == '-') && isdigit(arg[1])) { /* accept -, like traditional "head" */ - if (strtol_i(arg + 1, 10, &revs->max_count) < 0 || - revs->max_count < 0) + if (convert_i(arg + 1, 10, &revs->max_count)) die("'%s': not a non-negative integer", arg + 1); revs->no_walk = 0; } else if (!strcmp(arg, "-n")) { -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/14] diff_opt_parse(): use convert_i() when handling --abbrev=
die() with an error message if the argument is not a non-negative integer. This change tightens up parsing: '+' and '-', leading whitespace, and trailing junk are all disallowed now. Signed-off-by: Michael Haggerty --- diff.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/diff.c b/diff.c index be389ae..1cc5428 100644 --- a/diff.c +++ b/diff.c @@ -3830,7 +3830,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac) else if (!strcmp(arg, "--abbrev")) options->abbrev = DEFAULT_ABBREV; else if (skip_prefix(arg, "--abbrev=", &arg)) { - options->abbrev = strtoul(arg, NULL, 10); + if (convert_i(arg, 10, &options->abbrev)) + die("--abbrev requires an integer argument"); if (options->abbrev < MINIMUM_ABBREV) options->abbrev = MINIMUM_ABBREV; else if (40 < options->abbrev) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/14] opt_arg(): use convert_i() in implementation
This shortens the code and avoids the old code's careless truncation from unsigned long to int. Signed-off-by: Michael Haggerty --- diff.c | 28 1 file changed, 8 insertions(+), 20 deletions(-) diff --git a/diff.c b/diff.c index 6e3f498..77c7acb 100644 --- a/diff.c +++ b/diff.c @@ -3366,16 +3366,10 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va if (c == arg_short) { c = *++arg; if (!c) - return 1; - if (isdigit(c)) { - char *end; - int n = strtoul(arg, &end, 10); - if (*end) - return 0; - *val = n; - return 1; - } - return 0; + return 1; /* optional argument was missing */ + if (convert_i(arg, 10, val)) + return 0; + return 1; } if (c != '-') return 0; @@ -3384,16 +3378,10 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va len = eq - arg; if (!len || strncmp(arg, arg_long, len)) return 0; - if (*eq) { - int n; - char *end; - if (!isdigit(*++eq)) - return 0; - n = strtoul(eq, &end, 10); - if (*end) - return 0; - *val = n; - } + if (!*eq) + return 1; /* '=' and optional argument were missing */ + if (convert_i(eq + 1, 10, val)) + return 0; return 1; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/14] numparse: new module for parsing integral numbers
Implement wrappers for strtol() and strtoul() that are safer and more convenient to use. Signed-off-by: Michael Haggerty --- Makefile | 1 + numparse.c | 180 + numparse.h | 207 + 3 files changed, 388 insertions(+) create mode 100644 numparse.c create mode 100644 numparse.h diff --git a/Makefile b/Makefile index 44f1dd1..6c0cfcc 100644 --- a/Makefile +++ b/Makefile @@ -732,6 +732,7 @@ LIB_OBJS += notes.o LIB_OBJS += notes-cache.o LIB_OBJS += notes-merge.o LIB_OBJS += notes-utils.o +LIB_OBJS += numparse.o LIB_OBJS += object.o LIB_OBJS += pack-bitmap.o LIB_OBJS += pack-bitmap-write.o diff --git a/numparse.c b/numparse.c new file mode 100644 index 000..90b44ce --- /dev/null +++ b/numparse.c @@ -0,0 +1,180 @@ +#include "git-compat-util.h" +#include "numparse.h" + +#define NUM_NEGATIVE (1 << 16) + + +static int parse_precheck(const char *s, unsigned int *flags) +{ + const char *number; + + if (isspace(*s)) { + if (!(*flags & NUM_LEADING_WHITESPACE)) + return -NUM_LEADING_WHITESPACE; + do { + s++; + } while (isspace(*s)); + } + + if (*s == '+') { + if (!(*flags & NUM_PLUS)) + return -NUM_PLUS; + number = s + 1; + *flags &= ~NUM_NEGATIVE; + } else if (*s == '-') { + if (!(*flags & NUM_MINUS)) + return -NUM_MINUS; + number = s + 1; + *flags |= NUM_NEGATIVE; + } else { + number = s; + *flags &= ~NUM_NEGATIVE; + } + + if (!(*flags & NUM_BASE_SPECIFIER)) { + int base = *flags & NUM_BASE_MASK; + if (base == 0) { + /* This is a pointless combination of options. */ + die("BUG: base=0 specified without NUM_BASE_SPECIFIER"); + } else if (base == 16 && starts_with(number, "0x")) { + /* +* We want to treat this as zero terminated by +* an 'x', whereas strtol()/strtoul() would +* silently eat the "0x". We accomplish this +* by treating it as a base 10 number: +*/ + *flags = (*flags & ~NUM_BASE_MASK) | 10; + } + } + return 0; +} + +int parse_l(const char *s, unsigned int flags, long *result, char **endptr) +{ + long l; + const char *end; + int err = 0; + + err = parse_precheck(s, &flags); + if (err) + return err; + + /* +* Now let strtol() do the heavy lifting: +*/ + errno = 0; + l = strtol(s, (char **)&end, flags & NUM_BASE_MASK); + if (errno) { + if (errno == ERANGE) { + if (!(flags & NUM_SATURATE)) + return -NUM_SATURATE; + } else { + return -NUM_OTHER_ERROR; + } + } + if (end == s) + return -NUM_NO_DIGITS; + + if (*end && !(flags & NUM_TRAILING)) + return -NUM_TRAILING; + + /* Everything was OK */ + *result = l; + if (endptr) + *endptr = (char *)end; + return 0; +} + +int parse_ul(const char *s, unsigned int flags, +unsigned long *result, char **endptr) +{ + unsigned long ul; + const char *end; + int err = 0; + + err = parse_precheck(s, &flags); + if (err) + return err; + + /* +* Now let strtoul() do the heavy lifting: +*/ + errno = 0; + ul = strtoul(s, (char **)&end, flags & NUM_BASE_MASK); + if (errno) { + if (errno == ERANGE) { + if (!(flags & NUM_SATURATE)) + return -NUM_SATURATE; + } else { + return -NUM_OTHER_ERROR; + } + } + if (end == s) + return -NUM_NO_DIGITS; + + /* +* strtoul(), perversely, accepts negative numbers, converting +* them to the positive number with the same bit pattern. We +* don't ever want that. +*/ + if ((flags & NUM_NEGATIVE) && ul) { + if (!(flags & NUM_SATURATE)) + return -NUM_SATURATE; + ul = 0; + } + + if (*end && !(flags & NUM_TRAILING)) + return -NUM_TRAILING; + + /* Everything was OK */ + *result = ul; + if (endptr) + *endptr = (char *)end; + return 0; +} + +int parse_i(const char *s, unsigned int flags, int *result, char **endptr) +{ + long l; + int err; + char *end; + + err = parse_l(s, flags, &l, &end); +
[PATCH 12/14] opt_arg(): simplify pointer handling
Increment "arg" when a character is consumed, not just before consuming the next character. Signed-off-by: Michael Haggerty --- diff.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index 03cdabf..bc1e3c3 100644 --- a/diff.c +++ b/diff.c @@ -3358,14 +3358,13 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va char c, *eq; int len; - if (*arg != '-') + if (*arg++ != '-') return 0; - c = *++arg; + c = *arg++; if (!c) return 0; if (c == arg_short) { - c = *++arg; - if (!c) + if (!*arg) return 1; /* optional argument was missing */ if (convert_i(arg, 10, val)) die("The value for -%c must be a non-negative integer", arg_short); @@ -3373,7 +3372,6 @@ static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *va } if (c != '-') return 0; - arg++; eq = strchrnul(arg, '='); len = eq - arg; if (!len || strncmp(arg, arg_long, len)) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/14] builtin_diff(): detect errors when parsing --unified argument
The previous code used strtoul() without any checks that it succeeded. Instead use convert_l(), in strict mode, and die() if there is an error. This tightens up the parsing: * Leading whitespace is no longer allowed * '+' and '-' are no longer allowed * Trailing junk is not allowed Signed-off-by: Michael Haggerty --- diff.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/diff.c b/diff.c index abc32c8..a350677 100644 --- a/diff.c +++ b/diff.c @@ -2,6 +2,7 @@ * Copyright (C) 2005 Junio C Hamano */ #include "cache.h" +#include "numparse.h" #include "quote.h" #include "diff.h" #include "diffcore.h" @@ -2393,12 +2394,12 @@ static void builtin_diff(const char *name_a, xecfg.flags |= XDL_EMIT_FUNCCONTEXT; if (pe) xdiff_set_find_func(&xecfg, pe->pattern, pe->cflags); - if (!diffopts) - ; - else if (skip_prefix(diffopts, "--unified=", &v)) - xecfg.ctxlen = strtoul(v, NULL, 10); - else if (skip_prefix(diffopts, "-u", &v)) - xecfg.ctxlen = strtoul(v, NULL, 10); + if (diffopts + && (skip_prefix(diffopts, "--unified=", &v) || + skip_prefix(diffopts, "-u", &v))) { + if (convert_l(v, 10, &xecfg.ctxlen)) + die("--unified argument must be a non-negative integer"); + } if (o->word_diff) init_diff_words_data(&ecbdata, o, one, two); xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata, -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/14] handle_revision_opt(): use skip_prefix() in many places
This reduces the need for "magic numbers". Signed-off-by: Michael Haggerty --- revision.c | 59 ++- 1 file changed, 30 insertions(+), 29 deletions(-) diff --git a/revision.c b/revision.c index 66520c6..25838fc 100644 --- a/revision.c +++ b/revision.c @@ -1719,8 +1719,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->max_count = atoi(argv[1]); revs->no_walk = 0; return 2; - } else if (starts_with(arg, "-n")) { - revs->max_count = atoi(arg + 2); + } else if (skip_prefix(arg, "-n", &arg)) { + revs->max_count = atoi(arg); revs->no_walk = 0; } else if ((argcount = parse_long_opt("max-age", argv, &optarg))) { revs->max_age = atoi(optarg); @@ -1779,15 +1779,15 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } else if (!strcmp(arg, "--author-date-order")) { revs->sort_order = REV_SORT_BY_AUTHOR_DATE; revs->topo_order = 1; - } else if (starts_with(arg, "--early-output")) { + } else if (skip_prefix(arg, "--early-output", &arg)) { int count = 100; - switch (arg[14]) { + switch (*arg++) { case '=': - count = atoi(arg+15); + count = atoi(arg); /* Fallthrough */ case 0: revs->topo_order = 1; - revs->early_output = count; + revs->early_output = count; } } else if (!strcmp(arg, "--parents")) { revs->rewrite_parents = 1; @@ -1804,12 +1804,12 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->min_parents = 2; } else if (!strcmp(arg, "--no-merges")) { revs->max_parents = 1; - } else if (starts_with(arg, "--min-parents=")) { - revs->min_parents = atoi(arg+14); + } else if (skip_prefix(arg, "--min-parents=", &arg)) { + revs->min_parents = atoi(arg); } else if (starts_with(arg, "--no-min-parents")) { revs->min_parents = 0; - } else if (starts_with(arg, "--max-parents=")) { - revs->max_parents = atoi(arg+14); + } else if (skip_prefix(arg, "--max-parents=", &arg)) { + revs->max_parents = atoi(arg); } else if (starts_with(arg, "--no-max-parents")) { revs->max_parents = -1; } else if (!strcmp(arg, "--boundary")) { @@ -1891,26 +1891,27 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->verbose_header = 1; revs->pretty_given = 1; get_commit_format(NULL, revs); - } else if (starts_with(arg, "--pretty=") || starts_with(arg, "--format=")) { + } else if (skip_prefix(arg, "--pretty=", &arg) || + skip_prefix(arg, "--format=", &arg)) { /* * Detached form ("--pretty X" as opposed to "--pretty=X") * not allowed, since the argument is optional. */ revs->verbose_header = 1; revs->pretty_given = 1; - get_commit_format(arg+9, revs); + get_commit_format(arg, revs); } else if (!strcmp(arg, "--show-notes") || !strcmp(arg, "--notes")) { revs->show_notes = 1; revs->show_notes_given = 1; revs->notes_opt.use_default_notes = 1; } else if (!strcmp(arg, "--show-signature")) { revs->show_signature = 1; - } else if (!strcmp(arg, "--show-linear-break") || - starts_with(arg, "--show-linear-break=")) { - if (starts_with(arg, "--show-linear-break=")) - revs->break_bar = xstrdup(arg + 20); - else - revs->break_bar = ".."; + } else if (!strcmp(arg, "--show-linear-break")) { + revs->break_bar = ".."; + revs->track_linear = 1; + revs->track_first_time = 1; + } else if (skip_prefix(arg, "--show-linear-break=", &arg)) { + revs->break_bar = xstrdup(arg); revs->track_linear = 1; revs->track_first_time = 1; } else if (starts_with(arg, "--show-notes=") || @@ -1961,8 +1962,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->abbrev = 0; } else if (!strcmp(arg, "--abbrev")) { revs->abbrev = DEFAULT_ABBREV; - } else if (starts_with(arg, "--abbrev=")) { - revs->abbrev = strtoul(arg + 9, NULL, 10); + } else if (skip_prefix(arg,
[PATCH 03/14] write_subdirectory(): use convert_ui() for parsing mode
Use convert_ui() instead of strtoul_ui() when parsing tree entries' modes. This tightens up the parsing a bit: * Leading whitespace is no longer allowed * '+' and '-' are no longer allowed Signed-off-by: Michael Haggerty --- contrib/convert-objects/convert-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/convert-objects/convert-objects.c b/contrib/convert-objects/convert-objects.c index f3b57bf..4f484a4 100644 --- a/contrib/convert-objects/convert-objects.c +++ b/contrib/convert-objects/convert-objects.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "numparse.h" #include "blob.h" #include "commit.h" #include "tree.h" @@ -88,7 +89,7 @@ static int write_subdirectory(void *buffer, unsigned long size, const char *base unsigned int mode; char *slash, *origpath; - if (!path || strtoul_ui(buffer, 8, &mode)) + if (!path || convert_ui(buffer, 8, &mode)) die("bad tree conversion"); mode = convert_mode(mode); path++; -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/14] strtoul_ui(), strtol_i(): remove functions
Their callers have been changed to use the numparse module. Signed-off-by: Michael Haggerty --- git-compat-util.h | 26 -- 1 file changed, 26 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index a3095be..cbe7f16 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -751,32 +751,6 @@ static inline int sane_iscase(int x, int is_lower) return (x & 0x20) == 0; } -static inline int strtoul_ui(char const *s, int base, unsigned int *result) -{ - unsigned long ul; - char *p; - - errno = 0; - ul = strtoul(s, &p, base); - if (errno || *p || p == s || (unsigned int) ul != ul) - return -1; - *result = ul; - return 0; -} - -static inline int strtol_i(char const *s, int base, int *result) -{ - long ul; - char *p; - - errno = 0; - ul = strtol(s, &p, base); - if (errno || *p || p == s || (int) ul != ul) - return -1; - *result = ul; - return 0; -} - #ifdef INTERNAL_QSORT void git_qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *)); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/14] numparse module: systematically tighten up integer parsing
Sorry for the long email. Feel free to skip over the background info and continue reading at "My solution" below. This odyssey started with a typo: $ git shortlog -snl v2.2.0..v2.3.0 14795 Junio C Hamano 1658 Jeff King 1400 Shawn O. Pearce 1109 Linus Torvalds 760 Jonathan Nieder [...] It took me a minute to realize why so many commits are listed. It turns out that "-l", which I added by accident, requires an integer argument, but it happily swallows "v2.2.0..v2.3.0" without error. This leaves no range argument, so the command traverses the entire history of HEAD. The relevant code is else if ((argcount = short_opt('l', av, &optarg))) { options->rename_limit = strtoul(optarg, NULL, 10); return argcount; } , which is broken in many ways. First of all, strtoul() is way too permissive for simple tasks like this: * It allows leading whitespace. * It allows arbitrary trailing characters. * It allows a leading sign character ('+' or '-'), even though the result is unsigned. * If the string doesn't contain a number at all, it sets its "endptr" argument to point at the start of the string but doesn't set errno. * If the value is larger than fits in an unsigned long, it returns the value clamped to the range 0..ULONG_MAX (setting errno to ERANGE). * If the value is between -ULONG_MAX and 0, it returns the positive integer with the same bit pattern, without setting errno(!) (I can hardly think of a scenario where this would be useful.) * For base=0 (autodetect base), it allows a base specifier prefix "0x" or "0" and parses the number accordingly. For base=16 it also allows a "0x" prefix. strtol() has similar foibles. The caller compounds the permissivity of strtoul() with further sins: * It does absolutely no error detection. * It assigns the return value, which is an unsigned long, to an int value. This could truncate the result, perhaps even resulting in rename_limit being set to a negative value. When I looked around, I found scores of sites that call atoi(), strtoul(), and strtol() carelessly. And it's understandable. Calling any of these functions safely is so much work as to be completely impractical in day-to-day code. git-compat-util.h has two functions, strtoul_ui() and strtol_i(), that try to make parsing integers a little bit easier. But they are only used in a few places, they hardly restrict the pathological flexibility of strtoul()/strtol(), strtoul_ui() doesn't implement clamping consistently when converting from unsigned long to unsigned int, and neither function can be used when the integer value *is* followed by other characters. My solution: the numparse module So I hereby propose a new module, numparse, to make it easier to parse integral values from strings in a convenient, safe, and flexible way. The first commit introduces the new module, and subsequent commits change a sample (a small fraction!) of call sites to use the new functions. Consider it a proof-of-concept. If people are OK with this approach, I will continue sending patches to fix other call sites. (I already have another two dozen such patches in my repo). Please see the docstrings in numparse.h in the first commit for detailed API docs. Briefly, there are eight new functions: parse_{l,ul,i,ui}(const char *s, unsigned int flags, T *result, char **endptr); convert_{l,ul,i,ui}(const char *s, unsigned int flags, T *result); The parse_*() functions are for parsing a number from the start of a string; the convert_*() functions are for parsing a string that consists of a single number. The flags argument selects not only the base of the number, but also which of strtol()/strtoul()'s many features should be allowed. The *_i() and *.ui() functions are for parsing int and unsigned int values; they are careful about how they truncate the corresponding long values. The functions all return 0 on success and a negative number on error. Here are a few examples of how these functions can be used (taken from the header of numparse.h): * Convert hexadecimal string s into an unsigned int. Die if there are any characters in s besides hexadecimal digits, or if the result exceeds the range of an unsigned int: if (convert_ui(s, 16, &result)) die("..."); * Read a base-ten long number from the front of a string, allowing sign characters and setting endptr to point at any trailing characters: if (parse_l(s, 10 | NUM_SIGN | NUM_TRAILING, &result, &endptr)) die("..."); * Convert decimal string s into a signed int, but not allowing the string to contain a '+' or '-' prefix (and thereby indirectly ensuring that the result will be non-negative): if (convert_i(s, 10, &result)) die("..."); * Convert s into a signed int, interpreting prefix "0x" to mean hexadecimal and "0" to mean octal. If the value doesn't fit in
[PATCH 02/14] cacheinfo_callback(): use convert_ui() when handling "--cacheinfo"
Use convert_ui() instead of strtoul_ui() to parse the argument. This tightens up the parsing a bit: * Leading whitespace is no longer allowed * '+' and '-' are no longer allowed Signed-off-by: Michael Haggerty --- builtin/update-index.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index 5878986..845e944 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -4,6 +4,7 @@ * Copyright (C) Linus Torvalds, 2005 */ #include "cache.h" +#include "numparse.h" #include "lockfile.h" #include "quote.h" #include "cache-tree.h" @@ -672,7 +673,7 @@ static int cacheinfo_callback(struct parse_opt_ctx_t *ctx, } if (ctx->argc <= 3) return error("option 'cacheinfo' expects ,,"); - if (strtoul_ui(*++ctx->argv, 8, &mode) || + if (convert_ui(*++ctx->argv, 8, &mode) || get_sha1_hex(*++ctx->argv, sha1) || add_cacheinfo(mode, sha1, *++ctx->argv, 0)) die("git update-index: --cacheinfo cannot add %s", *ctx->argv); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff
On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote: > The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a > paths after generating trees, 2012-12-16), which was a fix to an > earlier bug where a cache-tree written out of an index with i-t-a > entries had incorrect information and still claimed it is fully > valid after write-tree rebuilt it. The test probably should add > another path without i-t-a bit, run the same "diff --cached" with > updated expectation before write-tre, and run the "diff --cached" > again to make sure it produces a result that match the updated > expectation. Would adding another non-i-t-a entry help? Before this patch "diff --cached" after write-tree shows the i-t-a entry only when eec3e7e4 is applied. But with this patch we don't show i-t-a entry any more, before or after write-tree, eec3e7e4 makes no visible difference. We could even revert eec3e7e4 and the outcome of "diff --cached" would be the same because we just sort of move the "invalidation" part from cache-tree to do_oneway_diff(). Not invalidating would speed up "diff --cached" when i-t-a entries are present. Still it may be a good idea to invalidate i-t-a paths to be on the safe side. Perhaps a patch like this to resurrect the test? -- 8< -- Subject: t2203: enable 'cache-tree invalidates i-t-a paths' test This test is disabled in the previous patch because the "diff --cached" expectation is the same, with or without eec3e7e4 [1] where this test is added. But it still is a good idea to invalidate i-t-a paths because the index does _not_ match the cached-tree exactly. "diff --cached" may not care, but other operations might. Update the test to catch this invalidation instead. [1] cache-tree: invalidate i-t-a paths after generating trees, 2012-12-16 Signed-off-by: Nguyễn Thái Ngọc Duy diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 42827b8..1a6c814 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -77,7 +77,7 @@ test_expect_success 'can "commit -a" with an i-t-a entry' ' git commit -a -m all ' -test_expect_failure 'cache-tree invalidates i-t-a paths' ' +test_expect_success 'cache-tree invalidates i-t-a paths' ' git reset --hard && mkdir dir && : >dir/foo && @@ -86,14 +86,14 @@ test_expect_failure 'cache-tree invalidates i-t-a paths' ' : >dir/bar && git add -N dir/bar && - git diff --cached --name-only >actual && - echo dir/bar >expect && - test_cmp expect actual && - git write-tree >/dev/null && - - git diff --cached --name-only >actual && - echo dir/bar >expect && + test-dump-cache-tree >actual && + cat >expect <<-\EOF && + invalid (1 subtrees) + invalid #(ref) (1 subtrees) + invalid dir/ (0 subtrees) + invalid #(ref) dir/ (0 subtrees) + EOF test_cmp expect actual ' -- 8< -- -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC/GSOC] make git-pull a builtin
The git-pull command is currently implemented by the shell script git-pull.sh. However, git-pull.sh suffers from some problems: * Inconsistent command line interface: since git-pull.sh does not use git's internal parse-options API, it's command line interface is inconsistent with other git builtins. For example, the option --st should throw an ambigious option error as it could mean --stat or --strategy, but git-pull.sh is hardcoded to assume --st resolves to --strategy. It also does not list the command line options in its usage help. * Inefficient: shell scripts can only access git functionality through spawning git processes, which has a relatively high overhead on certain operating systems such as Windows. Furthermore, the repeated starting up of git means that commonly-used files such config files, the index file and often accessed objects/packfiles are repeatedly read, which can cause poor performance on filesystems or operating systems with slow IO operations. * Makes it difficult to port git: git-pull.sh requires a posix conformant shell and utilities such as sed and tr. This makes it harder to port git to non-posix environments. By making git-pull a builtin, it has access to git's internal API and thus solves the above problems: * Usage of parse-options API means greater consistency of the command-line interface with other git commands. * No spawning of external git processes: internal API and builtins are called directly. This also means that they use the same underlying object/config cache, which reduces needless IO/parsing operations. * Low-level control over code execution and thus greater efficiency. For example commit SHA1s now only need to be parsed from a string once (when resolving branch refs), while as a shell script they would have to be stored as strings and parsed over and over again. Furthermore, this reduces git's dependence on posix shells and utilities, which would be very helpful in porting git to non-posix environments such as Windows. Signed-off-by: Paul Tan --- Hi, I would like to share this very rough prototype with everyone. All the relevant tests (t5520-t5572) pass. I did a quick benchmark on my Linux and Windows systems, which have nearly the same hardware, by running 'git pull' 100 times on a repo and its clone with 1 commit. On Linux, there is negligible performance difference, but on Windows the runtime fell from 8m 25s to 1m 3s. The timings should not be affected by any msys2 emulation as the git processes were spawned from a native executable. I started this as a just-for-fun exercise to learn about the git internal API, but seeing as the results are very encouraging, I'm tempted to take on this project for GSoC. Of course, it would be easy to make any shell script a builtin by using the run_command* functions, but I feel that that would not take advantage of the potential benefits in using git's internal API directly. So, I set myself a few requirements for the ideal end result: * zero spawning of processes so that the internal object, index and config cache can be taken advantage of. Some operating systems (like Windows) also have a relatively large overhead when spawning processes[1]. * avoid needless parsing/string conversions by directly accessing the C data structures. Massaging of strings cause too much code churn, so functions that take the richer C data structures as input should be preferred over functions that work by parsing string arguments (e.g. prefer using the run_diff_files() function instead of calling cmd_diff_files or spawning a git diff-files process) * use the internal API as much as possible: share code between the builtins (e.g. builtin fmt-merge-msg, exposed in fmt-merge-msg.h) in order to reduce code complexity. For example, the fork_point() code in this patch should be shared with the merge-base builtin. [1] http://stackoverflow.com/questions/10710912 However, there were some problems encountered when trying to fulfill the requirements above: * Many builtins only provide their main entry point to access their functionality -- this means that argument options are parsed, put back together, and then re-parsed again, which is quite inefficient and can create a lot of code complexity. A better solution would be to expose the functionality of these builtins as a separate function without argument parsing, similar to how fmt_merge_msg is implemented. * Some buitins do not expect to be called by their cmd_* functions through code paths other than git.c. For example, fetch.c actually frees the strings it pulls from argv. This patch works around these functions by not attempting to free memory passed to these functions, but a more comprehensive solution would be to investigate and fix these functions. (Or expose their functionality in as a separate function) * Finally, there is the possibility that in the process of conversion bugs or incompatible behavioral changes may be introduced whic
Re: [PATCH v4 1/2] cat-file: teach cat-file a '--literally' option
On March 17, 2015 12:21:33 PM GMT+05:30, Eric Sunshine wrote: >On Tue, Mar 17, 2015 at 1:16 AM, Karthik Nayak >wrote: >> diff --git a/builtin/cat-file.c b/builtin/cat-file.c >> index df99df4..1625246 100644 >> --- a/builtin/cat-file.c >> +++ b/builtin/cat-file.c >> @@ -323,8 +338,8 @@ static int batch_objects(struct batch_options >*opt) >> } > >I don't presently have time for a proper read-through, however, this >popped out while quickly running my eyes over the changes. > >> static const char * const cat_file_usage[] = { >> - N_("git cat-file (-t | -s | -e | -p | | --textconv) >"), >> - N_("git cat-file (--batch | --batch-check) < >"), >> + N_("git cat-file (-t [--literally]|-s >[--literally]|-e|-p||--textconv) "), >> + N_("git cat-file (--batch | --batch-check) >"), > >The '<' in the second line before is intentional. It >means that is provided via stdin. Therefore, it is >incorrect to remove it. I see. It seemed out of place. Makes sense, probably should have asked. Thanks. > >> NULL >> }; >> -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/GSoC/RFC] fetch: git fetch --deepen
> C Git is not the only client that can talk to upload-pack. A buggy > client may send both. The least we need is make sure it would not > crash or break the server security in any way. But if we have to > consider that, we may as well reject with a clear message, which would > help the client writer. And speaking of clients.. Actually, I mean that the upload-pack in this patch will not crash, whatever it receives. e.g. "depth N", "depth_deepen", "depth N"+"depth_deepen" Because "depth_deepen" is just a boolean signal to tell the upload-pack that "depth N" means "deepen N", it takes effect only when "depth N" is given. Otherwise, "depth_deepen" will be ignored. > You missed Junio's keyword "existing". Your new client will work well > with your new server. But it breaks "git clone --depth" (or "git fetch > --depth") for all existing git installations. Oh, thank you for pointing out this. And sorry Junio. I misunderstood what you said. I haven't realized the problem of existing git server. You've reminded me of the importance of compatibility. Thank you! 2015-03-17 18:44 GMT+08:00 Duy Nguyen : > On Mon, Mar 16, 2015 at 11:10 PM, Dongcan Jiang > wrote: >> Hi Duy, >> >> Sorry for my late response. >> >>> we need to make sure that upload-pack barf if some client sends both >>> "deepen" and >>> "depth". >> >> Actually, in my current design, when client just wants "depth", it >> sends "depth N"; >> when it want "deepen", it sends "depth N" as well as "depth_deepen". >> For the latter >> case, it tells upload-pack to collect objects for "deepen N". >> >> Thus, I'm not so sure whether upload-pack needs to check their exclusion. > > C Git is not the only client that can talk to upload-pack. A buggy > client may send both. The least we need is make sure it would not > crash or break the server security in any way. But if we have to > consider that, we may as well reject with a clear message, which would > help the client writer. And speaking of clients.. > > On Mon, Mar 16, 2015 at 11:08 PM, Dongcan Jiang > wrote: - if (starts_with(line, "deepen ")) { + if (starts_with(line, "depth ")) { char *end; - depth = strtol(line + 7, &end, 0); - if (end == line + 7 || depth <= 0) - die("Invalid deepen: %s", line); + depth = strtol(line + 6, &end, 0); + if (end == line + 6 || depth <= 0) + die("Invalid depth: %s", line); continue; } if (!starts_with(line, "want ") || >>> >>> I do not quite understand this hunk. What happens when this program >>> is talking to an existing fetch-pack that requested "deepen"? >> >> In this hunk, I actually just changed the one command name in upload-pack >> service from "deepen" to "depth". I made this change because the name >> "deepen" here is quite misleading, as it just means "depth". Of course, >> I've changed the caller's sent pack from "deepen" to "depth" too (See below). > > You missed Junio's keyword "existing". Your new client will work well > with your new server. But it breaks "git clone --depth" (or "git fetch > --depth") for all existing git installations. > -- > Duy -- 江东灿(Dongcan Jiang) Team of Search Engine & Web Mining School of Electronic Engineering & Computer Science Peking University, Beijing, 100871, P.R.China -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GSoC] Applying for conversion scripts to builtins
Hi Paul, On 2015-03-17 01:22, Paul Tan wrote: > On Tue, Mar 17, 2015 at 12:49 AM, Yurii Shevtsov wrote: > > Generally, it would be easy to convert any shell script to C by just > using the run_command* functions (and in less lines of code), but that > would not be taking advantage of the potential benefits in porting > shell scripts to C. To summarize the (ideal) requirements: > > * zero spawning of processes so that the internal object/config/index > cache can be taken advantage of. (and to avoid the process spawning > overhead which is relative large in e.g. Windows) Spawning definitely uses up many more resources on Windows. However, spawning a full-fledged Bash requires MSys (or soon MSys2) to spin up an entire POSIX emulation layer. This costs us dearly. For example, when I run the t3404 test (which exercises scripting heavily, what with `git rebase -i` being implemented as a shell script) on MacOSX, it takes roughly a minute to complete. On a comparable Windows machine, it takes roughly 12 minutes to complete. Therefore, I would wager a bet that just the mere conversion of a shell script into even a primitive `run_command()`-based builtin would help performance on Windows in a noticeable manner. Of course, it would be *even nicer* to avoid the spawning altogether. > * avoid needless parsing since we have direct access to the C data > structures. True that. Turning SHA-1s into strings, spawning, and reparsing the same SHA-1 is quite a lot of unnecessary churn. The biggest benefit of avoiding needless parsing, however, is not performance. It is avoiding quoting issues. This is particularly so on Windows, where Git is sometimes called from outside a shell environment, where we have to deal with inconsistent quoting because it is every Windows program's own job to parse the command-line, including the quoting. > * use the internal API as much as possible: share code between the > builtins (e.g. fmt-merge-msg.c, exposed in fmt-merge-msg.h) in order > to reduce code complexity. That is definitely something that even the Git maintainer should be interested in (he does not touch Windows, therefore the performance differences do not concern him): by sharing code paths between different subcommands, you ensure that you have to fix problems only once, not twice or more. Concrete example: on Windows, we have file locking issues because files that are in use cannot be deleted. For that reason, we have Windows-specific code that is "nice" by trying harder to delete files, giving programs a little time to let their locks go. This locking issue happens also when a virus scanner "uses", say, the .git-rewrite/revs file that was written by `git filter-branch`, while said shell script already wants to delete the file because it is obsolete. If `git filter-branch` were a builtin, the bug would already be fixed due to our override of the `unlink()` function in C. Now we have to fix that bug separately because `filter-branch` is a shell script. > The biggest wins would definitely be portability, but there may be > performance improvements, though they are theoretical at this point. > > I'm not exactly sure if the above requirements are sane, which is why > I'm also CC-ing Dscho who knows the problems of git on Windows more > than I do. Thanks for bringing this to my attention. I hope I managed to add useful information to the discussion. Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/GSoC/RFC] fetch: git fetch --deepen
On Mon, Mar 16, 2015 at 11:10 PM, Dongcan Jiang wrote: > Hi Duy, > > Sorry for my late response. > >> we need to make sure that upload-pack barf if some client sends both >> "deepen" and >> "depth". > > Actually, in my current design, when client just wants "depth", it > sends "depth N"; > when it want "deepen", it sends "depth N" as well as "depth_deepen". > For the latter > case, it tells upload-pack to collect objects for "deepen N". > > Thus, I'm not so sure whether upload-pack needs to check their exclusion. C Git is not the only client that can talk to upload-pack. A buggy client may send both. The least we need is make sure it would not crash or break the server security in any way. But if we have to consider that, we may as well reject with a clear message, which would help the client writer. And speaking of clients.. On Mon, Mar 16, 2015 at 11:08 PM, Dongcan Jiang wrote: >>> - if (starts_with(line, "deepen ")) { >>> + if (starts_with(line, "depth ")) { >>> char *end; >>> - depth = strtol(line + 7, &end, 0); >>> - if (end == line + 7 || depth <= 0) >>> - die("Invalid deepen: %s", line); >>> + depth = strtol(line + 6, &end, 0); >>> + if (end == line + 6 || depth <= 0) >>> + die("Invalid depth: %s", line); >>> continue; >>> } >>> if (!starts_with(line, "want ") || >> >> I do not quite understand this hunk. What happens when this program >> is talking to an existing fetch-pack that requested "deepen"? > > In this hunk, I actually just changed the one command name in upload-pack > service from "deepen" to "depth". I made this change because the name > "deepen" here is quite misleading, as it just means "depth". Of course, > I've changed the caller's sent pack from "deepen" to "depth" too (See below). You missed Junio's keyword "existing". Your new client will work well with your new server. But it breaks "git clone --depth" (or "git fetch --depth") for all existing git installations. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Promoting Git developers
On Sun, Mar 15, 2015 at 9:46 AM, Christian Couder wrote: > > I wrote something about a potential Git Rev News news letter: > > https://github.com/git/git.github.io/pull/15 > I would love to have/use something like this in the GitMinutes podcast. Perhaps in addition to the very random interview format that I have now, I could do a more regular episodes about Git news, where I incorporate this. I also volunteer to help with the production, if you'll allow list lurkers like myself to contribute ;) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git submodule purge
On Tue, Mar 17, 2015 at 09:18:26AM +0100, Patrick Steinhardt wrote: > Is it even possible to create a new submodule without any > upstream repository? At least `git submodule init` does not work > without a corresponding entry in .gitmodules which the user would > have needed to create himself manually. In this case one _could_ > assume that the user knows what he is doing and expect him not to > call `submodule purge` (or whatever the command will be called) > on the authoritative copy. Other than that I've got no idea how > to assure safety. Look at git/t/t7400-submodule-basic.sh for example at the test starting at line 84 on how to add a submodule without any upstream. Git has already a disadvantage against other SCM (like mercurial) because it's "too easy to loose data with git". Meaning that we purge unrefered commits. (Yes this is up to debate if this is good or bad, but here's not the place). I think we should be very carefully with adding commands that permanently removes data. They should be really well crafted so that there's no way to do this by mistake. -- Fredrik Gustafsson phone: +46 733-608274 e-mail: iv...@iveqy.com website: http://www.iveqy.com -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git submodule purge
On Mon, Mar 16, 2015 at 08:55:03AM -0700, Junio C Hamano wrote: > Patrick Steinhardt writes: > > > I think there should be a command that is able to remove those > > dangling repositories if the following conditions are met: > > > > - the submodule should not be initialized > > > > - the submodule should not have an entry in .gitmodules in the > > currently checked out revision > > > > - the submodule should not contain any commits that are not > > upstream > > > > - the submodule should not contain other submodules that do not > > meet those conditions > > I do not have a strong opinion on whether it is a good idea to make > it possible to remove the .git/modules/*, but should it be a > separate subcommand, or should it be an option to the 'deinit' > subcommand? See my response to Jonathan. > Also, how would you apply the safety to a repository without > "upstream", i.e. the authoritative copy? Is it even possible to create a new submodule without any upstream repository? At least `git submodule init` does not work without a corresponding entry in .gitmodules which the user would have needed to create himself manually. In this case one _could_ assume that the user knows what he is doing and expect him not to call `submodule purge` (or whatever the command will be called) on the authoritative copy. Other than that I've got no idea how to assure safety. Regards Patrick signature.asc Description: PGP signature
Re: [RFC] git submodule purge
On Mon, Mar 16, 2015 at 01:03:53PM -0700, Jonathan Nieder wrote: > (+cc: Jens and Heiko, submodule experts) > Hi, > > Patrick Steinhardt wrote: > > > This proposal is just for discussion. If there is any interest I > > will implement the feature and send some patches. > > > > Currently it is hard to properly remove submodules. That is when > > a submodule is deinitialized and removed from a repository the > > directory '.git/modules/' will still be present and > > there is no way to remove it despite manually calling `rm` on it. > > I think there should be a command that is able to remove those > > dangling repositories if the following conditions are met: > > > > - the submodule should not be initialized > > > > - the submodule should not have an entry in .gitmodules in the > > currently checked out revision > > > > - the submodule should not contain any commits that are not > > upstream > > > > - the submodule should not contain other submodules that do not > > meet those conditions > > > > This would ensure that it is hard to loose any commits that may > > be of interest. In the case that the user knows what he is doing > > we may provide a '--force' switch to override those checks. > > Those conditions look simultaneously too strong and too weak. ;-) > > In principle, it should be safe to remove .git/modules/ as > long as > > (1) it (and its submodules, sub-sub-modules, etc) doesn't have any > un-pushed local commits. > > (2) it is not being referred to by a .git file in the work tree of > the parent repository. > > Condition (1) can be relaxed if the user knows what they are losing > and is okay with that. Condition (2) can be avoided by removing > (de-initing) the copy of that submodule in the worktree at the same > time. > > The functionality sounds like a useful thing to have, whether as an > option to 'git submodule deinit' or as a new subcommand. In the long > term I would like it to be possible to do everything 'git submodule' > can do using normal git commands instead of that specialized > interface. What command do you think this would eventually belong in? > (An option to "git gc", maybe?) > > Thanks, > Jonathan Thanks for your feedback. Considering that purging the submodule is tightly coupled with de-initializing it, it might make sense to provide this functionality as part of `git submodule deinit`. Maybe something like `git submodule deinit --purge` would work for the user. Problem is if the user first removes the submodule and does not first deinitialize it he is not able to purge the repository afterwards as deinit will complain about the submodule not being matched anymore. We could just make `deinit --purge` work with removed submodules, but that does not feel very natural to me. `git gc` feels saner in that regard, but I don't think it would be easy to spot for users as this command is in general not used very frequently by them. One could argue though that it does not need to be discoverable. Regards Patrick signature.asc Description: PGP signature
Re: [PATCH 0/5] not making corruption worse
On Tue, Mar 17, 2015 at 03:27:50AM -0400, Jeff King wrote: > The general strategy for these is to use for_each_rawref traversals in > these situations. That doesn't cover _every_ possible scenario. For > example, you could do: > > git clone --no-local repo.git backup.git && > rm -rf repo.git > > and you might be disappointed if "backup.git" omitted some broken refs > (upload-pack will simply skip the broken refs in its advertisement). We > could tighten this, but then it becomes hard to access slightly broken > repositories (e.g., you might prefer to clone what you can, and not have > git die() when it tries to serve the breakage). Patch 2 provides a > tweakable safety valve for this. One thing I thought about while working on this was whether we should just make _all_ ref iterations for_each_rawref. The benefit to not doing so in the hypothetical above is that you might be able to clone "foo" even if "bar" is broken. But it strikes me as weird that we consider the _tips_ of history to be special for ignoring breakage. If the tip of "bar" is broken, we omit it. But if the tip is fine, and there's breakage three commits down in the history, then doing a clone is going to fail horribly, as pack-objects realizes it can't generate the pack. So in practice, I'm not sure how much you're buying with the "don't mention broken refs" code. OTOH, there are probably _some_ situations that can be recovered with the current code that could not otherwise. For example, in the current code, I can still fetch "foo" even if "bar" is broken 3 commits down. Whereas if the tip is broken, there's a reasonable chance that "upload-pack" would just barf and I could fetch nothing. So I stuck to the status quo in most cases, and only turned on the more aggressive behavior for destructive operations (and people who want to go wild can set GIT_REF_PARANOIA=1 for their every day operations if they want to). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] refs.c: drop curate_packed_refs
When we delete a ref, we have to rewrite the entire packed-refs file. We take this opportunity to "curate" the packed-refs file and drop any entries that are crufty or broken. Dropping broken entries (e.g., with bogus names, or ones that point to missing objects) is actively a bad idea, as it means that we lose any notion that the data was there in the first place. Aside from the general hackiness that we might lose any information about ref "foo" while deleting an unrelated ref "bar", this may seriously hamper any attempts by the user at recovering from the corruption in "foo". They will lose the sha1 and name of "foo"; the exact pointer may still be useful even if they recover missing objects from a different copy of the repository. But worse, once the ref is gone, there is no trace of the corruption. A follow-up "git prune" may delete objects, even though it would otherwise bail when seeing corruption. We could just drop the "broken" bits from curate_packed_refs, and continue to drop the "crufty" bits: refs whose loose counterpart exists in the filesystem. This is not wrong to do, and it does have the advantage that we may write out a slightly smaller packed-refs file. But it has two disadvantages: 1. It is a potential source of races or mistakes with respect to these refs that are otherwise unrelated to the operation. To my knowledge, there aren't any active problems in this area, but it seems like an unnecessary risk. 2. We have to spend time looking up the matching loose refs for every item in the packed-refs file. If you have a large number of packed refs that do not change, that outweights the benefit from writing out a smaller packed-refs file (it doesn't get smaller, and you do a bunch of directory traversal to find that out). Signed-off-by: Jeff King --- I'll admit my argument against curate_packed_refs is a bit hand-wavy. I won't be _too_ sad if somebody insists on cutting this back to just keeping "broken" refs around, and still curating the "crufty" ones. refs.c | 67 + t/t5312-prune-corruption.sh | 2 +- 2 files changed, 2 insertions(+), 67 deletions(-) diff --git a/refs.c b/refs.c index 7f0e7be..47e4e53 100644 --- a/refs.c +++ b/refs.c @@ -2621,68 +2621,10 @@ int pack_refs(unsigned int flags) return 0; } -/* - * If entry is no longer needed in packed-refs, add it to the string - * list pointed to by cb_data. Reasons for deleting entries: - * - * - Entry is broken. - * - Entry is overridden by a loose ref. - * - Entry does not point at a valid object. - * - * In the first and third cases, also emit an error message because these - * are indications of repository corruption. - */ -static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data) -{ - struct string_list *refs_to_delete = cb_data; - - if (entry->flag & REF_ISBROKEN) { - /* This shouldn't happen to packed refs. */ - error("%s is broken!", entry->name); - string_list_append(refs_to_delete, entry->name); - return 0; - } - if (!has_sha1_file(entry->u.value.sha1)) { - unsigned char sha1[20]; - int flags; - - if (read_ref_full(entry->name, 0, sha1, &flags)) - /* We should at least have found the packed ref. */ - die("Internal error"); - if ((flags & REF_ISSYMREF) || !(flags & REF_ISPACKED)) { - /* -* This packed reference is overridden by a -* loose reference, so it is OK that its value -* is no longer valid; for example, it might -* refer to an object that has been garbage -* collected. For this purpose we don't even -* care whether the loose reference itself is -* invalid, broken, symbolic, etc. Silently -* remove the packed reference. -*/ - string_list_append(refs_to_delete, entry->name); - return 0; - } - /* -* There is no overriding loose reference, so the fact -* that this reference doesn't refer to a valid object -* indicates some kind of repository corruption. -* Report the problem, then omit the reference from -* the output. -*/ - error("%s does not point to a valid object!", entry->name); - string_list_append(refs_to_delete, entry->name); - return 0; - } - - return 0; -} - int repack_without_refs(struct string_list *refnames, struct strbuf *err) { struct ref_dir *packed; - struct string_list refs_to_delete = STRING_LIST_INIT_DUP; -
[PATCH 4/5] repack: turn on "ref paranoia" when doing a destructive repack
If we are repacking with "-ad", we will drop any unreachable objects. Likewise, using "-Ad --unpack-unreachable=" will drop any old, unreachable objects. In these cases, we want to make sure the reachability we compute with "--all" is complete. We can do this by passing GIT_REF_PARANOIA=1 in the environment to pack-objects. Note that "-Ad" is safe already, because it only loosens unreachable objects. It is up to "git prune" to avoid deleting them. Signed-off-by: Jeff King --- builtin/repack.c| 8 ++-- t/t5312-prune-corruption.sh | 2 +- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 28fbc70..f2edeb0 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -228,13 +228,17 @@ int cmd_repack(int argc, const char **argv, const char *prefix) get_non_kept_pack_filenames(&existing_packs); if (existing_packs.nr && delete_redundant) { - if (unpack_unreachable) + if (unpack_unreachable) { argv_array_pushf(&cmd.args, "--unpack-unreachable=%s", unpack_unreachable); - else if (pack_everything & LOOSEN_UNREACHABLE) + argv_array_push(&cmd.env_array, "GIT_REF_PARANOIA=1"); + } else if (pack_everything & LOOSEN_UNREACHABLE) { argv_array_push(&cmd.args, "--unpack-unreachable"); + } else { + argv_array_push(&cmd.env_array, "GIT_REF_PARANOIA=1"); + } } } else { argv_array_push(&cmd.args, "--unpacked"); diff --git a/t/t5312-prune-corruption.sh b/t/t5312-prune-corruption.sh index cccab58..e3e9994 100755 --- a/t/t5312-prune-corruption.sh +++ b/t/t5312-prune-corruption.sh @@ -38,7 +38,7 @@ test_expect_success 'put bogus object into pack' ' verbose git cat-file -e $bogus ' -test_expect_failure 'destructive repack keeps packed object' ' +test_expect_success 'destructive repack keeps packed object' ' test_might_fail git repack -Ad --unpack-unreachable=now && verbose git cat-file -e $bogus && test_might_fail git repack -ad && -- 2.3.3.520.g3cfbb5d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] prune: turn on ref_paranoia flag
Prune should know about broken objects at the tips of refs, so that we can feed them to our traversal rather than ignoring them. It's better for us to abort the operation on the broken object than it is to start deleting objects with an incomplete view of the reachability namespace. Note that for missing objects, aborting is the best we can do. For a badly-named ref, we technically could use its sha1 as a reachability tip. However, the iteration code just feeds us a null sha1, so there would be a reasonable amount of code involved to pass down our wishes. It's not really worth trying to do better, because this is a case that should happen extremely rarely, and the message we provide: fatal: unable to parse object: refs/heads/bogus:name is probably enough to point the user in the right direction. Signed-off-by: Jeff King --- Note that we should already be aborting for non-tip objects. I guess we could test that explicitly, too, but I didn't here. builtin/prune.c | 1 + t/t5312-prune-corruption.sh | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/builtin/prune.c b/builtin/prune.c index 04d3b12..17094ad 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -115,6 +115,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix) expire = ULONG_MAX; save_commit_buffer = 0; check_replace_refs = 0; + ref_paranoia = 1; init_revisions(&revs, prefix); argc = parse_options(argc, argv, prefix, options, prune_usage, 0); diff --git a/t/t5312-prune-corruption.sh b/t/t5312-prune-corruption.sh index 167031e..cccab58 100755 --- a/t/t5312-prune-corruption.sh +++ b/t/t5312-prune-corruption.sh @@ -25,7 +25,7 @@ test_expect_success 'create history reachable only from a bogus-named ref' ' git reset --hard HEAD^ ' -test_expect_failure 'pruning does not drop bogus object' ' +test_expect_success 'pruning does not drop bogus object' ' test_when_finished "git hash-object -w -t commit saved" && test_might_fail git prune --expire=now && verbose git cat-file -e $bogus @@ -62,7 +62,7 @@ test_expect_success 'create history with missing tip commit' ' test_must_fail git cat-file -e $missing ' -test_expect_failure 'pruning with a corrupted tip does not drop history' ' +test_expect_success 'pruning with a corrupted tip does not drop history' ' test_when_finished "git hash-object -w -t commit saved" && test_might_fail git prune --expire=now && verbose git cat-file -e $recoverable -- 2.3.3.520.g3cfbb5d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] refs: introduce a "ref paranoia" flag
Most operations that iterate over refs are happy to ignore broken cruft. However, some operations should be performed with knowledge of these broken refs, because it is better for the operation to choke on a missing object than it is to silently pretend that the ref did not exist (e.g., if we are computing the set of reachable tips in order to prune objects). These processes could just call for_each_rawref, except that ref iteration is often hidden behind other interfaces. For instance, for a destructive "repack -ad", we would have to inform "pack-objects" that we are destructive, and then it would in turn have to tell the revision code that our "--all" should include broken refs. It's much simpler to just set a global for "dangerous" operations that includes broken refs in all iterations. Signed-off-by: Jeff King --- I waffled on documenting this for end users. I'm not sure if people would ever want to actually use the environment variable themselves. The best hypothetical I could come up with was the: git clone --no-local repo.git backup.git && rm -rf repo.git scenario. Documentation/git.txt | 11 +++ cache.h | 8 environment.c | 1 + refs.c| 5 + 4 files changed, 25 insertions(+) diff --git a/Documentation/git.txt b/Documentation/git.txt index af30620..8da85a6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -1026,6 +1026,17 @@ GIT_ICASE_PATHSPECS:: variable when it is invoked as the top level command by the end user, to be recorded in the body of the reflog. +`GIT_REF_PARANOIA`:: + If set to `1`, include broken or badly named refs when iterating + over lists of refs. In a normal, non-corrupted repository, this + does nothing. However, enabling it may help git to detect and + abort some operations in the presence of broken refs. Git sets + this variable automatically when performing destructive + operations like linkgit:git-prune[1]. You should not need to set + it yourself unless you want to be paranoid about making sure + an operation has touched every ref (e.g., because you are + cloning a repository to make a backup). + Discussion[[Discussion]] diff --git a/cache.h b/cache.h index 761c570..162ea6f 100644 --- a/cache.h +++ b/cache.h @@ -614,6 +614,14 @@ extern int protect_hfs; extern int protect_ntfs; /* + * Include broken refs in all ref iterations, which will + * generally choke dangerous operations rather than letting + * them silently proceed without taking the broken ref into + * account. + */ +extern int ref_paranoia; + +/* * The character that begins a commented line in user-editable file * that is subject to stripspace. */ diff --git a/environment.c b/environment.c index 1ade5c9..a40044c 100644 --- a/environment.c +++ b/environment.c @@ -24,6 +24,7 @@ int is_bare_repository_cfg = -1; /* unspecified */ int log_all_ref_updates = -1; /* unspecified */ int warn_ambiguous_refs = 1; int warn_on_object_refname_ambiguity = 1; +int ref_paranoia = -1; int repository_format_version; const char *git_commit_encoding; const char *git_log_output_encoding; diff --git a/refs.c b/refs.c index e23542b..7f0e7be 100644 --- a/refs.c +++ b/refs.c @@ -1934,6 +1934,11 @@ static int do_for_each_ref(struct ref_cache *refs, const char *base, data.fn = fn; data.cb_data = cb_data; + if (ref_paranoia < 0) + ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 0); + if (ref_paranoia) + data.flags |= DO_FOR_EACH_INCLUDE_BROKEN; + return do_for_each_entry(refs, base, do_one_ref, &data); } -- 2.3.3.520.g3cfbb5d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] t5312: test object deletion code paths in a corrupted repository
When we are doing a destructive operation like "git prune", we want to be extra careful that the set of reachable tips we compute is valid. If there is any corruption or oddity, we are better off aborting the operation and letting the user figure things out rather than plowing ahead and possibly deleting some data that cannot be recovered. The tests here include: 1. Pruning objects mentioned only be refs with invalid names. This used to abort prior to d0f810f (refs.c: allow listing and deleting badly named refs, 2014-09-03), but since then we silently ignore the tip. Likewise, we test repacking that can drop objects (either "-ad", which drops anything unreachable, or "-Ad --unpack-unreachable=", which tries to optimize out a loose object write that would be directly pruned). 2. Pruning objects when some refs point to missing objects. We don't know whether any dangling objects would have been reachable from the missing objects. We are better to keep them around, as they are better than nothing for helping the user recover history. 3. Packed refs that point to missing objects can sometimes be dropped. By itself, this is more of an annoyance (you do not have the object anyway; even if you can recover it from elsewhere, all you are losing is a placeholder for your state at the time of corruption). But coupled with (2), if we drop the ref and then go on to prune, we may lose unrecoverable objects. Note that we use test_might_fail for some of the operations. In some cases, it would be appropriate to abort the operation, and in others, it might be acceptable to continue but taking the information into account. The tests don't care either way, and check only for data loss. Signed-off-by: Jeff King --- t/t5312-prune-corruption.sh | 104 1 file changed, 104 insertions(+) create mode 100755 t/t5312-prune-corruption.sh diff --git a/t/t5312-prune-corruption.sh b/t/t5312-prune-corruption.sh new file mode 100755 index 000..167031e --- /dev/null +++ b/t/t5312-prune-corruption.sh @@ -0,0 +1,104 @@ +#!/bin/sh + +test_description=' +Test pruning of repositories with minor corruptions. The goal +here is that we should always be erring on the side of safety. So +if we see, for example, a ref with a bogus name, it is OK either to +bail out or to proceed using it as a reachable tip, but it is _not_ +OK to proceed as if it did not exist. Otherwise we might silently +delete objects that cannot be recovered. +' +. ./test-lib.sh + +test_expect_success 'disable reflogs' ' + git config core.logallrefupdates false && + rm -rf .git/logs +' + +test_expect_success 'create history reachable only from a bogus-named ref' ' + test_tick && git commit --allow-empty -m master && + base=$(git rev-parse HEAD) && + test_tick && git commit --allow-empty -m bogus && + bogus=$(git rev-parse HEAD) && + git cat-file commit $bogus >saved && + echo $bogus >.git/refs/heads/bogus:name && + git reset --hard HEAD^ +' + +test_expect_failure 'pruning does not drop bogus object' ' + test_when_finished "git hash-object -w -t commit saved" && + test_might_fail git prune --expire=now && + verbose git cat-file -e $bogus +' + +test_expect_success 'put bogus object into pack' ' + git tag reachable $bogus && + git repack -ad && + git tag -d reachable && + verbose git cat-file -e $bogus +' + +test_expect_failure 'destructive repack keeps packed object' ' + test_might_fail git repack -Ad --unpack-unreachable=now && + verbose git cat-file -e $bogus && + test_might_fail git repack -ad && + verbose git cat-file -e $bogus +' + +# subsequent tests will have different corruptions +test_expect_success 'clean up bogus ref' ' + rm .git/refs/heads/bogus:name +' + +test_expect_success 'create history with missing tip commit' ' + test_tick && git commit --allow-empty -m one && + recoverable=$(git rev-parse HEAD) && + git cat-file commit $recoverable >saved && + test_tick && git commit --allow-empty -m two && + missing=$(git rev-parse HEAD) && + # point HEAD elsewhere + git checkout $base && + rm .git/objects/$(echo $missing | sed "s,..,&/,") && + test_must_fail git cat-file -e $missing +' + +test_expect_failure 'pruning with a corrupted tip does not drop history' ' + test_when_finished "git hash-object -w -t commit saved" && + test_might_fail git prune --expire=now && + verbose git cat-file -e $recoverable +' + +test_expect_success 'pack-refs does not silently delete broken loose ref' ' + git pack-refs --all --prune && + echo $missing >expect && + git rev-parse refs/heads/master >actual && + test_cmp expect actual +' + +# we do not want to count on running pack-refs to +# actually pack it, as it is perfectly rea
[PATCH 0/5] not making corruption worse
This is a grab bag of fixes related to performing destructive operations in a repository with minor corruption. Of course we hope never to see corruption in the first place, but I think if we do see it, we should err on the side of not making things worse. IOW, it is better to abort and say "fix this before pruning" than it is to just start deleting objects. The issue that spurred this is that I noticed recent versions of git will omit funny-named refs from iteration. This comes from d0f810f (refs.c: allow listing and deleting badly named refs, 2014-09-03), in v2.2.0. That's probably a good idea in general, but for things like "prune" we need to be more careful (and we were, prior to that commit). Similarly, if you have a ref whose tip object is missing, we tend to just ignore it in our traversals, and it presents a similar problem for pruning. I didn't trace it back, but I think this problem is much older. The general strategy for these is to use for_each_rawref traversals in these situations. That doesn't cover _every_ possible scenario. For example, you could do: git clone --no-local repo.git backup.git && rm -rf repo.git and you might be disappointed if "backup.git" omitted some broken refs (upload-pack will simply skip the broken refs in its advertisement). We could tighten this, but then it becomes hard to access slightly broken repositories (e.g., you might prefer to clone what you can, and not have git die() when it tries to serve the breakage). Patch 2 provides a tweakable safety valve for this. And not strictly related to the above, but in the same ballpark, is the issue with packed-refs that I noted here: http://thread.gmane.org/gmane.comp.version-control.git/256051/focus=256896 where we can drop broken refs from the packed-refs file during the deletion of an unrelated ref. The patches are: [1/5]: t5312: test object deletion code paths in a corrupted repository [2/5]: refs: introduce a "ref paranoia" flag [3/5]: prune: turn on ref_paranoia flag [4/5]: repack: turn on "ref paranoia" when doing a destructive repack [5/5]: refs.c: drop curate_packed_refs -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html