Re: git log -S or -G
On Mon, 8 Oct 2018, Jacob Keller wrote: > On Mon, Oct 8, 2018 at 8:22 PM Jeff King wrote: > > > > On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote: > > > > > Julia Lawall writes: > > > > > > >> Doing the same for -S is much harder at the machinery level, as it > > > >> performs its thing without internally running "diff" twice, but just > > > >> counts the number of occurrences of 'foo'---that is sufficient for > > > >> its intended use, and more efficient. > > > > > > > > There is still the question of whether the number of occurrences of foo > > > > decreases or increases. > > > > > > Hmph, taking the changes that makes the number of hits decrease > > > would catch a subset of "changes that removes 'foo' only---I am not > > > interested in the ones that adds 'foo'". It will avoid getting > > > confused by a change that moves an existing 'foo' to another place > > > in the same file (as the number of hits does not change), but at the > > > same time, it will miss a change that genuinely removes an existing > > > 'foo' and happens to add a 'foo' at a different place in the same > > > file that is unrelated to the original 'foo'. Depending on the > > > definition of "I am only interested in removed ones", that may or > > > may not be acceptable. > > > > I think that is the best we could do for "-S", though, which is > > inherently about counting hits. > > > > For "-G", we are literally grepping the diff. It does not seem > > unreasonable to add the ability to grep only "-" or "+" lines, and the > > interface for that should be pretty straightforward (a tri-state flag to > > look in remove, added, or both lines). > > > > -Peff > > Yea. I know I've wanted something like this in the past. It could also be nice to be able to specify multiple patterns, with and and or between them. So -A&-B would be remove A somewhere and remove B somewhere. julia
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
Derrick Stolee writes: > I see these failures, too, but I believe they are due to > ds/commit-graph-with-grafts not being merged to 'next' yet. The > purpose of that branch is to fix these test breaks. The environment > variable got merged a lot faster. A separate "ping" would have helped me. Will merge it down to 'next'.
Re: How to handle patch series conflicts
Stephen & Linda Smith writes: > Junio - I've been working this but would like your opinion on 7500, 7501 and > now 7510. > > I note that the the commit tests have intermixed functionality. An example > is > signoff tests that are in the three tests I mentioned. > > I've been tempted multiple times over the last week to just merge the tests > into a single script, but that doesn't seem right either. > > So would you prefer a single script? Would you prefer me to move tests > around? The scripts themselves having the same name that is no more specific tha just "commit" does not bother _me_ personally too much. If I were doing it, unless you are an obsessive type that wants to see spanking cleanness everywhere, I'd limit the changes to the minimum. If something tested in script X is tested in another script Y and it is trivial to see they are testing exactly the same thing, removing one copy from script Y would be good, and if the remaining changes in script Y becomes more focused with only such removals, that would even be better, as at that point we can rename "tY-commit.sh" to something more specific like "tY-commit-signature.sh".
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
Junio C Hamano writes: >> There is one crucial difference: the exit code. > > OK, and it was good that you explicitly said "with exit code 0" in > the log message. Together with the idea to update the doc I floated > earlier, this probably is worth documenting, too. Heh, I am becoming sloppy in reviewing. The patch does not even update any doc. It is not a reason to reject the change (the change looks reasonably simple and reviewers and those who have to look at the code to build upon it would understand it in the current shape), but it is a blocker for the change to be merged to 'next' and down.
Re: git log -S or -G
Jeff King writes: > I think that is the best we could do for "-S", though, which is > inherently about counting hits. > > For "-G", we are literally grepping the diff. It does not seem > unreasonable to add the ability to grep only "-" or "+" lines, and the > interface for that should be pretty straightforward (a tri-state flag to > look in remove, added, or both lines). Yeah, here is a lunchtime hack that hasn't even been compile tested. diff.c | 4 diff.h | 2 ++ diffcore-pickaxe.c | 22 -- 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index f0c7557b40..d1f2780844 100644 --- a/diff.c +++ b/diff.c @@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options, } else if (!strcmp(arg, "--pickaxe-all")) options->pickaxe_opts |= DIFF_PICKAXE_ALL; + else if (!strcmp(arg, "--pickaxe-ignore-add")) + options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD; + else if (!strcmp(arg, "--pickaxe-ignore-del")) + options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL; else if (!strcmp(arg, "--pickaxe-regex")) options->pickaxe_opts |= DIFF_PICKAXE_REGEX; else if ((argcount = short_opt('O', av, &optarg))) { diff --git a/diff.h b/diff.h index a30cc35ec3..147c47ace7 100644 --- a/diff.h +++ b/diff.h @@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value); DIFF_PICKAXE_KIND_OBJFIND) #define DIFF_PICKAXE_IGNORE_CASE 32 +#define DIFF_PICKAXE_IGNORE_ADD64 +#define DIFF_PICKAXE_IGNORE_DEL 128 void diffcore_std(struct diff_options *); void diffcore_fix_diff_index(struct diff_options *); diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index 800a899c86..826dde6bd4 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two, struct diffgrep_cb { regex_t *regexp; + struct diff_options *diff_options; int hit; }; @@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) { struct diffgrep_cb *data = priv; regmatch_t regmatch; + unsigned pickaxe_opts = data->diff_options->pickaxe_opts; if (line[0] != '+' && line[0] != '-') return; + if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) && line[0] == '+') + return; + if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) && line[0] == '-') + return; if (data->hit) /* * NEEDSWORK: we should have a way to terminate the @@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two, xpparam_t xpp; xdemitconf_t xecfg; - if (!one) + if (!one) { + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) + return 0; return !regexec_buf(regexp, two->ptr, two->size, 1, ®match, 0); - if (!two) + } + if (!two) { + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) + return 0; return !regexec_buf(regexp, one->ptr, one->size, 1, ®match, 0); + } + ecbdata.diff_options = o; /* * We have both sides; need to run textual diff and see if * the pattern appears on added/deleted lines. @@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two, { unsigned int one_contains = one ? contains(one, regexp, kws) : 0; unsigned int two_contains = two ? contains(two, regexp, kws) : 0; + + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) + return one_contains > two_contains; + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) + return one_contains < two_contains; return one_contains != two_contains; }
Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never
On Wed, Oct 03, 2018 at 06:32:06PM -0400, Noam Postavsky wrote: > > which is admittedly pretty horrible, too, but at least resembles a > > graph. I dunno. > > Yeah, but it's lossy, so it doesn't seem usable for the test. Maybe > doubling up some characters? > > ** left > R| **B-B-M-M. octopus-merge > R| R|Y\ B\ M\ > R|R/ Y/ B/ M/ > R| Y| B| ** 4 > R| Y| ** M| 3 > R| Y| M|M/ > R| ** M| 2 > R| M|M/ > ** M| 1 > M|M/ > ** initial Yeah, I tried something similar, but it's hard to read as a graph since the alignment is lost between lines. I agree the single-char version is lossy, but I think in combination with checking the literal, uncolored version, we'd be OK. However, it may be best to just leave the original verbose version you had. It's hard to read and to modify, but we don't plan for people to do that very often. And it's at least simple. > > I'm also not thrilled that we depend on the exact sequence of default > > colors, but I suspect it's not the first time. And it wouldn't be too > > hard to update it if that default changes. > > Well, it's easy enough to set the colors explicitly. After doing this > I noticed that green seems to be skipped. Not sure if that's a bug or > not. Hmm, yeah, that is weird. I think it's an artifact of the way we increment the color selector, though, and not related to your patch (the same thing happens before your fix, as well). > > I think it's OK to have a dedicated script for even these two tests, if > > it makes things easier to read. However, would we also want to test the > > octopus without the problematic graph here? I think if we just omit > > "left" we get that, don't we? > > t4202-log.sh already does test a "normal" octopus merge (starting > around line 615, search for "octopus-a"). But that is only a 3-parent > merge. And adding another test is easy enough. > [...] Thanks, what you have here looks good. > From cd9415b524357c2c8b9b20a63032c94e01d46a15 Mon Sep 17 00:00:00 2001 > From: Noam Postavsky > Date: Sat, 1 Sep 2018 20:07:16 -0400 > Subject: [PATCH v5] log: Fix coloring of certain octupus merge shapes This whole version looks good to me. "git am" is supposed to understand attachments, but it seems to want to apply our whole conversation as the commit message. You may want to repost one more time with this subject in the email subject line to fix that and to get the maintainer's attention. Feel free to add my: Reviewed-by: Jeff King after your signoff. Thanks for sticking with this topic! -Peff
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
Johannes Schindelin writes: > Hi Junio, > > On Fri, 5 Oct 2018, Junio C Hamano wrote: > >> "Johannes Schindelin via GitGitGadget" >> writes: >> >> > From: Johannes Schindelin >> > >> > The 'edit' command can be used to cherry-pick a commit and then >> > immediately drop out of the interactive rebase, with exit code 0, to let >> > the user amend the commit, or test it, or look around. >> ... >> If one wants to emulate this with the versions of Git that are >> currently deployed, would it be sufficient to insert "exec false" >> instead of "break"? > > There is one crucial difference: the exit code. OK, and it was good that you explicitly said "with exit code 0" in the log message. Together with the idea to update the doc I floated earlier, this probably is worth documenting, too. >> I think the earlier request asked for 'pause' (I didn't dig the list >> archive very carefully, though), > > No need to: I mentioned it in the cover letter. Here is the link again, > for your convenience: > https://public-inbox.org/git/20180118183618.39853-3-sbel...@google.com/ No, you misunderstood. I knew what message in the immediate past triggered this patch and that wasn't what I "could have dug for but didn't". "The earlier request" I meant was another one I recall that was made long time ago---that was what I "could have dug for but didn't". > Good initial version. I would like it to be a bit more precise about who > exits with what status. How about this: Sounds good. It is likely that I'll either forget or will continue to be too busy to pick textual pieces and assemble together myself, so I'll leave it as a left over bit for somebody reading from the sideline to send in a finished version if they care deeply enough ;-) Thanks.
Re: git log -S or -G
On Mon, Oct 8, 2018 at 8:22 PM Jeff King wrote: > > On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote: > > > Julia Lawall writes: > > > > >> Doing the same for -S is much harder at the machinery level, as it > > >> performs its thing without internally running "diff" twice, but just > > >> counts the number of occurrences of 'foo'---that is sufficient for > > >> its intended use, and more efficient. > > > > > > There is still the question of whether the number of occurrences of foo > > > decreases or increases. > > > > Hmph, taking the changes that makes the number of hits decrease > > would catch a subset of "changes that removes 'foo' only---I am not > > interested in the ones that adds 'foo'". It will avoid getting > > confused by a change that moves an existing 'foo' to another place > > in the same file (as the number of hits does not change), but at the > > same time, it will miss a change that genuinely removes an existing > > 'foo' and happens to add a 'foo' at a different place in the same > > file that is unrelated to the original 'foo'. Depending on the > > definition of "I am only interested in removed ones", that may or > > may not be acceptable. > > I think that is the best we could do for "-S", though, which is > inherently about counting hits. > > For "-G", we are literally grepping the diff. It does not seem > unreasonable to add the ability to grep only "-" or "+" lines, and the > interface for that should be pretty straightforward (a tri-state flag to > look in remove, added, or both lines). > > -Peff Yea. I know I've wanted something like this in the past. Thanks, Jake
Re: [PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
Junio C Hamano writes: > Antonio Ospite writes: > >> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading >> from .gitmodules succeeds and that writing to it fails when the file is >> not checked out. >> ... >> t/t7416-submodule-sparse-gitmodules.sh | 78 ++ > > This now triggers test-lint errors as the most recent maintenance > release took t/t7416 for something else. I'll do s/t7416-/t7418-/g > on the mailbox before running "git am -s" on this series. This is an unrelated tangent to the topic, but running "range-diff" on what has been queued on 'pu' since mid September and this replacement after doing the renaming was a surprisingly pleasant experience. In its comparison between 09/10 of the two iterations, it showed that 7416's name has been changed to 7418 but otherwise there is no change in the contents of that test script. FWIW, tbdiff also gets this right, so the pleasant experience was inherited without getting broken. Kudos should go to both Thomas and Dscho ;-).
Re: [PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
Antonio Ospite writes: > Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading > from .gitmodules succeeds and that writing to it fails when the file is > not checked out. > ... > t/t7416-submodule-sparse-gitmodules.sh | 78 ++ This now triggers test-lint errors as the most recent maintenance release took t/t7416 for something else. I'll do s/t7416-/t7418-/g on the mailbox before running "git am -s" on this series.
Re: git log -S or -G
On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote: > Julia Lawall writes: > > >> Doing the same for -S is much harder at the machinery level, as it > >> performs its thing without internally running "diff" twice, but just > >> counts the number of occurrences of 'foo'---that is sufficient for > >> its intended use, and more efficient. > > > > There is still the question of whether the number of occurrences of foo > > decreases or increases. > > Hmph, taking the changes that makes the number of hits decrease > would catch a subset of "changes that removes 'foo' only---I am not > interested in the ones that adds 'foo'". It will avoid getting > confused by a change that moves an existing 'foo' to another place > in the same file (as the number of hits does not change), but at the > same time, it will miss a change that genuinely removes an existing > 'foo' and happens to add a 'foo' at a different place in the same > file that is unrelated to the original 'foo'. Depending on the > definition of "I am only interested in removed ones", that may or > may not be acceptable. I think that is the best we could do for "-S", though, which is inherently about counting hits. For "-G", we are literally grepping the diff. It does not seem unreasonable to add the ability to grep only "-" or "+" lines, and the interface for that should be pretty straightforward (a tri-state flag to look in remove, added, or both lines). -Peff
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 09:54:13PM +0200, SZEDER Gábor wrote: > Runtimes tend to fluctuate quite a bit more on Travis CI compared to > my machine, but not this much, and it seems to be consistent so far. > > After scripting/querying the Travis CI API a bit, I found that from > the last 100 static analysis build jobs 78 did actully run 'make > coccicheck' [1], avaraging 470s for the whole build job, with only 4 > build job exceeding the 10min mark. > > I had maybe 6-8 build jobs running this patch over the last 2-3 days, > I think all of them were over 15min. (I restarted some of them, so I > don't have separate logs for all of them, hence the uncertainty.) So that's really weird and counter-intuitive, since we should be doing strictly less work. I know that spatch tries to parallelize itself, though from my tests, 1.0.4 does not. I wonder if the version in Travis differs in that respect and starts too many threads, and the extra time is going to contention and context switches. Have you tried passing "-j1" to spatch? My 1.0.4 does not even recognize it. That seems like a pretty unlikely explanation to me, but I am having trouble coming up with another one. I guess the other plausible thing is that the extra memory is forcing us into some slower path. E.g., a hypervisor may even be swapping, unbeknownst to the child OS, and it gets accounted in the child OS as "boy, that memory load was really slow", which becomes used CPU. That actually sounds more credible to me. -Peff
Re: [PATCH v3] coccicheck: process every source file at once
On Sat, Oct 06, 2018 at 10:42:57AM +0200, René Scharfe wrote: > > That's OK, too, assuming people would actually want to use it. I'm also > > OK shipping this (with the "make -j" fix you suggested) and seeing if > > anybody actually complains. I assume there are only a handful of people > > running coccicheck in the first place. > > FWIW, my development environment is a virtual machine with 1200MB RAM > and 900MB swap space. coccicheck takes almost eight minutes > sequentially, and four and a half minutes with -j4. > > Unsurprisingly, it fails after almost three minutes with the patch, > reporting that it ran out of memory. With 2900MB it fails after almost > two minutes, with 3000MB it succeeds after a good two minutes. > > time(1) says (for -j1): > > 433.30user 36.17system 7:49.84elapsed 99%CPU (0avgtext+0avgdata > 108212maxresident)k > 192inputs+1512outputs (0major+16409056minor)pagefaults 0swaps > > 129.74user 2.06system 2:13.27elapsed 98%CPU (0avgtext+0avgdata > 1884568maxresident)k > 236896inputs+1096outputs (795major+462129minor)pagefaults 0swaps > > So with the patch it's more than three times faster, but needs more > than seventeen times more memory. And I need a bigger VM. :-/ Yuck. :) So if we want to take this as a complaint, then I guess we can jump straight to implementing the fallback to the existing behavior (though it may be worth it for you to expand your VM to get the decreased CPU time). I'm still puzzled by Gábor's counter-intuitive CI numbers, though. -Peff
Re: [PATCH v5 0/4] Filter alternate references
On Mon, Oct 08, 2018 at 11:09:20AM -0700, Taylor Blau wrote: > Attached is (what I anticipate to be) the final re-roll of my series to > introduce 'core.alternateRefsCommand' and 'core.alternateRefsPrefixes' > in order to limit the ".have" advertisement when pushing over protocol > v1 to a repository with configured alternates. Thanks, this looks good to me! -Peff
Re: We should add a "git gc --auto" after "git clone" due to commit graph
On Mon, Oct 08, 2018 at 02:29:47PM -0400, Derrick Stolee wrote: > > > > But I'm afraid it will take a while until I get around to turn it into > > > > something presentable... > > > Do you have the code pushed somewhere public where one could take a look? > > > I > > > Do you have the code pushed somewhere public where one could take a > > > look? I could provide some early feedback. > > Nah, definitely not... I know full well how embarassingly broken this > > implementation is, I don't need others to tell me that ;) > There are two questions that I was hoping to answer by looking at your code: > > 1. How do you store your Bloom filter? Is it connected to the commit-graph > and split on a commit-by-commit basis (storing "$path" as a key), or is it > one huge Bloom filter (storing "$commitid:$path" as key)? I guess you've probably thought all of this through for your implementation, but let me pontificate. I'd have done it as one fixed-size filter per commit. Then you should be able to hash the path keys once, and apply the result as a bitwise query to each individual commit (I'm assuming that it's constant-time to access the filter for each, as an index into an mmap'd array, with the offset coming from a commit-graph entry we'd be able to look up anyway). I think it would also be easier to deal with maintenance, since each filter is independent (IIRC, you cannot delete from a bloom filter without re-adding all of the other keys). The obvious downside is that it's O(commits) storage instead of O(1). Now let me ponder a bit further afield. A bloom filter lets us answer the question "did this commit (probably) touch these paths?". But it does not let us answer "which paths did this commit touch?". That second one is less useful than you might think, because we almost always care about not just the names of the paths, but their actual object ids. Think about a --raw diff, or even traversing for reachability (where if we knew the tree-diff cheaply, we could avoid asking "have we seen this yet?" on most of the tree entries). The names alone can make that a bit faster, but in the worst case you still have to walk the whole tree to find their entries. But there's also a related question: how do we match pathspec patterns? For a changed path like "foo/bar/baz", I imagine a bloom filter would mark all of "foo", "foo/bar", and "foo/bar/baz". But what about "*.c"? I don't think a bloom filter can answer that. At least not by itself. If we imagine that the commit-graph also had an alphabetized list of every path in every tree, then it's easy: apply the glob to that list once to get a set of concrete paths, and then query the bloom filters for those. And that list actually isn't too big. The complete set of paths in linux.git is only about 300k gzipped (I think that's the most relevant measure, since it's an obvious win to avoid repeating shared prefixes of long paths). Imagine we have that list. Is a bloom filter still the best data structure for each commit? At the point that we have the complete universe of paths, we could give each commit a bitmap of changed paths. That lets us ask "did this commit touch these paths" (collect the bits from the list of paths, then check for 1's), as well as "what paths did we touch" (collect the 1 bits, and then index the path list). Those bitmaps should compress very well via EWAH or similar (most of them would be huge stretches of 0's punctuated by short runs of 1's). So that seems promising to me (or at least not an obvious dead-end). I do think maintenance gets to be a headache, though. Adding new paths potentially means reordering the bitmaps, which means O(commits) work to "incrementally" update the structure. (Unless you always add the new paths at the end, but then you lose fast lookups in the list; that might be an acceptable tradeoff). And finally, there's one more radical option: could we actually store a real per-commit tree-diff cache? I.e., imagine that each commit had the equivalent of a --raw diff easily accessible, including object ids. That would allow: - fast pathspec matches, including globs - fast --raw output (and faster -p output, since we can skip the tree entirely) - fast reachability traversals (we only need to bother to look at the objects for changed entries) where "fast" is basically O(size of commit's changes), rather than O(size of whole tree). This was one of the big ideas of packv4 that never materialized. You can _almost_ do it with packv2, since after all, we end up storing many trees as deltas. But those deltas are byte-wise so it's hard for a reader to convert them directly into a pure-tree diff (they also don't mention the "deleted" data, so it's really only half a diff). So let's imagine we'd store such a cache external to the regular object data (i.e., as a commit-graph entry). The "log --raw" diff of linux.git has 1.7M entries. The paths should easily compress to a single 32-bit integer (e.g., as an index int
Re: [PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
On Mon, 8 Oct 2018 at 22:43, Derrick Stolee wrote: > > On 10/8/2018 1:05 PM, Ananya Krishna Maram wrote: > > Hi All, > Hello, Ananya! Welcome. > > > I was searching through #leftovers and found this. > > https://public-inbox.org/git/cabpp-bgvvxcbzx44er6to-pusfen_6gnyj1u5cuon9deaa4...@mail.gmail.com/ > > > > This patch address the task discussed in the above link. > The discussion above seems to not be intended for your commit message, > but it does show up when I run `git am` and provide your email as input. > The typical way to avoid this is to place all commentary below the "---" Sorry, I didn't know that. Shall I re submit the patch with proper commentary. > that signifies the commit message is over. > > From: Ananya Krishan Maram > > > > skip the #include of git-compat-util.h since all .c files include it. > > > > Signed-off-by: Ananya Krishna Maram > > --- > > advice.h | 1 - > > commit-graph.h | 1 - > > hash.h | 1 - > > pkt-line.h | 1 - > > t/helper/test-tool.h | 1 - > > 5 files changed, 5 deletions(-) > > > > diff --git a/advice.h b/advice.h > > index ab24df0fd..09148baa6 100644 > > --- a/advice.h > > +++ b/advice.h > > @@ -1,7 +1,6 @@ > > #ifndef ADVICE_H > > #define ADVICE_H > > > > -#include "git-compat-util.h" > > > > extern int advice_push_update_rejected; > > extern int advice_push_non_ff_current; > > diff --git a/commit-graph.h b/commit-graph.h > > index b05047676..0e93c2bed 100644 > > --- a/commit-graph.h > > +++ b/commit-graph.h > > @@ -1,7 +1,6 @@ > > #ifndef COMMIT_GRAPH_H > > #define COMMIT_GRAPH_H > > > > -#include "git-compat-util.h" > > #include "repository.h" > > #include "string-list.h" > > #include "cache.h" > > diff --git a/hash.h b/hash.h > > index 7c8238bc2..9a4334c5d 100644 > > --- a/hash.h > > +++ b/hash.h > > @@ -1,7 +1,6 @@ > > #ifndef HASH_H > > #define HASH_H > > > > -#include "git-compat-util.h" > > > > #if defined(SHA1_PPC) > > #include "ppc/sha1.h" > > diff --git a/pkt-line.h b/pkt-line.h > > index 5b28d4347..fdd316494 100644 > > --- a/pkt-line.h > > +++ b/pkt-line.h > > @@ -1,7 +1,6 @@ > > #ifndef PKTLINE_H > > #define PKTLINE_H > > > > -#include "git-compat-util.h" > > #include "strbuf.h" > > > > /* > > diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h > > index e07495727..24e0a1589 100644 > > --- a/t/helper/test-tool.h > > +++ b/t/helper/test-tool.h > > @@ -1,7 +1,6 @@ > > #ifndef __TEST_TOOL_H__ > > #define __TEST_TOOL_H__ > > > > -#include "git-compat-util.h" > > > > int cmd__chmtime(int argc, const char **argv); > > int cmd__config(int argc, const char **argv); > I applied these changes locally and confirmed the code compiles, so all > .c files including these _do_ include git-compat-util.h properly. > > Thanks, > -Stolee >
Re: What's so special about objects/17/ ?
Ævar Arnfjörð Bjarmason writes: > Depending on how we're counting there's at least two. I thought you were asking "why the special sentinel is not 0{40}?" You counted the number of reasons why 0{40} is used to stand in for a real value, but that was the number I didn't find interesting in the scope of this discussion, i.e. "why the special sample is 17?" I vaguely recall we also used 0{39}1 for something else long time ago; I offhand do not recall if we still do, or we got rid of it.
Re: What's so special about objects/17/ ?
Stefan Beller writes: > On Sun, Oct 7, 2018 at 1:07 PM Junio C Hamano wrote: >> >> Junio C Hamano writes: > >> > ... >> > by general public and I do not have to explain the choice to the >> > general public ;-) >> >> One thing that is more important than "why not 00 but 17?" to answer >> is why a hardcoded number rather than a runtime random. It is for >> repeatability. > > Let's talk about repeatability vs statistics for a second. ;-) Oh, I think I misled you by saying "more important". I didn't mean that it is more important to stick to the "use hardcoded value" design decision than sticking to "use 17". I've made sure that everybody would understnd choosing any arbitrary byte value other than "17" does not make the resulting Git any better nor worse. But discussing the design decision to use hardcoded value is "more important", as that affects the balance between the end-user experience and debuggability, and I tried to help those who do not know the history by giving the fact that choice was made for the latter and not for other hidden reasons, that those who would propose to change the system may have to keep in mind. Sorry if you mistook it as if I were saying that it is important to keep the design to use a hardcoded byte value. That wasn't what the message was about.
Re: [PATCH] builtin/grep.c: remote superflous submodule code
> Well, submodule-config.c has its implementation and another caller, > which technically is outside submodule.c ;-) i.e. there is a typo in my commit message. I meant to say submodule-config.c > repo_read_gitmodules > has two more callers in unpack-trees.c these days, so perhaps we can > do without this last paragraph. Gah, looking at that code, did we have any reason to rush that series? c.f. https://public-inbox.org/git/20170811171811.gc1...@book.hvoigt.net/ On Sat, Oct 6, 2018 at 5:33 PM Junio C Hamano wrote: > > Stefan Beller writes: > > > After ff6f1f564c4 (submodule-config: lazy-load a repository's .gitmodules > > file, 2017-08-03) this is no longer necessary, but that commit did not > > cleanup the whole tree, but just show cased the new way how to deal with > > submodules in ls-files. > > The log message of the above one singles out "grep" as a special > case and explalins why it did not touch, by the way. You probably > need to explain the reason why "this is no longer necessary" a bit > better than the above---as it stands, it is "ff6f1f564c4 said it > still is necessary, I say it is not". That is true. For grep, the reason seems to be, that we check is_submodule_active based off the index, i.e. using module = submodule_from_path(repo, &null_oid, path); as the deciding factor, which falls in line with lazyloading. However the use of the specialized gitmodules_config_oid in grep is also guarded by the same commit ff6f1f564c4. Going back to the use case of unpack-trees.c, I think that we need to keep it there as alternatives seem to be more complicated. So I guess I'll just resend with a better commit message. Thanks, Stefan
Re: [PATCH 14/14] rerere: convert to use the_hash_algo
On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson wrote: > > Since this data is stored in the .git directory, it makes sense for us > to use the same hash algorithm for it as for everything else. Convert > the remaining uses of SHA-1 to use the_hash_algo. Use GIT_MAX_RAWSZ for > allocations. Rename various struct members, local variables, and a > function to be named "hash" instead of "sha1". > > Signed-off-by: brian m. carlson > --- > rerere.c | 81 +--- > 1 file changed, 42 insertions(+), 39 deletions(-) I have reviewed all patches and had minor questions on some of them, all others look good to me. Thanks, Stefan
Re: [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic
On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson wrote: > > With SHA-256, the length of the all-zeros object ID is longer. Add a > function to git-submodule.sh to check if a full hex object ID is the > all-zeros value, and use it to check the output we're parsing from git > diff-files or diff-index. > > Signed-off-by: brian m. carlson > --- Nice!
Re: git log -S or -G
Julia Lawall writes: >> Doing the same for -S is much harder at the machinery level, as it >> performs its thing without internally running "diff" twice, but just >> counts the number of occurrences of 'foo'---that is sufficient for >> its intended use, and more efficient. > > There is still the question of whether the number of occurrences of foo > decreases or increases. Hmph, taking the changes that makes the number of hits decrease would catch a subset of "changes that removes 'foo' only---I am not interested in the ones that adds 'foo'". It will avoid getting confused by a change that moves an existing 'foo' to another place in the same file (as the number of hits does not change), but at the same time, it will miss a change that genuinely removes an existing 'foo' and happens to add a 'foo' at a different place in the same file that is unrelated to the original 'foo'. Depending on the definition of "I am only interested in removed ones", that may or may not be acceptable.
Re: [PATCH 02/14] builtin/repack: replace hard-coded constant
On Mon, Oct 8, 2018 at 6:27 PM Stefan Beller wrote: > On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson > wrote: > > - if (line.len != 40) > > - die("repack: Expecting 40 character sha1 lines only > > from pack-objects."); > > + if (line.len != the_hash_algo->hexsz) > > + die("repack: Expecting full hex object ID lines > > only from pack-objects."); > > This is untranslated as it is plumbing? If so, maybe > > if (is_sha1(the_hash_algo) > die("repack: Expecting 40 character sh... > else > die(repack: Expecting full hex object ID in %s, of length %d", > the_hash_algo->name, > the_hash_algo->hexsz); Special-casing for SHA-1 seems overkill for an error message. A script expecting this particular error condition and this particular error message would be fragile indeed.
Re: We should add a "git gc --auto" after "git clone" due to commit graph
SZEDER Gábor writes: > There is certainly potential there. With a (very) rough PoC > experiment, a 8MB bloom filter, and a carefully choosen path I can > achieve a nice, almost 25x speedup: > > $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh > 6 > > real0m1.563s > user0m1.519s > sys 0m0.045s > > $ time GIT_USE_POC_BLOOM_FILTER=y ~/src/git/git rev-list --count HEAD -- > t/valgrind/valgrind.sh > 6 > > real0m0.063s > user0m0.043s > sys 0m0.020s Even though I somehow sense a sign of exaggeration in [v] in the pathname, it still is quite respectable. > bloom filter total queries: 16269 definitely not: 16195 maybe: 74 false > positives: 64 fp ratio: 0.003934 > > But I'm afraid it will take a while until I get around to turn it into > something presentable... That's OK. This is an encouraging result. Just from curiousity, how are you keying the filter? tree object name of the top-level and full path concatenated or something like that?
Re: [PATCH 06/14] packfile: express constants in terms of the_hash_algo
On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson wrote: > > Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid > dependence on a particular hash length. Unlike the previous patches, this is dealing directly with packfiles, which (I would think) carry their own hash function selector? (i.e. packfiles up to version 4 are sha1 hardcoded and version 5 and onwards will have a hash type field. Usually that hash type would match what is in the_repository, but you could obtain packfiles out of band, or the translation table that we plan to have might be part of the packfile/idx file?) > > Signed-off-by: brian m. carlson > --- > packfile.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/packfile.c b/packfile.c > index 841b36182f..17f993b5bf 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1121,13 +1121,14 @@ int unpack_object_header(struct packed_git *p, > void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1) > { > unsigned i; > + const unsigned hashsz = the_hash_algo->rawsz; > for (i = 0; i < p->num_bad_objects; i++) > - if (hasheq(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i)) > + if (hasheq(sha1, p->bad_object_sha1 + hashsz * i)) > return; > p->bad_object_sha1 = xrealloc(p->bad_object_sha1, > st_mult(GIT_MAX_RAWSZ, > st_add(p->num_bad_objects, 1))); > - hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, > sha1); > + hashcpy(p->bad_object_sha1 + hashsz * p->num_bad_objects, sha1); > p->num_bad_objects++; > } >
Re: [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo
> - /* This knows the pack format -- the 20-byte trailer > + /* This knows the pack format -- the hash trailer > * follows immediately after the last object data. While at it, fix the comment style? With or without the nit addressed, this patch (and patches 1 and 4) are Reviewed-by: Stefan Beller
Re: [PATCH 03/14] builtin/mktree: remove hard-coded constant
On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson wrote: > > Instead of using a hard-coded constant for the size of a hex object ID, > switch to use the computed pointer from parse_oid_hex that points after > the parsed object ID. > > Signed-off-by: brian m. carlson > --- > builtin/mktree.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/builtin/mktree.c b/builtin/mktree.c > index 2dc4ad6ba8..94e82b8504 100644 > --- a/builtin/mktree.c > +++ b/builtin/mktree.c > @@ -98,7 +98,7 @@ static void mktree_line(char *buf, size_t len, int > nul_term_line, int allow_miss > > *ntr++ = 0; /* now at the beginning of SHA1 */ > > - path = ntr + 41; /* at the beginning of name */ > + path = (char *)p + 1; /* at the beginning of name */ ... and we need the cast to un-const p such that path takes it. Makes sense.
Re: [PATCH 02/14] builtin/repack: replace hard-coded constant
On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson wrote: > > Signed-off-by: brian m. carlson > --- > builtin/repack.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/builtin/repack.c b/builtin/repack.c > index c6a7943d5c..e77859062d 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -407,8 +407,8 @@ int cmd_repack(int argc, const char **argv, const char > *prefix) > > out = xfdopen(cmd.out, "r"); > while (strbuf_getline_lf(&line, out) != EOF) { > - if (line.len != 40) > - die("repack: Expecting 40 character sha1 lines only > from pack-objects."); > + if (line.len != the_hash_algo->hexsz) > + die("repack: Expecting full hex object ID lines only > from pack-objects."); This is untranslated as it is plumbing? If so, maybe if (is_sha1(the_hash_algo) die("repack: Expecting 40 character sh... else die(repack: Expecting full hex object ID in %s, of length %d", the_hash_algo->name, the_hash_algo->hexsz);
Re: [PATCH v3 0/2] EditorConfig file
On Mon, Oct 08, 2018 at 10:03:51PM +, brian m. carlson wrote: > This series introduces an EditorConfig file to help developers using any > editor set their editor's settings in conformance with the Git Project's > settings. This is helpful for developers who work on different projects > with different indentation standards to keep their work in sync. > > Changes since v2: > * Add .pl and .pm files. > > Changes since v1: > * Add notes to both .editorconfig and .clang-format that they should be > kept in sync. > * Add commit message line length. Thanks for both of these. I think that v3 is ready for queueing, if other folks find it OK to have an .editorconfig in the repository. Therefore: Reviewed-by: Taylor Blau Thanks, Taylor
Re: [PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
> +test_expect_success 'not writing gitmodules config file when it is not > checked out' ' > +test_must_fail git -C super submodule--helper config > submodule.submodule.url newurl This only checks the exit code, do we also want to check for test_path_is_missing .gitmodules ? > +test_expect_success 'initialising submodule when the gitmodules config is > not checked out' ' > + git -C super submodule init > +' > + > +test_expect_success 'showing submodule summary when the gitmodules config is > not checked out' ' > + git -C super submodule summary > +' Same for these, is the exit code enough, or do we want to look at specific things? > + > +test_expect_success 'updating submodule when the gitmodules config is not > checked out' ' > + (cd submodule && > + echo file2 >file2 && > + git add file2 && > + git commit -m "add file2 to submodule" > + ) && > + git -C super submodule update git status would want to be clean afterwards?
[PATCH v3 0/2] EditorConfig file
This series introduces an EditorConfig file to help developers using any editor set their editor's settings in conformance with the Git Project's settings. This is helpful for developers who work on different projects with different indentation standards to keep their work in sync. Changes since v2: * Add .pl and .pm files. Changes since v1: * Add notes to both .editorconfig and .clang-format that they should be kept in sync. * Add commit message line length. brian m. carlson (2): editorconfig: provide editor settings for Git developers editorconfig: indicate settings should be kept in sync .clang-format | 2 ++ .editorconfig | 16 2 files changed, 18 insertions(+) create mode 100644 .editorconfig
[PATCH v3 1/2] editorconfig: provide editor settings for Git developers
Contributors to Git use a variety of editors, each with their own configuration files. Because C lacks the defined norms on how to indent and style code that other languages, such as Ruby and Rust, have, it's possible for various contributors, especially new ones, to have configured their editor to use a style other than the style the Git community prefers. To make automatically configuring one's editor easier, provide an EditorConfig file. This is an INI-style configuration file that can be used to specify editor settings and can be understood by a wide variety of editors. Some editors include this support natively; others require a plugin. Regardless, providing such a file allows users to automatically configure their editor of choice with the correct settings by default. Provide global settings to set the character set to UTF-8 and insert a final newline into files. Provide language-specific settings for C, Shell, Perl, and Python files according to what CodingGuidelines already specifies. Since the indentation of other files varies, especially certain AsciiDoc files, don't provide any settings for them until a clear consensus forward emerges. Set the line length for commit messages to 72 characters, which is the generally accepted line length for emails, since we send patches by email. Don't specify an end of line type. While the Git community uses Unix-style line endings in the repository, some Windows users may use Git's auto-conversion support and forcing Unix-style line endings might cause problems for those users. Finally, leave out a root directive, which would prevent reading other EditorConfig files higher up in the tree, in case someone wants to set the end of line type for their system in such a file. Signed-off-by: brian m. carlson --- .editorconfig | 14 ++ 1 file changed, 14 insertions(+) create mode 100644 .editorconfig diff --git a/.editorconfig b/.editorconfig new file mode 100644 index 00..8667740e1d --- /dev/null +++ b/.editorconfig @@ -0,0 +1,14 @@ +[*] +charset = utf-8 +insert_final_newline = true + +[*.{c,h,sh,perl,pl,pm}] +indent_style = tab +tab_width = 8 + +[*.py] +indent_style = space +indent_size = 4 + +[COMMIT_EDITMSG] +max_line_length = 72
[PATCH v3 2/2] editorconfig: indicate settings should be kept in sync
Now that we have two places where we set code formatting settings, .editorconfig and .clang-format, it's possible that they could fall out of sync. This is relatively unlikely, since we're not likely to change the tab width or our preference for tabs, but just in case, add comments to both files reminding future developers to keep them in sync. Signed-off-by: brian m. carlson --- .clang-format | 2 ++ .editorconfig | 2 ++ 2 files changed, 4 insertions(+) diff --git a/.clang-format b/.clang-format index 12a89f95f9..de1c8b5c77 100644 --- a/.clang-format +++ b/.clang-format @@ -6,6 +6,8 @@ # Use tabs whenever we need to fill whitespace that spans at least from one tab # stop to the next one. +# +# These settings are mirrored in .editorconfig. Keep them in sync. UseTab: Always TabWidth: 8 IndentWidth: 8 diff --git a/.editorconfig b/.editorconfig index 8667740e1d..42cdc4bbfb 100644 --- a/.editorconfig +++ b/.editorconfig @@ -2,6 +2,8 @@ charset = utf-8 insert_final_newline = true +# The settings for C (*.c and *.h) files are mirrored in .clang-format. Keep +# them in sync. [*.{c,h,sh,perl,pl,pm}] indent_style = tab tab_width = 8
[PATCH 11/14] apply: replace hard-coded constants
Replace several 40-based constants with references to GIT_MAX_HEXSZ or the_hash_algo, as appropriate. Signed-off-by: brian m. carlson --- apply.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/apply.c b/apply.c index e485fbc6bc..792ecea36a 100644 --- a/apply.c +++ b/apply.c @@ -223,8 +223,8 @@ struct patch { struct fragment *fragments; char *result; size_t resultsize; - char old_sha1_prefix[41]; - char new_sha1_prefix[41]; + char old_sha1_prefix[GIT_MAX_HEXSZ + 1]; + char new_sha1_prefix[GIT_MAX_HEXSZ + 1]; struct patch *next; /* three-way fallback result */ @@ -1093,9 +1093,10 @@ static int gitdiff_index(struct apply_state *state, */ const char *ptr, *eol; int len; + const unsigned hexsz = the_hash_algo->hexsz; ptr = strchr(line, '.'); - if (!ptr || ptr[1] != '.' || 40 < ptr - line) + if (!ptr || ptr[1] != '.' || hexsz < ptr - line) return 0; len = ptr - line; memcpy(patch->old_sha1_prefix, line, len); @@ -1109,7 +1110,7 @@ static int gitdiff_index(struct apply_state *state, ptr = eol; len = ptr - line; - if (40 < len) + if (hexsz < len) return 0; memcpy(patch->new_sha1_prefix, line, len); patch->new_sha1_prefix[len] = 0; @@ -3142,13 +3143,14 @@ static int apply_binary(struct apply_state *state, { const char *name = patch->old_name ? patch->old_name : patch->new_name; struct object_id oid; + const unsigned hexsz = the_hash_algo->hexsz; /* * For safety, we require patch index line to contain -* full 40-byte textual SHA1 for old and new, at least for now. +* full hex textual object ID for old and new, at least for now. */ - if (strlen(patch->old_sha1_prefix) != 40 || - strlen(patch->new_sha1_prefix) != 40 || + if (strlen(patch->old_sha1_prefix) != hexsz || + strlen(patch->new_sha1_prefix) != hexsz || get_oid_hex(patch->old_sha1_prefix, &oid) || get_oid_hex(patch->new_sha1_prefix, &oid)) return error(_("cannot apply binary patch to '%s' " @@ -4055,7 +4057,7 @@ static int preimage_oid_in_gitlink_patch(struct patch *p, struct object_id *oid) starts_with(++preimage, heading) && /* does it record full SHA-1? */ !get_oid_hex(preimage + sizeof(heading) - 1, oid) && - preimage[sizeof(heading) + GIT_SHA1_HEXSZ - 1] == '\n' && + preimage[sizeof(heading) + the_hash_algo->hexsz - 1] == '\n' && /* does the abbreviated name on the index line agree with it? */ starts_with(preimage + sizeof(heading) - 1, p->old_sha1_prefix)) return 0; /* it all looks fine */
[PATCH 07/14] refs/packed-backend: express constants using the_hash_algo
Switch uses of GIT_SHA1_HEXSZ to use the_hash_algo so that they are appropriate for the any given hash length. Signed-off-by: brian m. carlson --- refs/packed-backend.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 74e2996e93..c01c7f5901 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -274,8 +274,8 @@ struct snapshot_record { static int cmp_packed_ref_records(const void *v1, const void *v2) { const struct snapshot_record *e1 = v1, *e2 = v2; - const char *r1 = e1->start + GIT_SHA1_HEXSZ + 1; - const char *r2 = e2->start + GIT_SHA1_HEXSZ + 1; + const char *r1 = e1->start + the_hash_algo->hexsz + 1; + const char *r2 = e2->start + the_hash_algo->hexsz + 1; while (1) { if (*r1 == '\n') @@ -297,7 +297,7 @@ static int cmp_packed_ref_records(const void *v1, const void *v2) */ static int cmp_record_to_refname(const char *rec, const char *refname) { - const char *r1 = rec + GIT_SHA1_HEXSZ + 1; + const char *r1 = rec + the_hash_algo->hexsz + 1; const char *r2 = refname; while (1) { @@ -344,7 +344,7 @@ static void sort_snapshot(struct snapshot *snapshot) if (!eol) /* The safety check should prevent this. */ BUG("unterminated line found in packed-refs"); - if (eol - pos < GIT_SHA1_HEXSZ + 2) + if (eol - pos < the_hash_algo->hexsz + 2) die_invalid_line(snapshot->refs->path, pos, eof - pos); eol++; @@ -456,7 +456,7 @@ static void verify_buffer_safe(struct snapshot *snapshot) return; last_line = find_start_of_record(start, eof - 1); - if (*(eof - 1) != '\n' || eof - last_line < GIT_SHA1_HEXSZ + 2) + if (*(eof - 1) != '\n' || eof - last_line < the_hash_algo->hexsz + 2) die_invalid_line(snapshot->refs->path, last_line, eof - last_line); } @@ -796,7 +796,7 @@ static int next_record(struct packed_ref_iterator *iter) iter->base.flags = REF_ISPACKED; - if (iter->eof - p < GIT_SHA1_HEXSZ + 2 || + if (iter->eof - p < the_hash_algo->hexsz + 2 || parse_oid_hex(p, &iter->oid, &p) || !isspace(*p++)) die_invalid_line(iter->snapshot->refs->path, @@ -826,7 +826,7 @@ static int next_record(struct packed_ref_iterator *iter) if (iter->pos < iter->eof && *iter->pos == '^') { p = iter->pos + 1; - if (iter->eof - p < GIT_SHA1_HEXSZ + 1 || + if (iter->eof - p < the_hash_algo->hexsz + 1 || parse_oid_hex(p, &iter->peeled, &p) || *p++ != '\n') die_invalid_line(iter->snapshot->refs->path,
[PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo
Express the various constants used in terms of the_hash_algo. Signed-off-by: brian m. carlson --- pack-revindex.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/pack-revindex.c b/pack-revindex.c index bb521cf7fb..3756ec71a8 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -122,13 +122,14 @@ static void create_pack_revindex(struct packed_git *p) unsigned num_ent = p->num_objects; unsigned i; const char *index = p->index_data; + const unsigned hashsz = the_hash_algo->rawsz; ALLOC_ARRAY(p->revindex, num_ent + 1); index += 4 * 256; if (p->index_version > 1) { const uint32_t *off_32 = - (uint32_t *)(index + 8 + p->num_objects * (20 + 4)); + (uint32_t *)(index + 8 + p->num_objects * (hashsz + 4)); const uint32_t *off_64 = off_32 + p->num_objects; for (i = 0; i < num_ent; i++) { uint32_t off = ntohl(*off_32++); @@ -142,16 +143,16 @@ static void create_pack_revindex(struct packed_git *p) } } else { for (i = 0; i < num_ent; i++) { - uint32_t hl = *((uint32_t *)(index + 24 * i)); + uint32_t hl = *((uint32_t *)(index + (hashsz + 4) * i)); p->revindex[i].offset = ntohl(hl); p->revindex[i].nr = i; } } - /* This knows the pack format -- the 20-byte trailer + /* This knows the pack format -- the hash trailer * follows immediately after the last object data. */ - p->revindex[num_ent].offset = p->pack_size - 20; + p->revindex[num_ent].offset = p->pack_size - hashsz; p->revindex[num_ent].nr = -1; sort_revindex(p->revindex, num_ent, p->pack_size); }
[PATCH 12/14] apply: rename new_sha1_prefix and old_sha1_prefix
Rename these structure members to "new_oid_prefix" and "old_oid_prefix". Signed-off-by: brian m. carlson --- apply.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/apply.c b/apply.c index 792ecea36a..b9eb02ec12 100644 --- a/apply.c +++ b/apply.c @@ -223,8 +223,8 @@ struct patch { struct fragment *fragments; char *result; size_t resultsize; - char old_sha1_prefix[GIT_MAX_HEXSZ + 1]; - char new_sha1_prefix[GIT_MAX_HEXSZ + 1]; + char old_oid_prefix[GIT_MAX_HEXSZ + 1]; + char new_oid_prefix[GIT_MAX_HEXSZ + 1]; struct patch *next; /* three-way fallback result */ @@ -1099,8 +1099,8 @@ static int gitdiff_index(struct apply_state *state, if (!ptr || ptr[1] != '.' || hexsz < ptr - line) return 0; len = ptr - line; - memcpy(patch->old_sha1_prefix, line, len); - patch->old_sha1_prefix[len] = 0; + memcpy(patch->old_oid_prefix, line, len); + patch->old_oid_prefix[len] = 0; line = ptr + 2; ptr = strchr(line, ' '); @@ -1112,8 +1112,8 @@ static int gitdiff_index(struct apply_state *state, if (hexsz < len) return 0; - memcpy(patch->new_sha1_prefix, line, len); - patch->new_sha1_prefix[len] = 0; + memcpy(patch->new_oid_prefix, line, len); + patch->new_oid_prefix[len] = 0; if (*ptr == ' ') return gitdiff_oldmode(state, ptr + 1, patch); return 0; @@ -2205,7 +2205,7 @@ static void reverse_patches(struct patch *p) SWAP(p->new_mode, p->old_mode); SWAP(p->is_new, p->is_delete); SWAP(p->lines_added, p->lines_deleted); - SWAP(p->old_sha1_prefix, p->new_sha1_prefix); + SWAP(p->old_oid_prefix, p->new_oid_prefix); for (; frag; frag = frag->next) { SWAP(frag->newpos, frag->oldpos); @@ -3149,10 +3149,10 @@ static int apply_binary(struct apply_state *state, * For safety, we require patch index line to contain * full hex textual object ID for old and new, at least for now. */ - if (strlen(patch->old_sha1_prefix) != hexsz || - strlen(patch->new_sha1_prefix) != hexsz || - get_oid_hex(patch->old_sha1_prefix, &oid) || - get_oid_hex(patch->new_sha1_prefix, &oid)) + if (strlen(patch->old_oid_prefix) != hexsz || + strlen(patch->new_oid_prefix) != hexsz || + get_oid_hex(patch->old_oid_prefix, &oid) || + get_oid_hex(patch->new_oid_prefix, &oid)) return error(_("cannot apply binary patch to '%s' " "without full index line"), name); @@ -3162,7 +3162,7 @@ static int apply_binary(struct apply_state *state, * applies to. */ hash_object_file(img->buf, img->len, blob_type, &oid); - if (strcmp(oid_to_hex(&oid), patch->old_sha1_prefix)) + if (strcmp(oid_to_hex(&oid), patch->old_oid_prefix)) return error(_("the patch applies to '%s' (%s), " "which does not match the " "current contents."), @@ -3175,7 +3175,7 @@ static int apply_binary(struct apply_state *state, "'%s' but it is not empty"), name); } - get_oid_hex(patch->new_sha1_prefix, &oid); + get_oid_hex(patch->new_oid_prefix, &oid); if (is_null_oid(&oid)) { clear_image(img); return 0; /* deletion patch */ @@ -3191,7 +3191,7 @@ static int apply_binary(struct apply_state *state, if (!result) return error(_("the necessary postimage %s for " "'%s' cannot be read"), -patch->new_sha1_prefix, name); +patch->new_oid_prefix, name); clear_image(img); img->buf = result; img->len = size; @@ -3207,9 +3207,9 @@ static int apply_binary(struct apply_state *state, /* verify that the result matches */ hash_object_file(img->buf, img->len, blob_type, &oid); - if (strcmp(oid_to_hex(&oid), patch->new_sha1_prefix)) + if (strcmp(oid_to_hex(&oid), patch->new_oid_prefix)) return error(_("binary patch to '%s' creates incorrect result (expecting %s, got %s)"), - name, patch->new_sha1_prefix, oid_to_hex(&oid)); + name, patch->new_oid_prefix, oid_to_hex(&oid)); } return 0; @@ -3565,7 +3565,7 @@ static int try_threeway(struct apply_state *state, /* Preimage the patch was prepared for */ if (patch->is_new) write_object_file("",
[PATCH 13/14] submodule: make zero-oid comparison hash function agnostic
With SHA-256, the length of the all-zeros object ID is longer. Add a function to git-submodule.sh to check if a full hex object ID is the all-zeros value, and use it to check the output we're parsing from git diff-files or diff-index. Signed-off-by: brian m. carlson --- git-submodule.sh | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/git-submodule.sh b/git-submodule.sh index 1b568e29b9..c09eb3e03d 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -82,6 +82,11 @@ isnumber() n=$(($1 + 0)) 2>/dev/null && test "$n" = "$1" } +# Given a full hex object ID, is this the zero OID? +is_zero_oid () { + echo "$1" | sane_egrep '^0+$' >/dev/null 2>&1 +} + # Sanitize the local git environment for use within a submodule. We # can't simply use clear_local_git_env since we want to preserve some # of the settings from GIT_CONFIG_PARAMETERS. @@ -780,7 +785,7 @@ cmd_summary() { while read -r mod_src mod_dst sha1_src sha1_dst status name do if test -z "$cached" && - test $sha1_dst = + is_zero_oid $sha1_dst then case "$mod_dst" in 16)
[PATCH 14/14] rerere: convert to use the_hash_algo
Since this data is stored in the .git directory, it makes sense for us to use the same hash algorithm for it as for everything else. Convert the remaining uses of SHA-1 to use the_hash_algo. Use GIT_MAX_RAWSZ for allocations. Rename various struct members, local variables, and a function to be named "hash" instead of "sha1". Signed-off-by: brian m. carlson --- rerere.c | 81 +--- 1 file changed, 42 insertions(+), 39 deletions(-) diff --git a/rerere.c b/rerere.c index 7aa149e849..ceb98015ff 100644 --- a/rerere.c +++ b/rerere.c @@ -29,7 +29,7 @@ static int rerere_dir_alloc; #define RR_HAS_POSTIMAGE 1 #define RR_HAS_PREIMAGE 2 static struct rerere_dir { - unsigned char sha1[20]; + unsigned char hash[GIT_MAX_HEXSZ]; int status_alloc, status_nr; unsigned char *status; } **rerere_dir; @@ -52,7 +52,7 @@ static void free_rerere_id(struct string_list_item *item) static const char *rerere_id_hex(const struct rerere_id *id) { - return sha1_to_hex(id->collection->sha1); + return sha1_to_hex(id->collection->hash); } static void fit_variant(struct rerere_dir *rr_dir, int variant) @@ -115,7 +115,7 @@ static int is_rr_file(const char *name, const char *filename, int *variant) static void scan_rerere_dir(struct rerere_dir *rr_dir) { struct dirent *de; - DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->sha1))); + DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->hash))); if (!dir) return; @@ -133,24 +133,24 @@ static void scan_rerere_dir(struct rerere_dir *rr_dir) closedir(dir); } -static const unsigned char *rerere_dir_sha1(size_t i, void *table) +static const unsigned char *rerere_dir_hash(size_t i, void *table) { struct rerere_dir **rr_dir = table; - return rr_dir[i]->sha1; + return rr_dir[i]->hash; } static struct rerere_dir *find_rerere_dir(const char *hex) { - unsigned char sha1[20]; + unsigned char hash[GIT_MAX_RAWSZ]; struct rerere_dir *rr_dir; int pos; - if (get_sha1_hex(hex, sha1)) + if (get_sha1_hex(hex, hash)) return NULL; /* BUG */ - pos = sha1_pos(sha1, rerere_dir, rerere_dir_nr, rerere_dir_sha1); + pos = sha1_pos(hash, rerere_dir, rerere_dir_nr, rerere_dir_hash); if (pos < 0) { rr_dir = xmalloc(sizeof(*rr_dir)); - hashcpy(rr_dir->sha1, sha1); + hashcpy(rr_dir->hash, hash); rr_dir->status = NULL; rr_dir->status_nr = 0; rr_dir->status_alloc = 0; @@ -207,26 +207,27 @@ static void read_rr(struct string_list *rr) return; while (!strbuf_getwholeline(&buf, in, '\0')) { char *path; - unsigned char sha1[20]; + unsigned char hash[GIT_MAX_RAWSZ]; struct rerere_id *id; int variant; + const unsigned hexsz = the_hash_algo->hexsz; /* There has to be the hash, tab, path and then NUL */ - if (buf.len < 42 || get_sha1_hex(buf.buf, sha1)) + if (buf.len < hexsz + 2 || get_sha1_hex(buf.buf, hash)) die(_("corrupt MERGE_RR")); - if (buf.buf[40] != '.') { + if (buf.buf[hexsz] != '.') { variant = 0; - path = buf.buf + 40; + path = buf.buf + hexsz; } else { errno = 0; - variant = strtol(buf.buf + 41, &path, 10); + variant = strtol(buf.buf + hexsz + 1, &path, 10); if (errno) die(_("corrupt MERGE_RR")); } if (*(path++) != '\t') die(_("corrupt MERGE_RR")); - buf.buf[40] = '\0'; + buf.buf[hexsz] = '\0'; id = new_rerere_id_hex(buf.buf); id->variant = variant; string_list_insert(rr, path)->util = id; @@ -360,7 +361,7 @@ static void rerere_strbuf_putconflict(struct strbuf *buf, int ch, size_t size) } static int handle_conflict(struct strbuf *out, struct rerere_io *io, - int marker_size, git_SHA_CTX *ctx) + int marker_size, git_hash_ctx *ctx) { enum { RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL @@ -398,10 +399,12 @@ static int handle_conflict(struct strbuf *out, struct rerere_io *io, strbuf_addbuf(out, &two); rerere_strbuf_putconflict(out, '>', marker_size); if (ctx) { - git_SHA1_Update(ctx, one.buf ? one.buf : "", - one.len + 1); - git_SHA1_Update(ctx, two.buf ? two.b
[PATCH 10/14] tag: express constant in terms of the_hash_algo
Signed-off-by: brian m. carlson --- tag.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tag.c b/tag.c index 1db663d716..7445b8f6ea 100644 --- a/tag.c +++ b/tag.c @@ -144,7 +144,7 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u return 0; item->object.parsed = 1; - if (size < GIT_SHA1_HEXSZ + 24) + if (size < the_hash_algo->hexsz + 24) return -1; if (memcmp("object ", bufptr, 7) || parse_oid_hex(bufptr + 7, &oid, &bufptr) || *bufptr++ != '\n') return -1;
[PATCH 09/14] transport: use parse_oid_hex instead of a constant
Use parse_oid_hex to compute a pointer instead of using GIT_SHA1_HEXSZ. This simplifies the code and makes it independent of the hash length. Signed-off-by: brian m. carlson --- transport.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/transport.c b/transport.c index 1c76d64aba..44b9ddf670 100644 --- a/transport.c +++ b/transport.c @@ -1346,15 +1346,16 @@ static void read_alternate_refs(const char *path, fh = xfdopen(cmd.out, "r"); while (strbuf_getline_lf(&line, fh) != EOF) { struct object_id oid; + const char *p; - if (get_oid_hex(line.buf, &oid) || - line.buf[GIT_SHA1_HEXSZ] != ' ') { + if (parse_oid_hex(line.buf, &oid, &p) || + *p != ' ') { warning(_("invalid line while parsing alternate refs: %s"), line.buf); break; } - cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data); + cb(p + 1, &oid, data); } fclose(fh);
[PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation
Signed-off-by: brian m. carlson --- pack-bitmap-write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index fc82f37a02..6f0c78d6aa 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -37,7 +37,7 @@ struct bitmap_writer { struct progress *progress; int show_progress; - unsigned char pack_checksum[20]; + unsigned char pack_checksum[GIT_MAX_RAWSZ]; }; static struct bitmap_writer writer;
[PATCH 04/14] builtin/fetch-pack: remove constants with parse_oid_hex
Instead of using GIT_SHA1_HEXSZ, use parse_oid_hex to compute a pointer and use that in comparisons. This is both simpler to read and works independent of the hash length. Update references to SHA-1 in the same function to refer to object IDs instead. Signed-off-by: brian m. carlson --- builtin/fetch-pack.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 1a1bc63566..63e69a5801 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -16,13 +16,14 @@ static void add_sought_entry(struct ref ***sought, int *nr, int *alloc, { struct ref *ref; struct object_id oid; + const char *p; - if (!get_oid_hex(name, &oid)) { - if (name[GIT_SHA1_HEXSZ] == ' ') { - /* , find refname */ - name += GIT_SHA1_HEXSZ + 1; - } else if (name[GIT_SHA1_HEXSZ] == '\0') { - ; /* , leave sha1 as name */ + if (!parse_oid_hex(name, &oid, &p)) { + if (*p == ' ') { + /* , find refname */ + name = p + 1; + } else if (*p == '\0') { + ; /* , leave oid as name */ } else { /* , clear cruft from oid */ oidclr(&oid);
[PATCH 08/14] upload-pack: express constants in terms of the_hash_algo
Convert all uses of the GIT_SHA1_HEXSZ to use the_hash_algo so that they are appropriate for any given hash length. Signed-off-by: brian m. carlson --- upload-pack.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 62a1000f44..1aae5dd828 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -443,6 +443,7 @@ static int do_reachable_revlist(struct child_process *cmd, struct object *o; char namebuf[GIT_MAX_HEXSZ + 2]; /* ^ + hash + LF */ int i; + const unsigned hexsz = the_hash_algo->hexsz; cmd->argv = argv; cmd->git_cmd = 1; @@ -461,7 +462,7 @@ static int do_reachable_revlist(struct child_process *cmd, goto error; namebuf[0] = '^'; - namebuf[GIT_SHA1_HEXSZ + 1] = '\n'; + namebuf[hexsz + 1] = '\n'; for (i = get_max_object_index(); 0 < i; ) { o = get_indexed_object(--i); if (!o) @@ -470,11 +471,11 @@ static int do_reachable_revlist(struct child_process *cmd, o->flags &= ~TMP_MARK; if (!is_our_ref(o)) continue; - memcpy(namebuf + 1, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ); - if (write_in_full(cmd->in, namebuf, GIT_SHA1_HEXSZ + 2) < 0) + memcpy(namebuf + 1, oid_to_hex(&o->oid), hexsz); + if (write_in_full(cmd->in, namebuf, hexsz + 2) < 0) goto error; } - namebuf[GIT_SHA1_HEXSZ] = '\n'; + namebuf[hexsz] = '\n'; for (i = 0; i < src->nr; i++) { o = src->objects[i].item; if (is_our_ref(o)) { @@ -484,8 +485,8 @@ static int do_reachable_revlist(struct child_process *cmd, } if (reachable && o->type == OBJ_COMMIT) o->flags |= TMP_MARK; - memcpy(namebuf, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ); - if (write_in_full(cmd->in, namebuf, GIT_SHA1_HEXSZ + 1) < 0) + memcpy(namebuf, oid_to_hex(&o->oid), hexsz); + if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0) goto error; } close(cmd->in);
[PATCH 06/14] packfile: express constants in terms of the_hash_algo
Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid dependence on a particular hash length. Signed-off-by: brian m. carlson --- packfile.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/packfile.c b/packfile.c index 841b36182f..17f993b5bf 100644 --- a/packfile.c +++ b/packfile.c @@ -1121,13 +1121,14 @@ int unpack_object_header(struct packed_git *p, void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1) { unsigned i; + const unsigned hashsz = the_hash_algo->rawsz; for (i = 0; i < p->num_bad_objects; i++) - if (hasheq(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i)) + if (hasheq(sha1, p->bad_object_sha1 + hashsz * i)) return; p->bad_object_sha1 = xrealloc(p->bad_object_sha1, st_mult(GIT_MAX_RAWSZ, st_add(p->num_bad_objects, 1))); - hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1); + hashcpy(p->bad_object_sha1 + hashsz * p->num_bad_objects, sha1); p->num_bad_objects++; }
[PATCH 00/14] Hash function transition part 15
This is the fifteenth series in the ongoing hash function transition. This series includes several conversions to use the_hash_algo, combined with some use of parse_oid_hex and GIT_MAX_RAWSZ. None of the transformations here are especially surprising. brian m. carlson (14): pack-bitmap-write: use GIT_MAX_RAWSZ for allocation builtin/repack: replace hard-coded constant builtin/mktree: remove hard-coded constant builtin/fetch-pack: remove constants with parse_oid_hex pack-revindex: express constants in terms of the_hash_algo packfile: express constants in terms of the_hash_algo refs/packed-backend: express constants using the_hash_algo upload-pack: express constants in terms of the_hash_algo transport: use parse_oid_hex instead of a constant tag: express constant in terms of the_hash_algo apply: replace hard-coded constants apply: rename new_sha1_prefix and old_sha1_prefix submodule: make zero-oid comparison hash function agnostic rerere: convert to use the_hash_algo apply.c | 50 +- builtin/fetch-pack.c | 13 +++ builtin/mktree.c | 2 +- builtin/repack.c | 4 +-- git-submodule.sh | 7 +++- pack-bitmap-write.c | 2 +- pack-revindex.c | 9 ++--- packfile.c| 5 +-- refs/packed-backend.c | 14 rerere.c | 81 ++- tag.c | 2 +- transport.c | 7 ++-- upload-pack.c | 13 +++ 13 files changed, 112 insertions(+), 97 deletions(-)
[PATCH 03/14] builtin/mktree: remove hard-coded constant
Instead of using a hard-coded constant for the size of a hex object ID, switch to use the computed pointer from parse_oid_hex that points after the parsed object ID. Signed-off-by: brian m. carlson --- builtin/mktree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/mktree.c b/builtin/mktree.c index 2dc4ad6ba8..94e82b8504 100644 --- a/builtin/mktree.c +++ b/builtin/mktree.c @@ -98,7 +98,7 @@ static void mktree_line(char *buf, size_t len, int nul_term_line, int allow_miss *ntr++ = 0; /* now at the beginning of SHA1 */ - path = ntr + 41; /* at the beginning of name */ + path = (char *)p + 1; /* at the beginning of name */ if (!nul_term_line && path[0] == '"') { struct strbuf p_uq = STRBUF_INIT; if (unquote_c_style(&p_uq, path, NULL))
[PATCH 02/14] builtin/repack: replace hard-coded constant
Signed-off-by: brian m. carlson --- builtin/repack.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index c6a7943d5c..e77859062d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -407,8 +407,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) out = xfdopen(cmd.out, "r"); while (strbuf_getline_lf(&line, out) != EOF) { - if (line.len != 40) - die("repack: Expecting 40 character sha1 lines only from pack-objects."); + if (line.len != the_hash_algo->hexsz) + die("repack: Expecting full hex object ID lines only from pack-objects."); string_list_append(&names, line.buf); } fclose(out);
Re: [PATCH v2 1/2] editorconfig: provide editor settings for Git developers
On Mon, Oct 08, 2018 at 10:53:52PM +0200, Ævar Arnfjörð Bjarmason wrote: > > It looks like we can add at least "pm" and "pl" to that pattern: Sure. I didn't think we had any of those, but you've just proven me wrong. I'll send a v3 shortly. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
[PATCH] unpack-trees: allow missing CE_SKIP_WORKTREE objs
Whenever a sparse checkout occurs, the existence of all blobs in the index is verified, whether or not they are included or excluded by the .git/info/sparse-checkout specification. This degrades performance, significantly in the case of a partial clone, because a lazy fetch occurs whenever the existence of a missing blob is checked. At the point of invoking cache_tree_update() in unpack_trees(), CE_SKIP_WORKTREE is already set on all excluded blobs (mark_new_skip_worktree() is called twice to set CE_NEW_SKIP_WORKTREE, then apply_sparse_checkout() is called which copies over CE_NEW_SKIP_WORKTREE to CE_SKIP_WORKTREE). cache_tree_update() can use this information to know which blobs are excluded, and thus skip the checking of these. Because cache_tree_update() is used from multiple places, this new behavior is guarded by a new flag WRITE_TREE_SKIP_WORKTREE_MISSING_OK. Implement this new flag, and teach unpack_trees() to invoke cache_tree_update() with this new flag. Signed-off-by: Jonathan Tan --- cache-tree.c | 6 +- cache-tree.h | 4 t/t1090-sparse-checkout-scope.sh | 33 unpack-trees.c | 1 + 4 files changed, 43 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 5ce51468f0..340caf2d34 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -246,6 +246,8 @@ static int update_one(struct cache_tree *it, int missing_ok = flags & WRITE_TREE_MISSING_OK; int dryrun = flags & WRITE_TREE_DRY_RUN; int repair = flags & WRITE_TREE_REPAIR; + int skip_worktree_missing_ok = + flags & WRITE_TREE_SKIP_WORKTREE_MISSING_OK; int to_invalidate = 0; int i; @@ -356,7 +358,9 @@ static int update_one(struct cache_tree *it, } if (is_null_oid(oid) || - (mode != S_IFGITLINK && !missing_ok && !has_object_file(oid))) { + (mode != S_IFGITLINK && !missing_ok && +!(skip_worktree_missing_ok && ce_skip_worktree(ce)) && +!has_object_file(oid))) { strbuf_release(&buffer); if (expected_missing) return -1; diff --git a/cache-tree.h b/cache-tree.h index 0ab6784ffe..655d487619 100644 --- a/cache-tree.h +++ b/cache-tree.h @@ -40,6 +40,10 @@ void cache_tree_verify(struct index_state *); #define WRITE_TREE_DRY_RUN 4 #define WRITE_TREE_SILENT 8 #define WRITE_TREE_REPAIR 16 +/* + * Do not check for the presence of cache entry objects with CE_SKIP_WORKTREE. + */ +#define WRITE_TREE_SKIP_WORKTREE_MISSING_OK 32 /* error return codes */ #define WRITE_TREE_UNREADABLE_INDEX (-1) diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh index 25d7c700f6..090b7fc3d3 100755 --- a/t/t1090-sparse-checkout-scope.sh +++ b/t/t1090-sparse-checkout-scope.sh @@ -63,4 +63,37 @@ test_expect_success 'return to full checkout of master' ' test "$(cat b)" = "modified" ' +test_expect_success 'in partial clone, sparse checkout only fetches needed blobs' ' + test_create_repo server && + git clone "file://$(pwd)/server" client && + + test_config -C server uploadpack.allowfilter 1 && + test_config -C server uploadpack.allowanysha1inwant 1 && + echo a >server/a && + echo bb >server/b && + mkdir server/c && + echo ccc >server/c/c && + git -C server add a b c/c && + git -C server commit -m message && + + test_config -C client core.sparsecheckout 1 && + test_config -C client extensions.partialclone origin && + echo "!/*" >client/.git/info/sparse-checkout && + echo "/a" >>client/.git/info/sparse-checkout && + git -C client fetch --filter=blob:none origin && + git -C client checkout FETCH_HEAD && + + git -C client rev-list HEAD \ + --quiet --objects --missing=print >unsorted_actual && + ( + printf "?" && + git hash-object server/b && + printf "?" && + git hash-object server/c/c + ) >unsorted_expect && + sort unsorted_actual >actual && + sort unsorted_expect >expect && + test_cmp expect actual +' + test_done diff --git a/unpack-trees.c b/unpack-trees.c index 51bfac6aa0..39e0e7a6c7 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1635,6 +1635,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options o->result.cache_tree = cache_tree(); if (!cache_tree_fully_valid(o->result.cache_tree)) cache_tree_update(&o->result, + WRITE_TREE_SKIP_WORKTREE_MISSING_OK | WRITE_TREE_SILENT | WRITE_TREE_REPAIR); }
Re: [PATCH v2 1/2] editorconfig: provide editor settings for Git developers
On Mon, Oct 08 2018, brian m. carlson wrote: > Contributors to Git use a variety of editors, each with their own > configuration files. Because C lacks the defined norms on how to indent > and style code that other languages, such as Ruby and Rust, have, it's > possible for various contributors, especially new ones, to have > configured their editor to use a style other than the style the Git > community prefers. > > To make automatically configuring one's editor easier, provide an > EditorConfig file. This is an INI-style configuration file that can be > used to specify editor settings and can be understood by a wide variety > of editors. Some editors include this support natively; others require > a plugin. Regardless, providing such a file allows users to > automatically configure their editor of choice with the correct settings > by default. > > Provide global settings to set the character set to UTF-8 and insert a > final newline into files. Provide language-specific settings for C, > Shell, Perl, and Python files according to what CodingGuidelines already > specifies. Since the indentation of other files varies, especially > certain AsciiDoc files, don't provide any settings for them until a > clear consensus forward emerges. > > Set the line length for commit messages to 72 characters, which is the > generally accepted line length for emails, since we send patches by > email. > > Don't specify an end of line type. While the Git community uses > Unix-style line endings in the repository, some Windows users may use > Git's auto-conversion support and forcing Unix-style line endings might > cause problems for those users. > > Finally, leave out a root directive, which would prevent reading other > EditorConfig files higher up in the tree, in case someone wants to set > the end of line type for their system in such a file. > > Signed-off-by: brian m. carlson > --- > .editorconfig | 14 ++ > 1 file changed, 14 insertions(+) > create mode 100644 .editorconfig > > diff --git a/.editorconfig b/.editorconfig > new file mode 100644 > index 00..83227fa0b2 > --- /dev/null > +++ b/.editorconfig > @@ -0,0 +1,14 @@ > +[*] > +charset = utf-8 > +insert_final_newline = true > + > +[*.{c,h,sh,perl}] > +indent_style = tab > +tab_width = 8 It looks like we can add at least "pm" and "pl" to that pattern: $ git ls-files|grep -E -v -e '\.(c|h|sh|perl)$' | grep -F .| sed 's/.*\.//'|sort|uniq -c|sort -nr|head -n 15 631 txt 56 expect 48 po 41 test 40 tcl 34 gitignore 24 pm 18 patch 18 diff 16 pl 15 side 14 gitattributes 12 dump 11 sample 9 master > +[*.py] > +indent_style = space > +indent_size = 4 > + > +[COMMIT_EDITMSG] > +max_line_length = 72
[PATCH v2 0/2] EditorConfig file
This series introduces an EditorConfig file to help developers using any editor set their editor's settings in conformance with the Git Project's settings. This is helpful for developers who work on different projects with different indentation standards to keep their work in sync. Changes since v1: * Add notes to both .editorconfig and .clang-format that they should be kept in sync. * Add commit message line length. brian m. carlson (2): editorconfig: provide editor settings for Git developers editorconfig: indicate settings should be kept in sync .clang-format | 2 ++ .editorconfig | 16 2 files changed, 18 insertions(+) create mode 100644 .editorconfig
[PATCH v2 1/2] editorconfig: provide editor settings for Git developers
Contributors to Git use a variety of editors, each with their own configuration files. Because C lacks the defined norms on how to indent and style code that other languages, such as Ruby and Rust, have, it's possible for various contributors, especially new ones, to have configured their editor to use a style other than the style the Git community prefers. To make automatically configuring one's editor easier, provide an EditorConfig file. This is an INI-style configuration file that can be used to specify editor settings and can be understood by a wide variety of editors. Some editors include this support natively; others require a plugin. Regardless, providing such a file allows users to automatically configure their editor of choice with the correct settings by default. Provide global settings to set the character set to UTF-8 and insert a final newline into files. Provide language-specific settings for C, Shell, Perl, and Python files according to what CodingGuidelines already specifies. Since the indentation of other files varies, especially certain AsciiDoc files, don't provide any settings for them until a clear consensus forward emerges. Set the line length for commit messages to 72 characters, which is the generally accepted line length for emails, since we send patches by email. Don't specify an end of line type. While the Git community uses Unix-style line endings in the repository, some Windows users may use Git's auto-conversion support and forcing Unix-style line endings might cause problems for those users. Finally, leave out a root directive, which would prevent reading other EditorConfig files higher up in the tree, in case someone wants to set the end of line type for their system in such a file. Signed-off-by: brian m. carlson --- .editorconfig | 14 ++ 1 file changed, 14 insertions(+) create mode 100644 .editorconfig diff --git a/.editorconfig b/.editorconfig new file mode 100644 index 00..83227fa0b2 --- /dev/null +++ b/.editorconfig @@ -0,0 +1,14 @@ +[*] +charset = utf-8 +insert_final_newline = true + +[*.{c,h,sh,perl}] +indent_style = tab +tab_width = 8 + +[*.py] +indent_style = space +indent_size = 4 + +[COMMIT_EDITMSG] +max_line_length = 72
[PATCH v2 2/2] editorconfig: indicate settings should be kept in sync
Now that we have two places where we set code formatting settings, .editorconfig and .clang-format, it's possible that they could fall out of sync. This is relatively unlikely, since we're not likely to change the tab width or our preference for tabs, but just in case, add comments to both files reminding future developers to keep them in sync. Signed-off-by: brian m. carlson --- .clang-format | 2 ++ .editorconfig | 2 ++ 2 files changed, 4 insertions(+) diff --git a/.clang-format b/.clang-format index 12a89f95f9..de1c8b5c77 100644 --- a/.clang-format +++ b/.clang-format @@ -6,6 +6,8 @@ # Use tabs whenever we need to fill whitespace that spans at least from one tab # stop to the next one. +# +# These settings are mirrored in .editorconfig. Keep them in sync. UseTab: Always TabWidth: 8 IndentWidth: 8 diff --git a/.editorconfig b/.editorconfig index 83227fa0b2..98d7c5ff99 100644 --- a/.editorconfig +++ b/.editorconfig @@ -2,6 +2,8 @@ charset = utf-8 insert_final_newline = true +# The settings for C (*.c and *.h) files are mirrored in .clang-format. Keep +# them in sync. [*.{c,h,sh,perl}] indent_style = tab tab_width = 8
Re: What's so special about objects/17/ ?
On Sun, Oct 7, 2018 at 1:07 PM Junio C Hamano wrote: > > Junio C Hamano writes: > > ... > > by general public and I do not have to explain the choice to the > > general public ;-) > > One thing that is more important than "why not 00 but 17?" to answer > is why a hardcoded number rather than a runtime random. It is for > repeatability. Let's talk about repeatability vs statistics for a second. ;-) If I am a user and I were really into optimizing my loose object count for some reason, so I would want to choose a low number of gc.auto. Let's say I go with 128. At the low end of loose objects the approximation is yielding some high relative errors. This is because of the granularity, i.e. gc would implicitly estimate the loose objects to be 0 or 256 or 512, (or more) if there is 0, 1, 2 (or more) loose objects in the objects/17. As each object can be viewed following an unfair coin flip (With a chance of 1/256 it is in objects/17), the distribution in objects/17 (and hence any other objects/XX bin) follows the Bernoulli distribution. If I do have say about 157 loose objects (and having auto.gc configured anywhere in 1..255), then the probability to not gc is 54% (as that is the probability to have 0 objects in /17, following probability mass function of the Bernoulli distribution, (i.e. Pr(0 objects) = (157 over 0) x (1/256)^0 x (255/256)^157)) As it is repeatable (by picking the same /17 every time), I can run "gc --auto" multiple times and still have 157 loose objects, despite wanting to have only 128 loose objects at a 54% chance. If we'd roll the 256 dice every time to pick a different bin, then we might hit another bin and gc in the second or third gc, which would be more precise on average. By having repeatability we allow for these numbers to be far off more often when configuring small numbers. I think that is the right choice, as we probably do not care about the exactness of auto-gc for small numbers, as it is a performance thing anyway. Although documenting it properly might be a challenge. The current wording of auto.gc seems to suggest that we are right for the number as we compute it via the implying the expected value, (i.e. we pick a bin and multiply the fullness of the bin by the number of bins to estimate the whole fullness, see the mean=n p on [1]) I think a user would be far more interested in giving an upper bound, i.e. expressing something like "I will have at most $auto.gc objects before gc kicks in" or "The likelihood to exceed the $auto.gc number of loose objects by $this much is less than 5%", for which the math would be more complicated, but easier to document with the words of statistics. [1] https://en.wikipedia.org/wiki/Binomial_distribution Stefan
Re: We should add a "git gc --auto" after "git clone" due to commit graph
On 10/8/2018 2:10 PM, SZEDER Gábor wrote: On Mon, Oct 08, 2018 at 12:57:34PM -0400, Derrick Stolee wrote: Nice! These numbers make sense to me, in terms of how many TREESAME queries we actually need to perform for such a query. Yeah... because you didn't notice that I deliberately cheated :) As it turned out, it's not just about the number of diff queries that we can spare, but, for the speedup _ratio_, it's more about how expensive those diff queries are. git.git has a rather flat hierarchy, and 't/' is the 372th entry in the current root tree object, while 'valgrind/' is the 923th entry, and the diff machinery spends considerable time wading through the previous entries. Notice the "carefully chosen path" remark in my previous email; I think this particular path has the highest number of preceeding tree entries, and, in addition, 't/' changes rather frequently, so the diff machinery often has to scan two relatively big tree objects. Had I chosen 'Documentation/RelNotes/1.5.0.1.txt' instead, i.e. another path two directories deep, but whose leading path components are both near the beginning of the tree objects, the speedup would be much less impressive: 0.282s vs. 0.049s, i.e. "only" ~5.7x instead of ~24.8x. This is expected. The performance ratio is better when the path is any of the following: 1. A very deep path (need to walk multiple trees to answer TREESAME) 2. An entry is late in a very wide tree (need to spend extra time parsing tree object) 3. The path doesn't change very often (need to inspect many TREESAME pairs before finding enough interesting commits) 4. Some sub-path changes often (so the TREESAME comparison needs to parse beyond that sub-path often) Our standard examples (Git and Linux repos) don't have many paths that have these properties. But: they do exist. In other projects, this is actually typical. Think about Java projects that frequently have ~5 levels of folders before actually touching a code file. When I was implementing the Bloom filter feature for Azure Repos, I ran performance tests on the Linux repo using a random sampling of paths. The typical speedup was 5x while some outliers were in the 25x range. But I'm afraid it will take a while until I get around to turn it into something presentable... Do you have the code pushed somewhere public where one could take a look? I Do you have the code pushed somewhere public where one could take a look? I could provide some early feedback. Nah, definitely not... I know full well how embarassingly broken this implementation is, I don't need others to tell me that ;) There are two questions that I was hoping to answer by looking at your code: 1. How do you store your Bloom filter? Is it connected to the commit-graph and split on a commit-by-commit basis (storing "$path" as a key), or is it one huge Bloom filter (storing "$commitid:$path" as key)? 2. Where does your Bloom filter check plug into the TREESAME logic? I haven't investigated this part, but hopefully it isn't too complicated. Thanks, -Stolee
Re: t3404.6 breaks on master under GIT_FSMONITOR_TEST
On 10/8/2018 10:19 AM, Ævar Arnfjörð Bjarmason wrote: On Thu, Sep 06 2018, Ævar Arnfjörð Bjarmason wrote: On Thu, Feb 01 2018, Ævar Arnfjörð Bjarmason wrote: The GIT_FSMONITOR_TEST variable allows you to roundtrip the fsmonitor codpath in the whole test suite. On both Debian & CentOS this breaks for me: (cd t && GIT_FSMONITOR_TEST=$PWD/t7519/fsmonitor-all ./t3404-rebase-interactive.sh -i) Whereas this works: (cd t && GIT_FSMONITOR_TEST=$PWD/t7519/fsmonitor-all GIT_SKIP_TESTS=t3404.6 ./t3404-rebase-interactive.sh -i) The entirety of the rest of the test suite still passes with GIT_FSMONITOR_TEST. This has been failing ever since GIT_FSMONITOR_TEST was introduced in 883e248b8a ("fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.", 2017-09-22). Under -v -x -i: + echo test_must_fail: command succeeded: env FAKE_LINES=exec_echo_foo_>file1 1 git rebase -i HEAD^ test_must_fail: command succeeded: env FAKE_LINES=exec_echo_foo_>file1 1 git rebase -i HEAD^ + return 1 error: last command exited with $?=1 not ok 6 - rebase -i with the exec command checks tree cleanness # # git checkout master && # set_fake_editor && # test_must_fail env FAKE_LINES="exec_echo_foo_>file1 1" git rebase -i HEAD^ && Maybe once this is fixed running the test suite under GIT_FSMONITOR_TEST would be a useful Travis target, but I don't know the current status of adding new options to Travis. *Poke* at this again. Ben, or anyone else with knowledge of fsmonitor: Can you reproduce this? This failure along with the one I noted in https://public-inbox.org/git/87tvn2remn@evledraar.gmail.com/ is failing the tests on Linux when run with GIT_FSMONITOR_TEST. I'm looking at this again because SZEDER's patches to the split index reminded me again that we have these long-standing failures in rare test modes (see https://public-inbox.org/git/87va7ireuu@evledraar.gmail.com/ for the split index discussion). For what it's worth this is still broken, but more importantly (I'm not just keeping bumping the same thing) the only thing that's now broken under fsmonitor. I.e. my skip config is now GIT_SKIP_TESTS="t3404.7" whereas before 43f1180814 ("git-mv: allow submodules and fsmonitor to work together", 2018-09-10) I needed to add "t7411.3 t7411.4" to that. I glanced at this for a few minutes but it wasn't obvious what was happening. It will take some additional effort to dig into and figure out the underlying issue. I haven't forgotten about this - it's still on my list, just below some other things I need to get finished up first.
Re: We should add a "git gc --auto" after "git clone" due to commit graph
On Mon, Oct 08, 2018 at 12:57:34PM -0400, Derrick Stolee wrote: > On 10/8/2018 12:41 PM, SZEDER Gábor wrote: > >On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote: > >>I'm still excited about the prospect of a bloom filter for paths which > >>each commit touches. I think that's the next big frontier in getting > >>things like "git log -- path" to a reasonable run-time. > >There is certainly potential there. With a (very) rough PoC > >experiment, a 8MB bloom filter, and a carefully choosen path I can > >achieve a nice, almost 25x speedup: > > > > $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh > > 6 > > > > real0m1.563s > > user0m1.519s > > sys 0m0.045s > > > > $ time GIT_USE_POC_BLOOM_FILTER=y ~/src/git/git rev-list --count HEAD -- > > t/valgrind/valgrind.sh > > 6 > > > > real0m0.063s > > user0m0.043s > > sys 0m0.020s > > > > bloom filter total queries: 16269 definitely not: 16195 maybe: 74 false > > positives: 64 fp ratio: 0.003934 > Nice! These numbers make sense to me, in terms of how many TREESAME queries > we actually need to perform for such a query. Yeah... because you didn't notice that I deliberately cheated :) As it turned out, it's not just about the number of diff queries that we can spare, but, for the speedup _ratio_, it's more about how expensive those diff queries are. git.git has a rather flat hierarchy, and 't/' is the 372th entry in the current root tree object, while 'valgrind/' is the 923th entry, and the diff machinery spends considerable time wading through the previous entries. Notice the "carefully chosen path" remark in my previous email; I think this particular path has the highest number of preceeding tree entries, and, in addition, 't/' changes rather frequently, so the diff machinery often has to scan two relatively big tree objects. Had I chosen 'Documentation/RelNotes/1.5.0.1.txt' instead, i.e. another path two directories deep, but whose leading path components are both near the beginning of the tree objects, the speedup would be much less impressive: 0.282s vs. 0.049s, i.e. "only" ~5.7x instead of ~24.8x. > >But I'm afraid it will take a while until I get around to turn it into > >something presentable... > Do you have the code pushed somewhere public where one could take a look? I > Do you have the code pushed somewhere public where one could take a > look? I could provide some early feedback. Nah, definitely not... I know full well how embarassingly broken this implementation is, I don't need others to tell me that ;)
[PATCH v5 0/4] Filter alternate references
Hi, Attached is (what I anticipate to be) the final re-roll of my series to introduce 'core.alternateRefsCommand' and 'core.alternateRefsPrefixes' in order to limit the ".have" advertisement when pushing over protocol v1 to a repository with configured alternates. Not much has changed from last time, expect for: - Taking a documentation suggestion from Peff (in 3/4), and - Fixing a typo pointed out by Ramsay (in 4/4). I believe that this series is otherwise ready for queueing, if everyone else feels sufficiently OK about the changes. Thanks in advance for your review. Thanks, Taylor Jeff King (1): transport: drop refnames from for_each_alternate_ref Taylor Blau (3): transport.c: extract 'fill_alternate_refs_command' transport.c: introduce core.alternateRefsCommand transport.c: introduce core.alternateRefsPrefixes Documentation/config.txt | 18 + builtin/receive-pack.c | 3 +-- fetch-pack.c | 3 +-- t/t5410-receive-pack-alternates.sh | 41 ++ transport.c| 38 +-- transport.h| 2 +- 6 files changed, 92 insertions(+), 13 deletions(-) create mode 100755 t/t5410-receive-pack-alternates.sh Range-diff against v4: 1: 76482a7eba = 1: e4947f557b transport: drop refnames from for_each_alternate_ref 2: 120df009df = 2: 3d77a46c61 transport.c: extract 'fill_alternate_refs_command' 3: c63864c89a ! 3: 7451b4872a transport.c: introduce core.alternateRefsCommand @@ -42,14 +42,9 @@ + hex object id per line (i.e., the same as produce by `git for-each-ref + --format='%(objectname)'`). ++ -+This is useful when a repository only wishes to advertise some of its -+alternate's references as `.have`'s. For example, to only advertise branch -+heads, configure `core.alternateRefsCommand` to the path of a script which runs -+`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`. -++ -+Note that the configured value is executed in a shell, and thus -+linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle -+the path argument specially. ++Note that you cannot generally put `git for-each-ref` directly into the config ++value, as it does not take a repository path as an argument (but you can wrap ++the command above in a shell script). + core.bare:: If true this repository is assumed to be 'bare' and has no 4: 0f6cdc7ea4 ! 4: 28cbbe63f7 transport.c: introduce core.alternateRefsPrefixes @@ -39,8 +39,8 @@ --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ - linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle - the path argument specially. + value, as it does not take a repository path as an argument (but you can wrap + the command above in a shell script). +core.alternateRefsPrefixes:: + When listing references from an alternate, list only references that begin @@ -62,7 +62,7 @@ +test_expect_success 'with core.alternateRefsPrefixes' ' + test_config -C fork core.alternateRefsPrefixes "refs/heads/private" && -+ git rev-parse private/branch expect && ++ git rev-parse private/branch >expect && + printf "" | git receive-pack fork >actual && + extract_haves actual.haves && + test_cmp expect actual.haves -- 2.19.0.221.g150f307af
[PATCH v5 4/4] transport.c: introduce core.alternateRefsPrefixes
The recently-introduced "core.alternateRefsCommand" allows callers to specify with high flexibility the tips that they wish to advertise from alternates. This flexibility comes at the cost of some inconvenience when the caller only wishes to limit the advertisement to one or more prefixes. For example, to advertise only tags, a caller using 'core.alternateRefsCommand' would have to do: $ git config core.alternateRefsCommand ' \ f() { git -C "$1" for-each-ref \ refs/tags --format="%(objectname)" }; f "$@"' The above is cumbersome to write, so let's introduce a "core.alternateRefsPrefixes" to address this common case. Instead, the caller can run: $ git config core.alternateRefsPrefixes 'refs/tags' Which will behave identically to the longer example using "core.alternateRefsCommand". Since the value of "core.alternateRefsPrefixes" is appended to 'git for-each-ref' and then executed, include a "--" before taking the configured value to avoid misinterpreting arguments as flags to 'git for-each-ref'. In the case that the caller wishes to specify multiple prefixes, they may separate them by whitespace. If "core.alternateRefsCommand" is set, it will take precedence over "core.alternateRefsPrefixes". Signed-off-by: Taylor Blau --- Documentation/config.txt | 7 +++ t/t5410-receive-pack-alternates.sh | 8 transport.c| 5 + 3 files changed, 20 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index c51e82d8a5..a133a709f3 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -627,6 +627,13 @@ Note that you cannot generally put `git for-each-ref` directly into the config value, as it does not take a repository path as an argument (but you can wrap the command above in a shell script). +core.alternateRefsPrefixes:: + When listing references from an alternate, list only references that begin + with the given prefix. Prefixes match as if they were given as arguments to + linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with + whitespace. If `core.alternateRefsCommand` is set, setting + `core.alternateRefsPrefixes` has no effect. + core.bare:: If true this repository is assumed to be 'bare' and has no working directory associated with it. If this is the case a diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh index 49d0fe44fb..457c20c2a5 100755 --- a/t/t5410-receive-pack-alternates.sh +++ b/t/t5410-receive-pack-alternates.sh @@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' ' test_cmp expect actual.haves ' +test_expect_success 'with core.alternateRefsPrefixes' ' + test_config -C fork core.alternateRefsPrefixes "refs/heads/private" && + git rev-parse private/branch >expect && + printf "" | git receive-pack fork >actual && + extract_haves actual.haves && + test_cmp expect actual.haves +' + test_done diff --git a/transport.c b/transport.c index e271b66603..83474add28 100644 --- a/transport.c +++ b/transport.c @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd, argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path); argv_array_push(&cmd->args, "for-each-ref"); argv_array_push(&cmd->args, "--format=%(objectname)"); + + if (!git_config_get_value("core.alternateRefsPrefixes", &value)) { + argv_array_push(&cmd->args, "--"); + argv_array_split(&cmd->args, value); + } } cmd->env = local_repo_env; -- 2.19.0.221.g150f307af
[PATCH v5 2/4] transport.c: extract 'fill_alternate_refs_command'
To list alternate references, 'read_alternate_refs' creates a child process running 'git for-each-ref' in the alternate's Git directory. Prepare to run other commands besides 'git for-each-ref' by introducing and moving the relevant code from 'read_alternate_refs' to 'fill_alternate_refs_command'. Signed-off-by: Taylor Blau --- transport.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/transport.c b/transport.c index 2e0bc414d0..2825debac5 100644 --- a/transport.c +++ b/transport.c @@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url) return xstrdup(url); } +static void fill_alternate_refs_command(struct child_process *cmd, + const char *repo_path) +{ + cmd->git_cmd = 1; + argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path); + argv_array_push(&cmd->args, "for-each-ref"); + argv_array_push(&cmd->args, "--format=%(objectname)"); + cmd->env = local_repo_env; + cmd->out = -1; +} + static void read_alternate_refs(const char *path, alternate_ref_fn *cb, void *data) @@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path, struct strbuf line = STRBUF_INIT; FILE *fh; - cmd.git_cmd = 1; - argv_array_pushf(&cmd.args, "--git-dir=%s", path); - argv_array_push(&cmd.args, "for-each-ref"); - argv_array_push(&cmd.args, "--format=%(objectname)"); - cmd.env = local_repo_env; - cmd.out = -1; + fill_alternate_refs_command(&cmd, path); if (start_command(&cmd)) return; -- 2.19.0.221.g150f307af
[PATCH v5 3/4] transport.c: introduce core.alternateRefsCommand
When in a repository containing one or more alternates, Git would sometimes like to list references from those alternates. For example, 'git receive-pack' lists the "tips" pointed to by references in those alternates as special ".have" references. Listing ".have" references is designed to make pushing changes from upstream to a fork a lightweight operation, by advertising to the pusher that the fork already has the objects (via its alternate). Thus, the client can avoid sending them. However, when the alternate (upstream, in the previous example) has a pathologically large number of references, the initial advertisement is too expensive. In fact, it can dominate any such optimization where the pusher avoids sending certain objects. Introduce "core.alternateRefsCommand" in order to provide a facility to limit or filter alternate references. This can be used, for example, to filter out references the alternate does not wish to send (for space concerns, or otherwise) during the initial advertisement. Let the repository that has alternates configure this command to avoid trusting the alternate to provide us a safe command to run in the shell. To find the alternate, pass its absolute path as the first argument. Signed-off-by: Taylor Blau --- Documentation/config.txt | 11 ++ t/t5410-receive-pack-alternates.sh | 33 ++ transport.c| 19 + 3 files changed, 59 insertions(+), 4 deletions(-) create mode 100755 t/t5410-receive-pack-alternates.sh diff --git a/Documentation/config.txt b/Documentation/config.txt index ad0f4510c3..c51e82d8a5 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -616,6 +616,17 @@ core.preferSymlinkRefs:: This is sometimes needed to work with old scripts that expect HEAD to be a symbolic link. +core.alternateRefsCommand:: + When advertising tips of available history from an alternate, use the shell to + execute the specified command instead of linkgit:git-for-each-ref[1]. The + first argument is the absolute path of the alternate. Output must contain one + hex object id per line (i.e., the same as produce by `git for-each-ref + --format='%(objectname)'`). ++ +Note that you cannot generally put `git for-each-ref` directly into the config +value, as it does not take a repository path as an argument (but you can wrap +the command above in a shell script). + core.bare:: If true this repository is assumed to be 'bare' and has no working directory associated with it. If this is the case a diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh new file mode 100755 index 00..49d0fe44fb --- /dev/null +++ b/t/t5410-receive-pack-alternates.sh @@ -0,0 +1,33 @@ +#!/bin/sh + +test_description='git receive-pack with alternate ref filtering' + +. ./test-lib.sh + +test_expect_success 'setup' ' + test_commit base && + git clone -s --bare . fork && + git checkout -b public/branch master && + test_commit public && + git checkout -b private/branch master && + test_commit private +' + +extract_haves () { + depacketize | perl -lne '/^(\S+) \.have/ and print $1' +} + +test_expect_success 'with core.alternateRefsCommand' ' + write_script fork/alternate-refs <<-\EOF && + git --git-dir="$1" for-each-ref \ + --format="%(objectname)" \ + refs/heads/public/ + EOF + test_config -C fork core.alternateRefsCommand alternate-refs && + git rev-parse public/branch >expect && + printf "" | git receive-pack fork >actual && + extract_haves actual.haves && + test_cmp expect actual.haves +' + +test_done diff --git a/transport.c b/transport.c index 2825debac5..e271b66603 100644 --- a/transport.c +++ b/transport.c @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url) static void fill_alternate_refs_command(struct child_process *cmd, const char *repo_path) { - cmd->git_cmd = 1; - argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path); - argv_array_push(&cmd->args, "for-each-ref"); - argv_array_push(&cmd->args, "--format=%(objectname)"); + const char *value; + + if (!git_config_get_value("core.alternateRefsCommand", &value)) { + cmd->use_shell = 1; + + argv_array_push(&cmd->args, value); + argv_array_push(&cmd->args, repo_path); + } else { + cmd->git_cmd = 1; + + argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path); + argv_array_push(&cmd->args, "for-each-ref"); + argv_array_push(&cmd->args, "--format=%(objectname)"); + } + cmd->env = local_repo_env; cmd->out = -1; } -- 2.19.0.221.g150f307af
[PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref
From: Jeff King None of the current callers use the refname parameter we pass to their callbacks. In theory somebody _could_ do so, but it's actually quite weird if you think about it: it's a ref in somebody else's repository. So the name has no meaning locally, and in fact there may be duplicates if there are multiple alternates. The users of this interface really only care about seeing some ref tips, since that promises that the alternate has the full commit graph reachable from there. So let's keep the information we pass back to the bare minimum. Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- builtin/receive-pack.c | 3 +-- fetch-pack.c | 3 +-- transport.c| 6 +++--- transport.h| 2 +- 4 files changed, 6 insertions(+), 8 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 4d30001950..6792291f5e 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid, return 0; } -static void show_one_alternate_ref(const char *refname, - const struct object_id *oid, +static void show_one_alternate_ref(const struct object_id *oid, void *data) { struct oidset *seen = data; diff --git a/fetch-pack.c b/fetch-pack.c index 75047a4b2a..b643de143b 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -76,8 +76,7 @@ struct alternate_object_cache { size_t nr, alloc; }; -static void cache_one_alternate(const char *refname, - const struct object_id *oid, +static void cache_one_alternate(const struct object_id *oid, void *vcache) { struct alternate_object_cache *cache = vcache; diff --git a/transport.c b/transport.c index 1c76d64aba..2e0bc414d0 100644 --- a/transport.c +++ b/transport.c @@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path, cmd.git_cmd = 1; argv_array_pushf(&cmd.args, "--git-dir=%s", path); argv_array_push(&cmd.args, "for-each-ref"); - argv_array_push(&cmd.args, "--format=%(objectname) %(refname)"); + argv_array_push(&cmd.args, "--format=%(objectname)"); cmd.env = local_repo_env; cmd.out = -1; @@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path, struct object_id oid; if (get_oid_hex(line.buf, &oid) || - line.buf[GIT_SHA1_HEXSZ] != ' ') { + line.buf[GIT_SHA1_HEXSZ]) { warning(_("invalid line while parsing alternate refs: %s"), line.buf); break; } - cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data); + cb(&oid, data); } fclose(fh); diff --git a/transport.h b/transport.h index 01e717c29e..9baeca2d7a 100644 --- a/transport.h +++ b/transport.h @@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref); void transport_print_push_status(const char *dest, struct ref *refs, int verbose, int porcelain, unsigned int *reject_reasons); -typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *); +typedef void alternate_ref_fn(const struct object_id *oid, void *); extern void for_each_alternate_ref(alternate_ref_fn, void *); #endif -- 2.19.0.221.g150f307af
I want you to Distribute my funds to less priviledge if interested reply now
Re: [PATCH v11 8/8] list-objects-filter: implement filter tree:0
On Sat, Oct 6, 2018 at 5:10 PM Junio C Hamano wrote: > > As output made inside test_expect_{succcess,failure} are discarded > by default and shown while debugging tests, there is no strong > reason to use "grep -q" in our tests. I saw a few instances of > "grep -q" added in this series including this one > > test_must_fail grep -q "$file_4" observed > > that should probably be > > ! grep "$file_4" observed Yeah, I remember I read in the testing guidelines that you should just use ! for non-Git commands since it's not our job to make sure these tools are not crashing. Thank you for pointing this out. > > > + printf "blob\ncommit\ntree\n" >unique_types.expected && > > ... > > + printf "blob\ntree\n" >expected && > > Using test_write_lines is probably easier to read. Done. Below is an interdiff. Let me know if you want a reroll soon. Otherwise, I will send one later this week. - Matt diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 510d3537f..d9dccf4d4 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -69,7 +69,7 @@ test_expect_success 'get an error for missing tree object' ' test_must_fail git -C r5 pack-objects --rev --stdout 2>bad_tree <<-EOF && HEAD EOF -grep -q "bad tree object" bad_tree +grep "bad tree object" bad_tree ' test_expect_success 'setup for tests of tree:0' ' diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 53fbf7db8..392caa08f 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -192,7 +192,7 @@ test_expect_success 'use fsck before and after manually fetching a missing subtr xargs -n1 git -C dst cat-file -t >fetched_types && sort -u fetched_types >unique_types.observed && -printf "blob\ncommit\ntree\n" >unique_types.expected && +test_write_lines blob commit tree >unique_types.expected && test_cmp unique_types.expected unique_types.observed ' diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index c8e3d87c4..08e0c7db6 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -38,8 +38,8 @@ test_expect_success 'specify blob explicitly prevents filtering' ' awk -f print_2.awk) && git -C r1 rev-list --objects --filter=blob:none HEAD $file_3 >observed && -grep -q "$file_3" observed && -test_must_fail grep -q "$file_4" observed +grep "$file_3" observed && +! grep "$file_4" observed ' test_expect_success 'verify emitted+omitted == all' ' @@ -240,7 +240,7 @@ test_expect_success 'verify tree:0 includes trees in "filtered" output' ' xargs -n1 git -C r3 cat-file -t >unsorted_filtered_types && sort -u unsorted_filtered_types >filtered_types && -printf "blob\ntree\n" >expected && +test_write_lines blob tree >expected && test_cmp expected filtered_types '
Re: [PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
On 10/8/2018 1:05 PM, Ananya Krishna Maram wrote: Hi All, Hello, Ananya! Welcome. I was searching through #leftovers and found this. https://public-inbox.org/git/cabpp-bgvvxcbzx44er6to-pusfen_6gnyj1u5cuon9deaa4...@mail.gmail.com/ This patch address the task discussed in the above link. The discussion above seems to not be intended for your commit message, but it does show up when I run `git am` and provide your email as input. The typical way to avoid this is to place all commentary below the "---" that signifies the commit message is over. From: Ananya Krishan Maram skip the #include of git-compat-util.h since all .c files include it. Signed-off-by: Ananya Krishna Maram --- advice.h | 1 - commit-graph.h | 1 - hash.h | 1 - pkt-line.h | 1 - t/helper/test-tool.h | 1 - 5 files changed, 5 deletions(-) diff --git a/advice.h b/advice.h index ab24df0fd..09148baa6 100644 --- a/advice.h +++ b/advice.h @@ -1,7 +1,6 @@ #ifndef ADVICE_H #define ADVICE_H -#include "git-compat-util.h" extern int advice_push_update_rejected; extern int advice_push_non_ff_current; diff --git a/commit-graph.h b/commit-graph.h index b05047676..0e93c2bed 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -1,7 +1,6 @@ #ifndef COMMIT_GRAPH_H #define COMMIT_GRAPH_H -#include "git-compat-util.h" #include "repository.h" #include "string-list.h" #include "cache.h" diff --git a/hash.h b/hash.h index 7c8238bc2..9a4334c5d 100644 --- a/hash.h +++ b/hash.h @@ -1,7 +1,6 @@ #ifndef HASH_H #define HASH_H -#include "git-compat-util.h" #if defined(SHA1_PPC) #include "ppc/sha1.h" diff --git a/pkt-line.h b/pkt-line.h index 5b28d4347..fdd316494 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -1,7 +1,6 @@ #ifndef PKTLINE_H #define PKTLINE_H -#include "git-compat-util.h" #include "strbuf.h" /* diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index e07495727..24e0a1589 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -1,7 +1,6 @@ #ifndef __TEST_TOOL_H__ #define __TEST_TOOL_H__ -#include "git-compat-util.h" int cmd__chmtime(int argc, const char **argv); int cmd__config(int argc, const char **argv); I applied these changes locally and confirmed the code compiles, so all .c files including these _do_ include git-compat-util.h properly. Thanks, -Stolee
[PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
Hi All, I was searching through #leftovers and found this. https://public-inbox.org/git/cabpp-bgvvxcbzx44er6to-pusfen_6gnyj1u5cuon9deaa4...@mail.gmail.com/ This patch address the task discussed in the above link. From: Ananya Krishan Maram skip the #include of git-compat-util.h since all .c files include it. Signed-off-by: Ananya Krishna Maram --- advice.h | 1 - commit-graph.h | 1 - hash.h | 1 - pkt-line.h | 1 - t/helper/test-tool.h | 1 - 5 files changed, 5 deletions(-) diff --git a/advice.h b/advice.h index ab24df0fd..09148baa6 100644 --- a/advice.h +++ b/advice.h @@ -1,7 +1,6 @@ #ifndef ADVICE_H #define ADVICE_H -#include "git-compat-util.h" extern int advice_push_update_rejected; extern int advice_push_non_ff_current; diff --git a/commit-graph.h b/commit-graph.h index b05047676..0e93c2bed 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -1,7 +1,6 @@ #ifndef COMMIT_GRAPH_H #define COMMIT_GRAPH_H -#include "git-compat-util.h" #include "repository.h" #include "string-list.h" #include "cache.h" diff --git a/hash.h b/hash.h index 7c8238bc2..9a4334c5d 100644 --- a/hash.h +++ b/hash.h @@ -1,7 +1,6 @@ #ifndef HASH_H #define HASH_H -#include "git-compat-util.h" #if defined(SHA1_PPC) #include "ppc/sha1.h" diff --git a/pkt-line.h b/pkt-line.h index 5b28d4347..fdd316494 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -1,7 +1,6 @@ #ifndef PKTLINE_H #define PKTLINE_H -#include "git-compat-util.h" #include "strbuf.h" /* diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index e07495727..24e0a1589 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -1,7 +1,6 @@ #ifndef __TEST_TOOL_H__ #define __TEST_TOOL_H__ -#include "git-compat-util.h" int cmd__chmtime(int argc, const char **argv); int cmd__config(int argc, const char **argv); -- 2.19.0.272.ga00e0029e
Re: We should add a "git gc --auto" after "git clone" due to commit graph
On 10/8/2018 12:41 PM, SZEDER Gábor wrote: On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote: I'm still excited about the prospect of a bloom filter for paths which each commit touches. I think that's the next big frontier in getting things like "git log -- path" to a reasonable run-time. There is certainly potential there. With a (very) rough PoC experiment, a 8MB bloom filter, and a carefully choosen path I can achieve a nice, almost 25x speedup: $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh 6 real0m1.563s user0m1.519s sys 0m0.045s $ time GIT_USE_POC_BLOOM_FILTER=y ~/src/git/git rev-list --count HEAD -- t/valgrind/valgrind.sh 6 real0m0.063s user0m0.043s sys 0m0.020s bloom filter total queries: 16269 definitely not: 16195 maybe: 74 false positives: 64 fp ratio: 0.003934 Nice! These numbers make sense to me, in terms of how many TREESAME queries we actually need to perform for such a query. But I'm afraid it will take a while until I get around to turn it into something presentable... Do you have the code pushed somewhere public where one could take a look? I could provide some early feedback. Thanks, -Stolee
Re: We should add a "git gc --auto" after "git clone" due to commit graph
On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote: > I'm still excited about the prospect of a bloom filter for paths which > each commit touches. I think that's the next big frontier in getting > things like "git log -- path" to a reasonable run-time. There is certainly potential there. With a (very) rough PoC experiment, a 8MB bloom filter, and a carefully choosen path I can achieve a nice, almost 25x speedup: $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh 6 real0m1.563s user0m1.519s sys 0m0.045s $ time GIT_USE_POC_BLOOM_FILTER=y ~/src/git/git rev-list --count HEAD -- t/valgrind/valgrind.sh 6 real0m0.063s user0m0.043s sys 0m0.020s bloom filter total queries: 16269 definitely not: 16195 maybe: 74 false positives: 64 fp ratio: 0.003934 But I'm afraid it will take a while until I get around to turn it into something presentable...
Loan
-- Are you a Business Owner or a Financial Consultant looking for a Business loan or are you looking for a Personal loan. Our interest rates are affordable, ranging from 3% to 4.5% interest rate per annual depends on the loan type. If interested, please contact us today here on email: lawrencecorpor...@gmail.com
Re: [PATCH v2 0/5] Fix the racy split index problem
On Mon, Oct 08, 2018 at 04:54:51PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> Thanks. I had ~400 runs of the tests I ran before and they were all > >> OK. Now trying also with t1701 (which I hadn't noticed was a new > >> test...). > > > > Ran that overnight with the same conditions as before. 2683 OK runs and > > 0 failures (and counting). So it seems like the combination of the two > > fixed the split index bugs. > > I forgot I ad this running, and got up to 45482 OKs and 0 FAILs before > finally Ctrl+C-ing it now :) Yay! \o/ Thanks for testing, and I feel sorry for your poor machine... I will send an updated version to address the (minor) points raised in [1]... well, most likely not today, but hopefully soon-ish. 1 - https://public-inbox.org/git/20180929091429.GF23446@localhost/
Re:Business proposition for you
Hello, Business proposition for you. I have a client from Syrian who will like to invest with your company. My client is willing to invest $4 Million. Can I have your company website to show to my client your company so that they will check and decide if they will invest there funds with you as joint partner. This information is needed urgently. Please reply. Best Regards, Agent Melvin Greg Tel:+1 4045966532
[PATCH 1/3] midx: fix broken free() in close_midx()
From: Derrick Stolee When closing a multi-pack-index, we intend to close each pack-file and free the struct packed_git that represents it. However, this line was previously freeing the array of pointers, not the pointer itself. This leads to a double-free issue. Signed-off-by: Derrick Stolee --- midx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/midx.c b/midx.c index f3e8dbc108..999717b96f 100644 --- a/midx.c +++ b/midx.c @@ -190,7 +190,7 @@ static void close_midx(struct multi_pack_index *m) for (i = 0; i < m->num_packs; i++) { if (m->packs[i]) { close_pack(m->packs[i]); - free(m->packs); + free(m->packs[i]); } } FREE_AND_NULL(m->packs); -- gitgitgadget
[PATCH 0/3] Add GIT_TEST_MULTI_PACK_INDEX environment variable
To increase coverage of the multi-pack-index feature, add a GIT_TEST_MULTI_PACK_INDEX environment variable similar to other GIT_TEST_* variables. After creating the environment variable and running the test suite with it enabled, I found a few bugs in the multi-pack-index implementation. These are handled by the first two patches. I have set up a CI build on Azure Pipelines [1] that runs the test suite with a few optional features enabled, including GIT_TEST_MULTI_PACK_INDEX and GIT_TEST_COMMIT_GRAPH. I'll use this to watch the features and ensure they work well with the rest of the ongoing work. Eventually, we can add these variables to the Travis CI scripts. [1] https://git.visualstudio.com/git/_build?definitionId=4 Derrick Stolee (3): midx: fix broken free() in close_midx() midx: close multi-pack-index on repack multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX builtin/repack.c| 8 midx.c | 17 + midx.h | 4 t/README| 4 t/t5310-pack-bitmaps.sh | 1 + t/t5319-multi-pack-index.sh | 2 +- t/t9300-fast-import.sh | 2 +- 7 files changed, 32 insertions(+), 6 deletions(-) base-commit: f84b9b09d40408cf91bbc500d9f190a7866c3e0f Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-27%2Fderrickstolee%2Fmidx-test%2Fupstream-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-27/derrickstolee/midx-test/upstream-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/27 -- gitgitgadget
[PATCH 2/3] midx: close multi-pack-index on repack
From: Derrick Stolee When repacking, we may remove pack-files. This invalidates the multi-pack-index (if it exists). Previously, we removed the multi-pack-index file before removing any pack-file. In some cases, the repack command may load the multi-pack-index into memory. This may lead to later in-memory references to the non-existent pack- files. Signed-off-by: Derrick Stolee --- builtin/repack.c | 4 midx.c | 6 +- midx.h | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index c6a7943d5c..7925bb976e 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -432,6 +432,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!midx_cleared) { /* if we move a packfile, it will invalidated the midx */ + if (the_repository->objects) { + close_midx(the_repository->objects->multi_pack_index); + the_repository->objects->multi_pack_index = NULL; + } clear_midx_file(get_object_directory()); midx_cleared = 1; } diff --git a/midx.c b/midx.c index 999717b96f..fe8532a9d1 100644 --- a/midx.c +++ b/midx.c @@ -180,9 +180,13 @@ cleanup_fail: return NULL; } -static void close_midx(struct multi_pack_index *m) +void close_midx(struct multi_pack_index *m) { uint32_t i; + + if (!m) + return; + munmap((unsigned char *)m->data, m->data_len); close(m->fd); m->fd = -1; diff --git a/midx.h b/midx.h index a210f1af2a..af6b5cb58f 100644 --- a/midx.h +++ b/midx.h @@ -44,4 +44,6 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i int write_midx_file(const char *object_dir); void clear_midx_file(const char *object_dir); +void close_midx(struct multi_pack_index *m); + #endif -- gitgitgadget
[PATCH 3/3] multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX
From: Derrick Stolee The multi-pack-index feature is tested in isolation by t5319-multi-pack-index.sh, but there are many more interesting scenarios in the test suite surrounding pack-file data shapes and interactions. Since the multi-pack-index is an optional data structure, it does not make sense to include it by default in those tests. Instead, add a new GIT_TEST_MULTI_PACK_INDEX environment variable that enables core.multiPackIndex and writes a multi-pack-index after each 'git repack' command. This adds extra test coverage when needed. There are a few spots in the test suite that need to react to this change: * t5319-multi-pack-index.sh: there is a test that checks that 'git repack' deletes the multi-pack-index. Disable the environment variable to ensure this still happens. * t5310-pack-bitmaps.sh: One test moves a pack-file from the object directory to an alternate. This breaks the multi-pack-index, so delete the multi-pack-index at this point, if it exists. * t9300-fast-import.sh: One test verifies the number of files in the .git/objects/pack directory is exactly 8. Exclude the multi-pack-index from this count so it is still 8 in all cases. Signed-off-by: Derrick Stolee --- builtin/repack.c| 4 midx.c | 9 +++-- midx.h | 2 ++ t/README| 4 t/t5310-pack-bitmaps.sh | 1 + t/t5319-multi-pack-index.sh | 2 +- t/t9300-fast-import.sh | 2 +- 7 files changed, 20 insertions(+), 4 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 7925bb976e..418442bfe2 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -558,6 +558,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!no_update_server_info) update_server_info(0); remove_temporary_files(); + + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) + write_midx_file(get_object_directory()); + string_list_clear(&names, 0); string_list_clear(&rollback, 0); string_list_clear(&existing_packs, 0); diff --git a/midx.c b/midx.c index fe8532a9d1..aeafb58fa3 100644 --- a/midx.c +++ b/midx.c @@ -338,9 +338,14 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i struct multi_pack_index *m; struct multi_pack_index *m_search; int config_value; + static int env_value = -1; - if (repo_config_get_bool(r, "core.multipackindex", &config_value) || - !config_value) + if (env_value < 0) + env_value = git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0); + + if (!env_value && + (repo_config_get_bool(r, "core.multipackindex", &config_value) || + !config_value)) return 0; for (m_search = r->objects->multi_pack_index; m_search; m_search = m_search->next) diff --git a/midx.h b/midx.h index af6b5cb58f..bec8f73d28 100644 --- a/midx.h +++ b/midx.h @@ -3,6 +3,8 @@ #include "repository.h" +#define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" + struct multi_pack_index { struct multi_pack_index *next; diff --git a/t/README b/t/README index 3ea6c85460..9d0277c338 100644 --- a/t/README +++ b/t/README @@ -327,6 +327,10 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the commit-graph to be written after every 'git commit' command, and overrides the 'core.commitGraph' setting to true. +GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack- +index to be written after every 'git repack' command, and overrides the +'core.multiPackIndex' setting to true. + Naming Tests diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 1be3459c5b..82d7f7f6a5 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -191,6 +191,7 @@ test_expect_success 'pack-objects respects --honor-pack-keep (local bitmapped pa test_expect_success 'pack-objects respects --local (non-local bitmapped pack)' ' mv .git/objects/pack/$packbitmap.* alt.git/objects/pack/ && + rm -f .git/objects/pack/multi-pack-index && test_when_finished "mv alt.git/objects/pack/$packbitmap.* .git/objects/pack/" && echo HEAD | git pack-objects --local --stdout --revs >3b.pack && git index-pack 3b.pack && diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 6f56b38674..4024ff9a39 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -152,7 +152,7 @@ compare_results_with_midx "twelve packs" test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && - git repack -adf && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -adf && test_path_is_missing $objdir/pack/multi-pack-index ' diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 40fe7e4976..59a13b6a77 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -1558,7 +1558,7 @@
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
On 10/8/2018 10:58 AM, Ævar Arnfjörð Bjarmason wrote: On Mon, Oct 08 2018, Derrick Stolee wrote: On 10/8/2018 9:43 AM, Ævar Arnfjörð Bjarmason wrote: On Tue, Aug 28 2018, Derrick Stolee via GitGitGadget wrote: From: Derrick Stolee The commit-graph feature is tested in isolation by t5318-commit-graph.sh and t6600-test-reach.sh, but there are many more interesting scenarios involving commit walks. Many of these scenarios are covered by the existing test suite, but we need to maintain coverage when the optional commit-graph structure is not present. To allow running the full test suite with the commit-graph present, add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try to load the commit-graph when parsing commits, and writes the commit-graph file after every 'git commit' command. There are a few tests that rely on commits not existing in pack-files to trigger important events, so manually set GIT_TEST_COMMIT_GRAPH to false for the necessary commands. There is one test in t6024-recursive-merge.sh that relies on the merge-base algorithm picking one of two ambiguous merge-bases, and the commit-graph feature changes which merge-base is picked. The test feature itself seems fine, but this consistently fails ever since it got introduced (a reset --hard on the commit merged to msater in git.git): GIT_TEST_COMMIT_GRAPH=true prove -j$(parallel --number-of-cores) t5500-fetch-pack.sh t6001-rev-list-graft.sh t6050-replace.sh Test Summary Report --- t6001-rev-list-graft.sh (Wstat: 256 Tests: 14 Failed: 6) Failed tests: 3, 5, 7, 9, 11, 13 Non-zero exit status: 1 t6050-replace.sh (Wstat: 256 Tests: 35 Failed: 9) Failed tests: 12-16, 24-25, 30, 35 Non-zero exit status: 1 t5500-fetch-pack.sh(Wstat: 256 Tests: 357 Failed: 1) Failed test: 351 Non-zero exit status: 1 This is on Linux/Debian 4.17.0-1-amd64. Can you reproduce this? If not I can provide more info (-x output etc..). I see these failures, too, but I believe they are due to ds/commit-graph-with-grafts not being merged to 'next' yet. The purpose of that branch is to fix these test breaks. The environment variable got merged a lot faster. I just built & tested the 'jch' branch at 515d82d9 with GIT_TEST_COMMIT_GRAPH=1 and they all passed. I should have tested "pu" first. These failures are indeed fixed there. Thanks, and sorry about the noise. Thanks for testing with the optional features! It's good to keep them exercised.
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
On Mon, Oct 08 2018, Derrick Stolee wrote: > On 10/8/2018 9:43 AM, Ævar Arnfjörð Bjarmason wrote: >> On Tue, Aug 28 2018, Derrick Stolee via GitGitGadget wrote: >> >>> From: Derrick Stolee >>> >>> The commit-graph feature is tested in isolation by >>> t5318-commit-graph.sh and t6600-test-reach.sh, but there are many >>> more interesting scenarios involving commit walks. Many of these >>> scenarios are covered by the existing test suite, but we need to >>> maintain coverage when the optional commit-graph structure is not >>> present. >>> >>> To allow running the full test suite with the commit-graph present, >>> add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar >>> to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try >>> to load the commit-graph when parsing commits, and writes the >>> commit-graph file after every 'git commit' command. >>> >>> There are a few tests that rely on commits not existing in >>> pack-files to trigger important events, so manually set >>> GIT_TEST_COMMIT_GRAPH to false for the necessary commands. >>> >>> There is one test in t6024-recursive-merge.sh that relies on the >>> merge-base algorithm picking one of two ambiguous merge-bases, and >>> the commit-graph feature changes which merge-base is picked. >>> >> The test feature itself seems fine, but this consistently fails ever >> since it got introduced (a reset --hard on the commit merged to msater >> in git.git): >> >> GIT_TEST_COMMIT_GRAPH=true prove -j$(parallel --number-of-cores) >> t5500-fetch-pack.sh t6001-rev-list-graft.sh t6050-replace.sh >> Test Summary Report >> --- >> t6001-rev-list-graft.sh (Wstat: 256 Tests: 14 Failed: 6) >>Failed tests: 3, 5, 7, 9, 11, 13 >>Non-zero exit status: 1 >> t6050-replace.sh (Wstat: 256 Tests: 35 Failed: 9) >>Failed tests: 12-16, 24-25, 30, 35 >>Non-zero exit status: 1 >> t5500-fetch-pack.sh(Wstat: 256 Tests: 357 Failed: 1) >>Failed test: 351 >>Non-zero exit status: 1 >> >> This is on Linux/Debian 4.17.0-1-amd64. Can you reproduce this? If not I >> can provide more info (-x output etc..). > I see these failures, too, but I believe they are due to > ds/commit-graph-with-grafts not being merged to 'next' yet. The > purpose of that branch is to fix these test breaks. The environment > variable got merged a lot faster. > > I just built & tested the 'jch' branch at 515d82d9 with > GIT_TEST_COMMIT_GRAPH=1 and they all passed. I should have tested "pu" first. These failures are indeed fixed there. Thanks, and sorry about the noise.
Re: [PATCH v2 0/5] Fix the racy split index problem
On Fri, Sep 28 2018, Ævar Arnfjörð Bjarmason wrote: > On Thu, Sep 27 2018, Ævar Arnfjörð Bjarmason wrote: > >> On Thu, Sep 27 2018, SZEDER Gábor wrote: >> >>> On Thu, Sep 27, 2018 at 03:53:24PM +0200, Ævar Arnfjörð Bjarmason wrote: On Thu, Sep 27 2018, SZEDER Gábor wrote: > This is the second attempt to fix the racy split index problem, which > causes occasional failures in several random test scripts when run > with 'GIT_TEST_SPLIT_INDEX=yes'. The important details are in patches > 1 and 5 (corresponding to v1's 3 and 5). Thanks. I'm running the same sorts of tests I noted in https://public-inbox.org/git/87va7ireuu@evledraar.gmail.com/ on this. The fix Jeff had that you noted in https://public-inbox.org/git/20180906151439.GA8016@localhost/ is now in "master". I take it your https://github.com/szeder/git/commits/racy-split-index-fix is the same as this submission? >>> >>> Yes. >>> Anyway, I'm testing that cherry-picked on top of the latest master. Unfortunate that we couldn't get the isolated test you made in https://public-inbox.org/git/20180907034942.GA10370@localhost/ >>> >>> Nah, that's not an isolated test case, that's only a somewhat >>> narrowed-down, but rather reliable reproduction recipe while I still >>> had no idea what was going on :) >>> >>> The _real_ isolated test is the last test in t1701, that's what it >>> eventually boiled down to. >> >> Thanks. I had ~400 runs of the tests I ran before and they were all >> OK. Now trying also with t1701 (which I hadn't noticed was a new >> test...). > > Ran that overnight with the same conditions as before. 2683 OK runs and > 0 failures (and counting). So it seems like the combination of the two > fixed the split index bugs. I forgot I ad this running, and got up to 45482 OKs and 0 FAILs before finally Ctrl+C-ing it now :) but I don't see how it could be added without some very liberal getenv("GIT_TEST_blahblah"), so it's probably best to not add it, particularly with the C rewrite of git-stash in-flight. I'll report back when I have enough test data to say how these patches affect the intermittent test failures under GIT_TEST_SPLIT_INDEX=yes.
[PATCH v4 1/1] contrib: add coverage-diff script
From: Derrick Stolee We have coverage targets in our Makefile for using gcov to display line coverage based on our test suite. The way I like to do it is to run: make coverage-test make coverage-report This leaves the repo in a state where every X.c file that was covered has an X.c.gcov file containing the coverage counts for every line, and "#" at every uncovered line. There have been a few bugs in recent patches what would have been caught if the test suite covered those blocks (including a few of mine). I want to work towards a "sensible" amount of coverage on new topics. In my opinion, this means that any logic should be covered, but the 'die()' blocks covering very unlikely (or near-impossible) situations may not warrant coverage. It is important to not measure the coverage of the codebase by what old code is not covered. To help, I created the 'contrib/coverage-diff.sh' script. After creating the coverage statistics at a version (say, 'topic') you can then run contrib/coverage-diff.sh base topic to see the lines added between 'base' and 'topic' that are not covered by the test suite. The output uses 'git blame -s' format so you can find the commits responsible and view the line numbers for quick access to the context, but trims leading tabs in the file contents to reduce output width. Signed-off-by: Derrick Stolee --- contrib/coverage-diff.sh | 108 +++ 1 file changed, 108 insertions(+) create mode 100755 contrib/coverage-diff.sh diff --git a/contrib/coverage-diff.sh b/contrib/coverage-diff.sh new file mode 100755 index 00..4ec419f900 --- /dev/null +++ b/contrib/coverage-diff.sh @@ -0,0 +1,108 @@ +#!/bin/sh + +# Usage: Run 'contrib/coverage-diff.sh ' from source-root +# after running +# +# make coverage-test +# make coverage-report +# +# while checked out at . This script combines the *.gcov files +# generated by the 'make' commands above with 'git diff ' +# to report new lines that are not covered by the test suite. + +V1=$1 +V2=$2 + +diff_lines () { + perl -e ' + my $line_num; + while (<>) { + # Hunk header? Grab the beginning in postimage. + if (/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/) { + $line_num = $1; + next; + } + + # Have we seen a hunk? Ignore "diff --git" etc. + next unless defined $line_num; + + # Deleted line? Ignore. + if (/^-/) { + next; + } + + # Show only the line number of added lines. + if (/^\+/) { + print "$line_num\n"; + } + # Either common context or added line appear in + # the postimage. Count it. + $line_num++; + } + ' +} + +files=$(git diff --name-only "$V1" "$V2" -- \*.c) + +# create empty file +>coverage-data.txt + +for file in $files +do + git diff "$V1" "$V2" -- "$file" | + diff_lines | + sort >new_lines.txt + + if ! test -s new_lines.txt + then + continue + fi + + hash_file=$(echo $file | sed "s/\//\#/") + + if ! test -s "$hash_file.gcov" + then + continue + fi + + sed -ne '/#:/{ + s/#:// + s/:.*// + s/ //g + p + }' "$hash_file.gcov" | + sort >uncovered_lines.txt + + comm -12 uncovered_lines.txt new_lines.txt | + sed -e 's/$/\)/' | + sed -e 's/^/ /' >uncovered_new_lines.txt + + grep -q '[^[:space:]]' >coverage-data.txt && + git blame -s "$V2" -- "$file" | + sed 's/\t//g' | + grep -f uncovered_new_lines.txt >>coverage-data.txt && + echo >>coverage-data.txt + + rm -f new_lines.txt uncovered_lines.txt uncovered_new_lines.txt +done + +cat coverage-data.txt + +echo "Commits introducing uncovered code:" + +commit_list=$(cat coverage-data.txt | + grep -E '^[0-9a-f]{7,} ' | + awk '{print $1;}' | + sort | + uniq) + +( + for commit in $commit_list + do + git log --no-decorate --pretty=format:'%an %h: %s' -1 $commit + echo + done +) | sort + +rm coverage-data.txt -- gitgitgadget
[PATCH v4 0/1] contrib: Add script to show uncovered "new" lines
We have coverage targets in our Makefile for using gcov to display line coverage based on our test suite. The way I like to do it is to run: make coverage-test make coverage-report This leaves the repo in a state where every X.c file that was covered has an X.c.gcov file containing the coverage counts for every line, and "#" at every uncovered line. There have been a few bugs in recent patches what would have been caught if the test suite covered those blocks (including a few of mine). I want to work towards a "sensible" amount of coverage on new topics. In my opinion, this means that any logic should be covered, but the 'die()' blocks in error cases do not need to be covered. It is important to not measure the coverage of the codebase by what old code is not covered. To help, I created the 'contrib/coverage-diff.sh' script. After creating the coverage statistics at a version (say, 'topic') you can then run contrib/coverage-diff.sh base topic to see the lines added between 'base' and 'topic' that are not covered by the test suite. For example, I ran this against the 'next' branch (e82ca0) versus 'master' (f84b9b) and got the following output: builtin/commit.c 76f2f5c1e3 builtin/commit.c 1657) write_commit_graph_reachable(get_object_directory(), 0, 0); builtin/fsck.c 66ec0390e7 builtin/fsck.c 862) midx_argv[2] = "--object-dir"; 66ec0390e7 builtin/fsck.c 863) midx_argv[3] = alt->path; 66ec0390e7 builtin/fsck.c 864) if (run_command(&midx_verify)) 66ec0390e7 builtin/fsck.c 865) errors_found |= ERROR_COMMIT_GRAPH; fsck.c fb8952077d 214) die_errno("Could not read '%s'", path); midx.c 56ee7ff156 949) return 0; cc6af73c02 990) midx_report(_("failed to load pack-index for packfile %s"), cc6af73c02 991) e.p->pack_name); cc6af73c02 992) break; Commits introducing uncovered code: Derrick Stolee 56ee7ff15: multi-pack-index: add 'verify' verb Derrick Stolee 66ec0390e: fsck: verify multi-pack-index Derrick Stolee cc6af73c0: multi-pack-index: verify object offsets Junio C Hamano 76f2f5c1e: Merge branch 'ab/commit-graph-progress' into next René Scharfe fb8952077: fsck: use strbuf_getline() to read skiplist file Thanks, -Stolee CHANGES IN V3: I took Junio's perl script verbatim, which speeds up the performance greatly. Some of the other sed commands needed some massaging, but also added extra cleanup. Thanks for the help! CHANGES IN V4: I reduced the blame output using -s which decreases the width. I include a summary of the commit authors at the end to help people see the lines they wrote. This version is also copied into a build definition in the public Git project on Azure Pipelines [1]. I'll use this build definition to generate the coverage report after each "What's Cooking" email. [1] https://git.visualstudio.com/git/_build?definitionId=5 Derrick Stolee (1): contrib: add coverage-diff script contrib/coverage-diff.sh | 108 +++ 1 file changed, 108 insertions(+) create mode 100755 contrib/coverage-diff.sh base-commit: 1d4361b0f344188ab5eec6dcea01f61a3a3a1670 Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-40%2Fderrickstolee%2Fcoverage-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-40/derrickstolee/coverage-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/40 Range-diff vs v3: 1: 21214cc321 ! 1: 6daf310a43 contrib: add coverage-diff script @@ -26,10 +26,10 @@ contrib/coverage-diff.sh base topic to see the lines added between 'base' and 'topic' that are not covered by the -test suite. The output uses 'git blame -c' format so you can find the commits -responsible and view the line numbers for quick access to the context. +test suite. The output uses 'git blame -s' format so you can find the commits +responsible and view the line numbers for quick access to the context, but +trims leading tabs in the file contents to reduce output width. -Helped-by: Junio C Hamano Signed-off-by: Derrick Stolee diff --git a/contrib/coverage-diff.sh b/contrib/coverage-diff.sh @@ -81,13 +81,16 @@ + ' +} + -+files=$(git diff --name-only $V1 $V2 -- *.c) ++files=$(git diff --name-only "$V1" "$V2" -- \*.c) ++ ++# create empty file ++>coverage-data.txt + +for file in $files +do -+ git diff $V1 $V2 -- $file \ -+ | diff_lines \ -+ | sort >new_lines.txt ++ git diff "$V1" "$V2" -- "$file" | ++ diff_lines | ++ sort >new_lines.txt + + if ! test -s new_lines.txt + then @@ -95,24 +98,50 @@ + fi + + hash_file=$(echo $file | sed "s/\//\#/") ++ ++ if ! test -s "$hash_file.gcov" ++ then ++ continue ++ fi ++ + sed -ne '/#:/{ + s/#:// + s/:.*
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
On 10/8/2018 9:43 AM, Ævar Arnfjörð Bjarmason wrote: On Tue, Aug 28 2018, Derrick Stolee via GitGitGadget wrote: From: Derrick Stolee The commit-graph feature is tested in isolation by t5318-commit-graph.sh and t6600-test-reach.sh, but there are many more interesting scenarios involving commit walks. Many of these scenarios are covered by the existing test suite, but we need to maintain coverage when the optional commit-graph structure is not present. To allow running the full test suite with the commit-graph present, add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try to load the commit-graph when parsing commits, and writes the commit-graph file after every 'git commit' command. There are a few tests that rely on commits not existing in pack-files to trigger important events, so manually set GIT_TEST_COMMIT_GRAPH to false for the necessary commands. There is one test in t6024-recursive-merge.sh that relies on the merge-base algorithm picking one of two ambiguous merge-bases, and the commit-graph feature changes which merge-base is picked. The test feature itself seems fine, but this consistently fails ever since it got introduced (a reset --hard on the commit merged to msater in git.git): GIT_TEST_COMMIT_GRAPH=true prove -j$(parallel --number-of-cores) t5500-fetch-pack.sh t6001-rev-list-graft.sh t6050-replace.sh Test Summary Report --- t6001-rev-list-graft.sh (Wstat: 256 Tests: 14 Failed: 6) Failed tests: 3, 5, 7, 9, 11, 13 Non-zero exit status: 1 t6050-replace.sh (Wstat: 256 Tests: 35 Failed: 9) Failed tests: 12-16, 24-25, 30, 35 Non-zero exit status: 1 t5500-fetch-pack.sh(Wstat: 256 Tests: 357 Failed: 1) Failed test: 351 Non-zero exit status: 1 This is on Linux/Debian 4.17.0-1-amd64. Can you reproduce this? If not I can provide more info (-x output etc..). I see these failures, too, but I believe they are due to ds/commit-graph-with-grafts not being merged to 'next' yet. The purpose of that branch is to fix these test breaks. The environment variable got merged a lot faster. I just built & tested the 'jch' branch at 515d82d9 with GIT_TEST_COMMIT_GRAPH=1 and they all passed. Thanks, -Stolee
Re: t3404.6 breaks on master under GIT_FSMONITOR_TEST
On Thu, Sep 06 2018, Ævar Arnfjörð Bjarmason wrote: > On Thu, Feb 01 2018, Ævar Arnfjörð Bjarmason wrote: > >> The GIT_FSMONITOR_TEST variable allows you to roundtrip the fsmonitor >> codpath in the whole test suite. On both Debian & CentOS this breaks for >> me: >> >> (cd t && GIT_FSMONITOR_TEST=$PWD/t7519/fsmonitor-all >> ./t3404-rebase-interactive.sh -i) >> >> Whereas this works: >> >> (cd t && GIT_FSMONITOR_TEST=$PWD/t7519/fsmonitor-all >> GIT_SKIP_TESTS=t3404.6 ./t3404-rebase-interactive.sh -i) >> >> The entirety of the rest of the test suite still passes with >> GIT_FSMONITOR_TEST. >> >> This has been failing ever since GIT_FSMONITOR_TEST was introduced in >> 883e248b8a ("fsmonitor: teach git to optionally utilize a file system >> monitor to speed up detecting new or changed files.", 2017-09-22). Under >> -v -x -i: >> >> + echo test_must_fail: command succeeded: env >> FAKE_LINES=exec_echo_foo_>file1 1 git rebase -i HEAD^ >> test_must_fail: command succeeded: env FAKE_LINES=exec_echo_foo_>file1 1 >> git rebase -i HEAD^ >> + return 1 >> error: last command exited with $?=1 >> not ok 6 - rebase -i with the exec command checks tree cleanness >> # >> # git checkout master && >> # set_fake_editor && >> # test_must_fail env FAKE_LINES="exec_echo_foo_>file1 1" >> git rebase -i HEAD^ && >> >> Maybe once this is fixed running the test suite under GIT_FSMONITOR_TEST >> would be a useful Travis target, but I don't know the current status of >> adding new options to Travis. > > *Poke* at this again. Ben, or anyone else with knowledge of fsmonitor: > Can you reproduce this? > > This failure along with the one I noted in > https://public-inbox.org/git/87tvn2remn@evledraar.gmail.com/ is > failing the tests on Linux when run with GIT_FSMONITOR_TEST. > > I'm looking at this again because SZEDER's patches to the split index > reminded me again that we have these long-standing failures in rare test > modes (see > https://public-inbox.org/git/87va7ireuu@evledraar.gmail.com/ for the > split index discussion). For what it's worth this is still broken, but more importantly (I'm not just keeping bumping the same thing) the only thing that's now broken under fsmonitor. I.e. my skip config is now GIT_SKIP_TESTS="t3404.7" whereas before 43f1180814 ("git-mv: allow submodules and fsmonitor to work together", 2018-09-10) I needed to add "t7411.3 t7411.4" to that.
RE:
unsubscribe git
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
On Tue, Aug 28 2018, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > The commit-graph feature is tested in isolation by > t5318-commit-graph.sh and t6600-test-reach.sh, but there are many > more interesting scenarios involving commit walks. Many of these > scenarios are covered by the existing test suite, but we need to > maintain coverage when the optional commit-graph structure is not > present. > > To allow running the full test suite with the commit-graph present, > add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar > to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try > to load the commit-graph when parsing commits, and writes the > commit-graph file after every 'git commit' command. > > There are a few tests that rely on commits not existing in > pack-files to trigger important events, so manually set > GIT_TEST_COMMIT_GRAPH to false for the necessary commands. > > There is one test in t6024-recursive-merge.sh that relies on the > merge-base algorithm picking one of two ambiguous merge-bases, and > the commit-graph feature changes which merge-base is picked. > The test feature itself seems fine, but this consistently fails ever since it got introduced (a reset --hard on the commit merged to msater in git.git): GIT_TEST_COMMIT_GRAPH=true prove -j$(parallel --number-of-cores) t5500-fetch-pack.sh t6001-rev-list-graft.sh t6050-replace.sh Test Summary Report --- t6001-rev-list-graft.sh (Wstat: 256 Tests: 14 Failed: 6) Failed tests: 3, 5, 7, 9, 11, 13 Non-zero exit status: 1 t6050-replace.sh (Wstat: 256 Tests: 35 Failed: 9) Failed tests: 12-16, 24-25, 30, 35 Non-zero exit status: 1 t5500-fetch-pack.sh(Wstat: 256 Tests: 357 Failed: 1) Failed test: 351 Non-zero exit status: 1 This is on Linux/Debian 4.17.0-1-amd64. Can you reproduce this? If not I can provide more info (-x output etc..).
Re: [PATCH v4 4/4] rebase --skip: clean up commit message after a failed fixup/squash
Hi Johannes On 02/10/2018 14:50, Johannes Schindelin wrote: Hi Phillip, [sorry, I just got to this mail now] Don't worry, I'm impressed you remembered it, I'd completely forgotten about it. On Sun, 6 May 2018, Phillip Wood wrote: On 27/04/18 21:48, Johannes Schindelin wrote: During a series of fixup/squash commands, the interactive rebase builds up a commit message with comments. This will be presented to the user in the editor if at least one of those commands was a `squash`. In any case, the commit message will be cleaned up eventually, removing all those intermediate comments, in the final step of such a fixup/squash chain. However, if the last fixup/squash command in such a chain fails with merge conflicts, and if the user then decides to skip it (or resolve it to a clean worktree and then continue the rebase), the current code fails to clean up the commit message. This commit fixes that behavior. The fix is quite a bit more involved than meets the eye because it is not only about the question whether we are `git rebase --skip`ing a fixup or squash. It is also about removing the skipped fixup/squash's commit message from the accumulated commit message. And it is also about the question whether we should let the user edit the final commit message or not ("Was there a squash in the chain *that was not skipped*?"). For example, in this case we will want to fix the commit message, but not open it in an editor: pick<- succeeds fixup <- succeeds squash <- fails, will be skipped This is where the newly-introduced `current-fixups` file comes in real handy. A quick look and we can determine whether there was a non-skipped squash. We only need to make sure to keep it up to date with respect to skipped fixup/squash commands. As a bonus, we can even avoid committing unnecessarily, e.g. when there was only one fixup, and it failed, and was skipped. To fix only the bug where the final commit message was not cleaned up properly, but without fixing the rest, would have been more complicated than fixing it all in one go, hence this commit lumps together more than a single concern. For the same reason, this commit also adds a bit more to the existing test case for the regression we just fixed. The diff is best viewed with --color-moved. Signed-off-by: Johannes Schindelin --- sequencer.c| 113 - t/t3418-rebase-continue.sh | 35 ++-- 2 files changed, 131 insertions(+), 17 deletions(-) diff --git a/sequencer.c b/sequencer.c index 56166b0d6c7..cec180714ef 100644 --- a/sequencer.c +++ b/sequencer.c [...] @@ -2804,19 +2801,107 @@ static int commit_staged_changes(struct replay_opts *opts) if (get_oid_hex(rev.buf, &to_amend)) return error(_("invalid contents: '%s'"), rebase_path_amend()); - if (oidcmp(&head, &to_amend)) + if (!is_clean && oidcmp(&head, &to_amend)) return error(_("\nYou have uncommitted changes in your " "working tree. Please, commit them\n" "first and then run 'git rebase " "--continue' again.")); + /* +* When skipping a failed fixup/squash, we need to edit the +* commit message, the current fixup list and count, and if it +* was the last fixup/squash in the chain, we need to clean up +* the commit message and if there was a squash, let the user +* edit it. +*/ + if (is_clean && !oidcmp(&head, &to_amend) && + opts->current_fixup_count > 0 && + file_exists(rebase_path_stopped_sha())) { + const char *p = opts->current_fixups.buf; + int len = opts->current_fixups.len; + + opts->current_fixup_count--; + if (!len) + BUG("Incorrect current_fixups:\n%s", p); + while (len && p[len - 1] != '\n') + len--; + strbuf_setlen(&opts->current_fixups, len); + if (write_message(p, len, rebase_path_current_fixups(), + 0) < 0) + return error(_("could not write file: '%s'"), +rebase_path_current_fixups()); + + /* +* If a fixup/squash in a fixup/squash chain failed, the +* commit message is already correct, no need to commit +* it again. +* +* Only if it is the final command in the fixup/squash +* chain, and only if the chain is long
[no subject]
unsubscribe git
Translation to Portuguese
Hello Git Team. I would like to help to continue the books' translation to Brazilian Portuguese and I don't know how to proceed. Thanks in advance for your help. Regards, -- Thiago Saife (11) 97236-8742
Re: [PATCH v6 08/10] submodule: add a helper to check if it is safe to write to .gitmodules
On Sun, 07 Oct 2018 08:44:20 +0900 Junio C Hamano wrote: > Stefan Beller writes: > > >> static int module_config(int argc, const char **argv, const char *prefix) > >> { > >> + enum { > >> + CHECK_WRITEABLE = 1 > >> + } command = 0; > > > > Can we have the default named? Then we would only use states > > from within the enum? > > Why? Do we use a half-intelligent "switch () { case ...: ... }" > checker that would otherwise complain if we handled "case 0" in such > a switch statement, or something like that? > > Are we going to gain a lot more enum members, by the way? At this > point, this looks more like a > > unsigned check_writable = 0; /* default is not to check */ > > to me. Hi, the CHECK_WRITEABLE operation is alternative to the get/set ones, not an addition, so I can see the rationale behind Stefan's suggestion: either have named enums members for all command "modes" or for none of them; however other users of enum+OPT_CMDMODE seems to think like the enum is for commands passed as *options* and the unnamed default is for actions derived from *arguments*. I don't have a strong opinion on this matter, tho, so just tell me what you prefer and I'll do it for v7. Using an enum was to have a more explicit syntax in case other commands were going to be added in the future (I imagine "--stage" or "--list-all" as possible additions), and does not affect the generated code, so I though it was worth it. Anyways, these are really details, let's concentrate on patches 9 and 10 which deserve much more attention. :) Thanks you, Antonio -- Antonio Ospite https://ao2.it https://twitter.com/ao2it A: Because it messes up the order in which people normally read text. See http://en.wikipedia.org/wiki/Posting_style Q: Why is top-posting such a bad thing?
Re: What's so special about objects/17/ ?
On Sun, Oct 07 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> 1. We still have this check of objects/17/ in builtin/gc.c today. Why >>objects/17/ and not e.g. objects/00/ to go with other 000* magic such >>as the SHA-1?d Statistically >>it doesn't matter, but 17 seems like an odd thing to pick at random >>out of 00..ff, does it have any significance? > > There is no "other 000* magic such as ...". There is only one 0{40} > magic and that one must be memorable and explainable. Depending on how we're counting there's at least two. We also use as a placeholder for "couldn't read a ref" in addition or "this is a placeholder for an invalid ref" in addition to how it's used to signify creation/deletion to the in the likes of the pre-receive hook: $ echo hello > .git/refs/something $ git fsck [...] error: refs/something: invalid sha1 pointer $ > .git/refs/something $ git fsck [...] error: refs/something: invalid sha1 pointer This is because the refs backend will memzero the oid struct, and if we fail to read things it'll still be zero'd out. This manifests e.g. in this confusing fsck output, due to a bug where GitLab will write empty refs/keep-around/* refs sometimes: https://gitlab.com/gitlab-org/gitlab-ce/issues/44431 > The 1/256 sample can be any one among 256. Just like the date > string on the first line of the output to be used as the /etc/magic > signature by format-patch, it was an arbitrary choice, rather than a > random choice, and unlike 0{40} this does not have to be memorable > by general public and I do not have to explain the choice to the > general public ;-) I wanted to elaborate on the explanation for "gc.auto" in git-config. Now we just say "approximately 6700". Since this behavior has been really stable for a long time we could say we sample 1/256 of the .git/objects/?? dirs, and this explains any perceived discrepancies between the 6700 number and $(find .git/objects/?? -type f | wc -l). >> 2. It seems overly paranoid to be checking that the files in >> .git/objects/17/ look like a SHA-1. > > There is no other reason than futureproofing. We were paying cost > to open and scan the directory anyway, and checking that we only > count the loose object files was (and still is) a sensible thing to > do to allow us not even worry about the other kind of things we > might end up creating there. Makes sense. Just wanted to ask if it was that or some workaround for historical files being there.