Re: [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic
"Derrick Stolee via GitGitGadget" writes: > * revs->limited implies we run limit_list() to walk the entire > reachable set. There are some short-cuts here, such as if we > perform a range query like 'git rev-list COMPARE..HEAD' and we > can stop limit_list() when all queued commits are uninteresting. > > * revs->topo_order implies we run sort_in_topological_order(). See > the implementation of that method in commit.c. It implies that > the full set of commits to order is in the given commit_list. > > These two methods imply that a 'git rev-list --topo-order HEAD' > command must walk the entire reachable set of commits _twice_ before > returning a single result. With or without "--topo-order", running rev-list without any negative commit means we must dig down to the roots that can be reached from the positive commits we have. I am to sure if having to run the "sort" of order N counts as "walk the entire reachable set once" (in addition to the enumeration that must be done to prepare that N commits, performed in limit_list()). > In v2.18.0, the commit-graph file contains zero-valued bytes in the > positions where the generation number is stored in v2.19.0 and later. > Thus, we use generation_numbers_enabled() to check if the commit-graph > is available and has non-zero generation numbers. > > When setting revs->limited only because revs->topo_order is true, > only do so if generation numbers are not available. There is no > reason to use the new logic as it will behave similarly when all > generation numbers are INFINITY or ZERO. > In prepare_revision_walk(), if we have revs->topo_order but not > revs->limited, then we trigger the new logic. It breaks the logic > into three pieces, to fit with the existing framework: > > 1. init_topo_walk() fills a new struct topo_walk_info in the rev_info >struct. We use the presence of this struct as a signal to use the >new methods during our walk. In this patch, this method simply >calls limit_list() and sort_in_topological_order(). In the future, >this method will set up a new data structure to perform that logic >in-line. > > 2. next_topo_commit() provides get_revision_1() with the next topo- >ordered commit in the list. Currently, this simply pops the commit >from revs->commits. ... because everything is already done in #1 above. Which makes sense. > 3. expand_topo_walk() provides get_revision_1() with a way to signal >walking beyond the latest commit. Currently, this calls >add_parents_to_list() exactly like the old logic. "latest"? We dig down the history from newer to older, so at some point we hit an old commit and need to find the parents to keep walking towards even older parts of the history. Did you mean "earliest" instead? > While this commit presents method redirection for performing the > exact same logic as before, it allows the next commit to focus only > on the new logic. OK. > diff --git a/revision.c b/revision.c > index e18bd530e4..2dcde8a8ac 100644 > --- a/revision.c > +++ b/revision.c > @@ -25,6 +25,7 @@ > #include "worktree.h" > #include "argv-array.h" > #include "commit-reach.h" > +#include "commit-graph.h" > > volatile show_early_output_fn_t show_early_output; > > @@ -2454,7 +2455,7 @@ int setup_revisions(int argc, const char **argv, struct > rev_info *revs, struct s > if (revs->diffopt.objfind) > revs->simplify_history = 0; > > - if (revs->topo_order) > + if (revs->topo_order && !generation_numbers_enabled(the_repository)) > revs->limited = 1; Are we expecting that this is always a bool? Can there be new commits for which generation numbers are not computed and stored while all the old, stable and packed commits have generation numbers? > @@ -2892,6 +2893,33 @@ static int mark_uninteresting(const struct object_id > *oid, > return 0; > } > > +struct topo_walk_info {}; > + > +static void init_topo_walk(struct rev_info *revs) > +{ > + struct topo_walk_info *info; > + revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info)); > + info = revs->topo_walk_info; > + memset(info, 0, sizeof(struct topo_walk_info)); There is no member in the struct at this point. Are we sure this is safe? Just being curious. I know xmalloc() gives us at least one byte and info won't be NULL. I just do not know offhand if we have a guarantee that memset() acts sensibly to fill the first 0 bytes. > + limit_list(revs); > + sort_in_topological_order(&revs->commits, revs->sort_order); > +} > + > +static struct commit *next_topo_commit(struct rev_info *revs) > +{ > + return pop_commit(&revs->commits); > +} > + > +static void expand_topo_walk(struct rev_info *revs, struct commit *commit) > +{ > + if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) { > + if (!revs->ignore_missing_links) > + die("Failed to traverse parents of commit %s", > + oid_to
Re: [PATCH v3 1/2] commit-graph write: add progress output
SZEDER Gábor writes: >> for (i = 0; i < oids->nr; i++) { >> +display_progress(progress, ++j); >> commit = lookup_commit(the_repository, &oids->list[i]); >> >> if (commit && !parse_commit(commit)) >> @@ -611,19 +624,28 @@ static void close_reachable(struct packed_oid_list >> *oids) >> } > > The above loops have already counted all the commits, and, more > importantly, did all the hard work that takes time and makes the > progress indicator useful. > >> for (i = 0; i < oids->nr; i++) { >> +display_progress(progress, ++j); > > This display_progress() call, however, doesn't seem to be necessary. > First, it counts all commits for a second time, resulting in the ~2x > difference compared to the actual number of commits, and then causing > my confusion. Second, all what this loop is doing is setting a flag > in commits that were already looked up and parsed in the above loops. > IOW this loop is very fast, and the progress indicator jumps from > ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a > progress indicator at all. Makes sense. If this second iteration were also time consuming, then it probably is a good idea to split these into two separate phases? "Counting 1...N" followed by "Inspecting 1...N" or something like that. Of course, if the latter does not take much time, then doing without any progress indicator is also fine.
[PATCHv1 3/3] git-p4: fully support unshelving changelists
The previous git-p4 unshelve support would check for changes in Perforce to the files being unshelved since the original shelve, and would complain if any were found. This was to ensure that the user wouldn't end up with both the shelved change delta, and some deltas from other changes in their git commit. e.g. given fileA: the quick brown fox change1: s/the/The/ <- p4 shelve this change change2: s/fox/Fox/ <- p4 submit this change git p4 unshelve 1 <- FAIL This change teaches the P4Unshelve class to always create a parent commit which matches the P4 tree (for the files being unshelved) at the point prior to the P4 shelve being created (which is reported in the p4 description for a shelved changelist). That then means git-p4 can always create a git commit matching the P4 shelve that was originally created, without any extra deltas. The user might still need to use the --origin option though - there is no way for git-p4 to work out the versions of all of the other *unchanged* files in the shelve, since this information is not recorded by Perforce. Additionally this fixes handling of shelved 'move' operations. Signed-off-by: Luke Diamand --- Documentation/git-p4.txt | 4 +- git-p4.py| 84 +++- t/t9832-unshelve.sh | 69 ++--- 3 files changed, 106 insertions(+), 51 deletions(-) diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index c7705ae6e7..d8332e99c1 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -177,8 +177,8 @@ Unshelving will take a shelved P4 changelist, and produce the equivalent git com in the branch refs/remotes/p4-unshelved/. The git commit is created relative to the current origin revision (HEAD by default). -If the shelved changelist's parent revisions differ, git-p4 will refuse to unshelve; -you need to be unshelving onto an equivalent tree. +A parent commit is created based on the origin, and then the unshelve commit is +created based on that. The origin revision can be changed with the "--origin" option. diff --git a/git-p4.py b/git-p4.py index 76c18a22e9..1998c3e141 100755 --- a/git-p4.py +++ b/git-p4.py @@ -1306,6 +1306,9 @@ def processContent(self, git_mode, relPath, contents): return LargeFileSystem.processContent(self, git_mode, relPath, contents) class Command: +delete_actions = ( "delete", "move/delete", "purge" ) +add_actions = ( "add", "move/add" ) + def __init__(self): self.usage = "usage: %prog [options]" self.needsGit = True @@ -2524,7 +2527,6 @@ def map_in_client(self, depot_path): return "" class P4Sync(Command, P4UserMap): -delete_actions = ( "delete", "move/delete", "purge" ) def __init__(self): Command.__init__(self) @@ -2612,20 +2614,7 @@ def checkpoint(self): if self.verbose: print("checkpoint finished: " + out) -def cmp_shelved(self, path, filerev, revision): -""" Determine if a path at revision #filerev is the same as the file -at revision @revision for a shelved changelist. If they don't match, -unshelving won't be safe (we will get other changes mixed in). - -This is comparing the revision that the shelved changelist is *based* on, not -the shelved changelist itself. -""" -ret = p4Cmd(["diff2", "{0}#{1}".format(path, filerev), "{0}@{1}".format(path, revision)]) -if verbose: -print("p4 diff2 path %s filerev %s revision %s => %s" % (path, filerev, revision, ret)) -return ret["status"] == "identical" - -def extractFilesFromCommit(self, commit, shelved=False, shelved_cl = 0, origin_revision = 0): +def extractFilesFromCommit(self, commit, shelved=False, shelved_cl = 0): self.cloneExclude = [re.sub(r"\.\.\.$", "", path) for path in self.cloneExclude] files = [] @@ -2650,17 +2639,6 @@ def extractFilesFromCommit(self, commit, shelved=False, shelved_cl = 0, origin_r file["type"] = commit["type%s" % fnum] if shelved: file["shelved_cl"] = int(shelved_cl) - -# For shelved changelists, check that the revision of each file that the -# shelve was based on matches the revision that we are using for the -# starting point for git-fast-import (self.initialParent). Otherwise -# the resulting diff will contain deltas from multiple commits. - -if file["action"] != "add" and \ -not self.cmp_shelved(path, file["rev"], origin_revision): -sys.exit("change {0} not based on {1} for {2}, cannot unshelve".format( -commit["change"], self.initialParent, path)) - files.append(file) fnum = fnum + 1 return files @@ -3032,7 +3010,7 @@
[PATCHv1 0/3] git-p4: improved unshelving
This patch series teaches the git-p4 unshelve command to handle intervening changes to the Perforce files. At the moment if you try to unshelve a file, and that file has been modified since the shelving, git-p4 refuses. That is so that it doesn't end up generating a commit containing deltas from several P4 changes. This gets to be more annoying as time goes on and the files you are interested in get updated by other people. However, what we can do is to create a parent commit matching the state of the tree when the shelve happened, which then lets git-p4 create a git commit containing just the changes that are wanted. It's still impossible to determine the true state of the complete tree when the P4 shelve was created, since this information is not recorded by Perforce. Manual intervention is required to fix that. There are also a few other smaller fixes, the main one being that it no longer unshelves into refs/remotes/p4/master/unshelved, but instead into refs/remotes/p4-unshelved. That's because the git-p4 branch detection gets confused by branches appearing in refs/remotes/p4. Luke Diamand (3): git-p4: do not fail in verbose mode for missing 'fileSize' key git-p4: unshelve into refs/remotes/p4-unshelved, not refs/remotes/p4/unshelved git-p4: fully support unshelving changelists Documentation/git-p4.txt | 10 ++--- git-p4.py| 90 +++- t/t9832-unshelve.sh | 75 ++--- 3 files changed, 117 insertions(+), 58 deletions(-) -- 2.19.1.272.gf84b9b09d4
[PATCHv1 1/3] git-p4: do not fail in verbose mode for missing 'fileSize' key
If deleting or moving a file, sometimes P4 doesn't report the file size. The code handles this just fine but some logging crashes. Stop this happening. There was some earlier discussion on the list about this: https://public-inbox.org/git/xmqq1sqpp1vv@gitster.mtv.corp.google.com/ Signed-off-by: Luke Diamand --- git-p4.py | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/git-p4.py b/git-p4.py index 7fab255584..5701bad06a 100755 --- a/git-p4.py +++ b/git-p4.py @@ -2775,7 +2775,10 @@ def streamOneP4File(self, file, contents): relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes) relPath = self.encodeWithUTF8(relPath) if verbose: -size = int(self.stream_file['fileSize']) +if 'fileSize' in self.stream_file: +size = int(self.stream_file['fileSize']) +else: +size = 0 # deleted files don't get a fileSize apparently sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024)) sys.stdout.flush() -- 2.19.1.272.gf84b9b09d4
[PATCHv1 2/3] git-p4: unshelve into refs/remotes/p4-unshelved, not refs/remotes/p4/unshelved
The branch detection code looks for branches under refs/remotes/p4/... and can end up getting confused if there are unshelved changes in there as well. This happens in the function p4BranchesInGit(). Instead, put the unshelved changes into refs/remotes/p4-unshelved/. Signed-off-by: Luke Diamand --- Documentation/git-p4.txt | 6 +++--- git-p4.py| 3 ++- t/t9832-unshelve.sh | 6 +++--- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index 41780a5aa9..c7705ae6e7 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -174,7 +174,7 @@ $ git p4 submit --update-shelve 1234 --update-shelve 2345 Unshelve Unshelving will take a shelved P4 changelist, and produce the equivalent git commit -in the branch refs/remotes/p4/unshelved/. +in the branch refs/remotes/p4-unshelved/. The git commit is created relative to the current origin revision (HEAD by default). If the shelved changelist's parent revisions differ, git-p4 will refuse to unshelve; @@ -182,13 +182,13 @@ you need to be unshelving onto an equivalent tree. The origin revision can be changed with the "--origin" option. -If the target branch in refs/remotes/p4/unshelved already exists, the old one will +If the target branch in refs/remotes/p4-unshelved already exists, the old one will be renamed. $ git p4 sync $ git p4 unshelve 12345 -$ git show refs/remotes/p4/unshelved/12345 +$ git show p4/unshelved/12345 $ git p4 unshelve 12345 diff --git a/git-p4.py b/git-p4.py index 5701bad06a..76c18a22e9 100755 --- a/git-p4.py +++ b/git-p4.py @@ -3956,7 +3956,8 @@ def __init__(self): ] self.verbose = False self.noCommit = False -self.destbranch = "refs/remotes/p4/unshelved" +self.destbranch = "refs/remotes/p4-unshelved" +self.origin = "p4/master" def renameBranch(self, branch_name): """ Rename the existing branch to branch_name.N diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh index 48ec7679b8..c3d15ceea8 100755 --- a/t/t9832-unshelve.sh +++ b/t/t9832-unshelve.sh @@ -54,8 +54,8 @@ EOF cd "$git" && change=$(last_shelved_change) && git p4 unshelve $change && - git show refs/remotes/p4/unshelved/$change | grep -q "Further description" && - git cherry-pick refs/remotes/p4/unshelved/$change && + git show refs/remotes/p4-unshelved/$change | grep -q "Further description" && + git cherry-pick refs/remotes/p4-unshelved/$change && test_path_is_file file2 && test_cmp file1 "$cli"/file1 && test_cmp file2 "$cli"/file2 && @@ -88,7 +88,7 @@ EOF cd "$git" && change=$(last_shelved_change) && git p4 unshelve $change && - git diff refs/remotes/p4/unshelved/$change.0 refs/remotes/p4/unshelved/$change | grep -q file3 + git diff refs/remotes/p4-unshelved/$change.0 refs/remotes/p4-unshelved/$change | grep -q file3 ) ' -- 2.19.1.272.gf84b9b09d4
Re: [PATCH v3 3/7] test-reach: add rev-list tests
Jeff King writes: > On Fri, Sep 21, 2018 at 10:39:30AM -0700, Derrick Stolee via GitGitGadget > wrote: > >> From: Derrick Stolee >> >> The rev-list command is critical to Git's functionality. Ensure it >> works in the three commit-graph environments constructed in >> t6600-test-reach.sh. Here are a few important types of rev-list >> operations: >> >> * Basic: git rev-list --topo-order HEAD >> * Range: git rev-list --topo-order compare..HEAD >> * Ancestry: git rev-list --topo-order --ancestry-path compare..HEAD >> * Symmetric Difference: git rev-list --topo-order compare...HEAD > > Makes sense. I'll assume you filled out all those "expect" blocks > correctly. ;) Well, otherwise three-modes test would barf at least when it is running in its "no graph" mode, so I'd assume we are covered.
Re: [PATCH v6] log: fix coloring of certain octupus merge shapes
I'll do the s/octu/octo/; again on the title while queuing. Let's merge this to 'next'. Thanks.
Re: [PATCH v4 0/6] Fix the racy split index problem
Ævar Arnfjörð Bjarmason writes: > On Thu, Oct 11 2018, SZEDER Gábor wrote: > >> Fourth and hopefully final round of fixing occasional test failures when >> run with 'GIT_TEST_SPLIT_INDEX=yes'. The only code change is the >> extraction of a helper function to compare two cache entries' content, >> and then a couple of minor log message clarifications. The range-diff >> below is rather clear on that. > > Looks good. I'm not going to run the stress test I did on v5 on this > since the changes are just moving existing code into a fuction, unless > you'd like me to or think there's a reason to that is. Thanks, both. Let's merge this round to 'next'; it would be OK to do any further tweaks incrementally on top.
Re: [PATCH v8 0/7] speed up index load through parallelization
Ben Peart writes: > From: Ben Peart > > Fixed issues identified in review the most impactful probably being plugging > some leaks and improved error handling. Also added better error messages > and some code cleanup to code I'd touched. > > The biggest change in the interdiff is the impact of renaming ieot_offset to > ieot_start and ieot_work to ieot_blocks in hopes of making it easier to read > and understand the code. Thanks, I think this one is ready to be in 'next' and any further tweaks can be done incrementally.
Re: [PATCH v4 0/3] alias help tweaks
Rasmus Villemoes writes: > v2: Added patches 2 and 3, made "git cmd --help" unconditionally (no > config option, no delay) redirect to the aliased command's help, > preserve pre-existing behaviour of the spelling "git help cmd". > > v3: Add some additional comments in patch 1 and avoid triggering leak > checker reports. Use better wording in patch 3. > > v4: Reword commit log in patch 1. Sorry for failing to point it out and let the style carried over to v4, but the above is insufficient for a cover latter. Those who missed an earlier round has no clue what these patches are about, and there is not even a pointer to find an earlier discussion in the list archive. I think the patches are good with the rounds of reviews it went through, so let's merge it to 'next'. Here is what I plan to start the merge message of the series: "git cmd --help" when "cmd" is aliased used to only say "cmd is aliased to ...". Now it shows that to the standard error stream and runs "git $cmd --help" where $cmd is the first word of the alias expansion. Please do the cover-letter better next time. Thanks. > > Rasmus Villemoes (3): > help: redirect to aliased commands for "git cmd --help" > git.c: handle_alias: prepend alias info when first argument is -h > git-help.txt: document "git help cmd" vs "git cmd --help" for aliases > > Documentation/git-help.txt | 4 > builtin/help.c | 34 +++--- > git.c | 3 +++ > 3 files changed, 38 insertions(+), 3 deletions(-)
Re: [PATCH v4 0/1] contrib: Add script to show uncovered "new" lines
"Derrick Stolee via GitGitGadget" writes: > CHANGES IN V4: I reduced the blame output using -s which decreases the > width. I include a summary of the commit authors at the end to help people > see the lines they wrote. This version is also copied into a build > definition in the public Git project on Azure Pipelines [1]. I'll use this > build definition to generate the coverage report after each "What's Cooking" > email. > > [1] https://git.visualstudio.com/git/_build?definitionId=5 Thanks. Let's move this forward by merging it down to 'next'.
Re: [PATCH 1/3] ref-filter: free memory from used_atom
Junio C Hamano writes: > Olga Telezhnaya writes: These three patches seem to cause t6300 to fail with an attempt to free an invalid pointer in "git for-each-ref --format='%(push)'" (6300.25) *** Error in `/home/gitster/w/git.git/git': free(): invalid pointer: 0x55cca3a9f920 *** === Backtrace: = /lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7f052fdacbcb] /lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x7f052fdb2f96] /home/gitster/w/git.git/git(+0x15a866)[0x55cca35ca866] /home/gitster/w/git.git/git(+0x15ab48)[0x55cca35cab48] /home/gitster/w/git.git/git(+0x15b6d3)[0x55cca35cb6d3] /home/gitster/w/git.git/git(+0x15b7dd)[0x55cca35cb7dd] /home/gitster/w/git.git/git(+0x49e18)[0x55cca34b9e18] /home/gitster/w/git.git/git(+0x19b20)[0x55cca3489b20] /home/gitster/w/git.git/git(+0x1aab5)[0x55cca348aab5] /home/gitster/w/git.git/git(+0x19809)[0x55cca3489809] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f052fd5c2b1] /home/gitster/w/git.git/git(+0x1984a)[0x55cca348984a] === Memory map: 55cca347-55cca36cc000 r-xp fe:00 2760695 /home/gitster/w/git.git/git 55cca38cc000-55cca38cf000 r--p 0025c000 fe:00 2760695 /home/gitster/w/git.git/git 55cca38cf000-55cca38de000 rw-p 0025f000 fe:00 2760695 /home/gitster/w/git.git/git 55cca38de000-55cca3921000 rw-p 00:00 0 55cca3a9e000-55cca3abf000 rw-p 00:00 0 [heap] 7f052fb24000-7f052fb3b000 r-xp fe:00 393287 /lib/x86_64-linux-gnu/libgcc_s.so.1 7f052fb3b000-7f052fd3a000 ---p 00017000 fe:00 393287 /lib/x86_64-linux-gnu/libgcc_s.so.1 7f052fd3a000-7f052fd3b000 r--p 00016000 fe:00 393287 /lib/x86_64-linux-gnu/libgcc_s.so.1 7f052fd3b000-7f052fd3c000 rw-p 00017000 fe:00 393287 /lib/x86_64-linux-gnu/libgcc_s.so.1 7f052fd3c000-7f052fed1000 r-xp fe:00 392469 /lib/x86_64-linux-gnu/libc-2.24.so 7f052fed1000-7f05300d1000 ---p 00195000 fe:00 392469 /lib/x86_64-linux-gnu/libc-2.24.so 7f05300d1000-7f05300d5000 r--p 00195000 fe:00 392469 /lib/x86_64-linux-gnu/libc-2.24.so 7f05300d5000-7f05300d7000 rw-p 00199000 fe:00 392469 /lib/x86_64-linux-gnu/libc-2.24.so 7f05300d7000-7f05300db000 rw-p 00:00 0 7f05300db000-7f05300e2000 r-xp fe:00 392487 /lib/x86_64-linux-gnu/librt-2.24.so 7f05300e2000-7f05302e1000 ---p 7000 fe:00 392487 /lib/x86_64-linux-gnu/librt-2.24.so 7f05302e1000-7f05302e2000 r--p 6000 fe:00 392487 /lib/x86_64-linux-gnu/librt-2.24.so 7f05302e2000-7f05302e3000 rw-p 7000 fe:00 392487 /lib/x86_64-linux-gnu/librt-2.24.so 7f05302e3000-7f05302fb000 r-xp fe:00 392485 /lib/x86_64-linux-gnu/libpthread-2.24.so 7f05302fb000-7f05304fa000 ---p 00018000 fe:00 392485 /lib/x86_64-linux-gnu/libpthread-2.24.so 7f05304fa000-7f05304fb000 r--p 00017000 fe:00 392485 /lib/x86_64-linux-gnu/libpthread-2.24.so 7f05304fb000-7f05304fc000 rw-p 00018000 fe:00 392485 /lib/x86_64-linux-gnu/libpthread-2.24.so 7f05304fc000-7f053050 rw-p 00:00 0 7f053050-7f0530519000 r-xp fe:00 392698 /lib/x86_64-linux-gnu/libz.so.1.2.8 7f0530519000-7f0530718000 ---p 00019000 fe:00 392698 /lib/x86_64-linux-gnu/libz.so.1.2.8 7f0530718000-7f0530719000 r--p 00018000 fe:00 392698 /lib/x86_64-linux-gnu/libz.so.1.2.8 7f0530719000-7f053071a000 rw-p 00019000 fe:00 392698 /lib/x86_64-linux-gnu/libz.so.1.2.8 7f053071a000-7f053073d000 r-xp fe:00 392461 /lib/x86_64-linux-gnu/ld-2.24.so 7f0530916000-7f0530918000 rw-p 00:00 0 7f0530939000-7f053093d000 rw-p 00:00 0 7f053093d000-7f053093e000 r--p 00023000 fe:00 392461 /lib/x86_64-linux-gnu/ld-2.24.so 7f053093e000-7f053093f000 rw-p 00024000 fe:00 392461 /lib/x86_64-linux-gnu/ld-2.24.so 7f053093f000-7f053094 rw-p 00:00 0 7ffe894ee000-7ffe8951 rw-p 00:00 0 [stack] 7ffe8959e000-7ffe895a1000 r--p 00:00 0 [vvar] 7ffe895a1000-7ffe895a3000 r-xp 00:00 0 [vdso] ./test-lib.sh: line 631: 262132 Aborted git for-each-ref --format='%(push)' refs/heads/master > actual not ok 25 - basic atom: head push # # git for-each-ref --format='%(push)' refs/heads/master >actual && # sanitize_pgp actual.clean && # test_cmp expected actual.clean #
MARKETING & SALES PROMOTION EXPO – Exhibit at Japan's growing advertising market to expand your business in Japan/Asia-Pacific!
Dear International Sales & Marketing Director Zhejiang Wuchuan Industrial Co., Ltd, This is Reed Exhibitions Japan Ltd., organiser of MARKETING & SALES PROMOTION EXPO [June]. MARKETING & SALES PROMOTION EXPO [June] is Japan's largest trade show for marketing & sales promotion products and solutions. Are you interested in exporting your products or expanding your business in Japan/Asia-Pacific? If yes, why not exhibit at MARKETING & SALES PROMOTION EXPO [June]? https://www.sp-world.jp/en/ ---NOW is your time to enter Japan!-- - Strong growth in the Japanese advertising market <8% Market Growth from 2012 to 2017> $US 53,000 million → $US 57,000 million (*1) The market has been growing continuously for 6 years! - Trends In preparation to the 2020 Olympics in Japan, companies are spending a lot of money on advertising and novelty. -- If you are planning on expanding your business in Japan/Asia-Pacific region, why not exhibit at MARKETING & SALES PROMOTION EXPO [June]? In the next June show, 650 exhibitors and 44,000 visitors are expected to gather. (*2) Many buyers and importers are waiting for your products to come to Japan! Please fill in the Reply Form below to receive detailed information on exhibiting. Show Management will assist you with the latest booth availability, cost estimation and any other inquiries. == Reply Form mailto:spwo...@reedexpo.co.jp Company Name: Contact Person: Email: TEL: Main Products: Your Request ( ) Cost Information ( ) Available Booth Locations ( ) Information on Packaged Booths ( ) Other ( ) === We look forward to hearing from you soon. Sincerely, MARKETING & SALES PROMOTION EXPO Show Management Reed Exhibitions Japan Ltd. TEL: +81-3-3349-8505 mailto:spwo...@reedexpo.co.jp - MARKETING & SALES PROMOTION EXPO [June] Date: June 19 (Wed.) - 21 (Fri.), 2019 Venue: Tokyo Big Sight, Japan https://www.sp-world.jp/en/ - (*1) Figures from DENTSU INC. (*2) Expected ID:E36-G1402-0075 This message is delivered to you to provide details of exhibitions and conferences organized, co-organized, or managed by Reed Exhibitions Japan Ltd. If you would like to change your contact information, or prefer not to receive further information on this exhibition/conference, please follow the directions below. Please click the URL below and follow the directions on the website to update your e-mail and other information. https://contact.reedexpo.co.jp/expo/REED/?lg=en&tp=ch&ec=CHANGE Please reply to this mail changing the subject to "REMOVE FROM LIST". You will not receive any further information on this exhibition/conference.
Re: [PATCH] diff.c: die on unknown color-moved ws mode
Stefan Beller writes: > Noticed-by: Junio C Hamano > Signed-off-by: Stefan Beller > --- > > >There is no "ignore-any" supported by the feature---I think that >the parser for the option should have noticed and barfed, but it >did not. It merely emitted a message to the standard output and >let it scroll away with the huge diff before the reader noticed >it. > > Addressed in this patch. > >Am I missing something [...] ? > > Note that this parsing is used for both the parsing from command line > as well as options, i.e. Hmph, is it our convention for a value that is not yet known to the current version of Git found in a configuration file to cause it to die? I somehow thought that command line options are checked more strictly and configuration variables are parsed more leniently. If that is the case, the place that dies need to be raised in the callchain; iow, instead of dying inside the parser, it is necessary to let it only detect a problem and allow the caller to decide what to do with the problem, I would think. > git config diff.colorMovedWS asdf > git format-patch HEAD^ > fatal: ignoring unknown color-moved-ws mode 'asdf' > git config --unset diff.colorMovedWS > > (format-patch parses these color specific things, but doesn't apply it) > > diff.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/diff.c b/diff.c > index 145cfbae59..bdf4535d69 100644 > --- a/diff.c > +++ b/diff.c > @@ -313,7 +313,7 @@ static int parse_color_moved_ws(const char *arg) > else if (!strcmp(sb.buf, "allow-indentation-change")) > ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE; > else > - error(_("ignoring unknown color-moved-ws mode '%s'"), > sb.buf); > + die(_("ignoring unknown color-moved-ws mode '%s'"), > sb.buf); > > strbuf_release(&sb); > }
[PATCH v2 0/1] Advertise multiple supported proto versions
From: Josh Steadmon This is an alternate approach to the previous series. We add a registry of supported wire protocol versions that individual commands can use to declare supported versions before contacting a server. The client will then advertise all supported versions, while the server will choose the first recognized version from the advertised list. Compared to the previous series, this approach is more convenient for protocol_v2, which is intended to work on a single server endpoint. However, it has the drawback that every command that acts as a client must register its supported versions; it is not always obvious which (if any) network operations a given command will perform. Thank you to Stefan for his review of the previous series and for helping me think through the requirements for this new approach. Josh Steadmon (1): protocol: advertise multiple supported versions builtin/archive.c | 3 ++ builtin/clone.c| 4 ++ builtin/fetch-pack.c | 4 ++ builtin/fetch.c| 5 ++ builtin/ls-remote.c| 5 ++ builtin/pull.c | 5 ++ builtin/push.c | 4 ++ builtin/send-pack.c| 3 ++ connect.c | 47 - protocol.c | 115 ++--- protocol.h | 17 ++ remote-curl.c | 28 ++ t/t5570-git-daemon.sh | 2 +- t/t5700-protocol-v1.sh | 8 +-- t/t5702-protocol-v2.sh | 16 +++--- transport-helper.c | 6 +++ 16 files changed, 217 insertions(+), 55 deletions(-) -- 2.19.0.605.g01d371f741-goog
[PATCH v2 1/1] protocol: advertise multiple supported versions
From: Josh Steadmon Currently the client advertises that it supports the wire protocol version set in the protocol.version config. However, not all services support the same set of protocol versions. When connecting to git-receive-pack, the client automatically downgrades to v0 if config.protocol is set to v2, but this check is not performed for other services. This patch creates a protocol version registry. Individual commands register all the protocol versions they support prior to communicating with a server. Versions should be listed in preference order; the version specified in protocol.version will automatically be moved to the front of the registry. The protocol version registry is passed to remote helpers via the GIT_PROTOCOL environment variable. Clients now advertise the full list of registered versions. Servers select the first recognized version from this advertisement. Signed-off-by: Josh Steadmon --- builtin/archive.c | 3 ++ builtin/clone.c| 4 ++ builtin/fetch-pack.c | 4 ++ builtin/fetch.c| 5 ++ builtin/ls-remote.c| 5 ++ builtin/pull.c | 5 ++ builtin/push.c | 4 ++ builtin/send-pack.c| 3 ++ connect.c | 47 - protocol.c | 115 ++--- protocol.h | 17 ++ remote-curl.c | 28 ++ t/t5570-git-daemon.sh | 2 +- t/t5700-protocol-v1.sh | 8 +-- t/t5702-protocol-v2.sh | 16 +++--- transport-helper.c | 6 +++ 16 files changed, 217 insertions(+), 55 deletions(-) diff --git a/builtin/archive.c b/builtin/archive.c index e74f675390..8adb9f381b 100644 --- a/builtin/archive.c +++ b/builtin/archive.c @@ -8,6 +8,7 @@ #include "transport.h" #include "parse-options.h" #include "pkt-line.h" +#include "protocol.h" #include "sideband.h" static void create_output_file(const char *output_file) @@ -94,6 +95,8 @@ int cmd_archive(int argc, const char **argv, const char *prefix) OPT_END() }; + register_allowed_protocol_version(protocol_v0); + argc = parse_options(argc, argv, prefix, local_opts, NULL, PARSE_OPT_KEEP_ALL); diff --git a/builtin/clone.c b/builtin/clone.c index fd2c3ef090..1651a950b6 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -900,6 +900,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix) struct refspec rs = REFSPEC_INIT_FETCH; struct argv_array ref_prefixes = ARGV_ARRAY_INIT; + register_allowed_protocol_version(protocol_v2); + register_allowed_protocol_version(protocol_v1); + register_allowed_protocol_version(protocol_v0); + fetch_if_missing = 0; packet_trace_identity("clone"); diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 1a1bc63566..cba935e4d3 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -57,6 +57,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) fetch_if_missing = 0; + register_allowed_protocol_version(protocol_v2); + register_allowed_protocol_version(protocol_v1); + register_allowed_protocol_version(protocol_v0); + packet_trace_identity("fetch-pack"); memset(&args, 0, sizeof(args)); diff --git a/builtin/fetch.c b/builtin/fetch.c index 61bec5d213..2a20cf8bfd 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -21,6 +21,7 @@ #include "argv-array.h" #include "utf8.h" #include "packfile.h" +#include "protocol.h" #include "list-objects-filter-options.h" static const char * const builtin_fetch_usage[] = { @@ -1476,6 +1477,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) int prune_tags_ok = 1; struct argv_array argv_gc_auto = ARGV_ARRAY_INIT; + register_allowed_protocol_version(protocol_v2); + register_allowed_protocol_version(protocol_v1); + register_allowed_protocol_version(protocol_v0); + packet_trace_identity("fetch"); fetch_if_missing = 0; diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c index 1a25df7ee1..ea685e8bb9 100644 --- a/builtin/ls-remote.c +++ b/builtin/ls-remote.c @@ -1,5 +1,6 @@ #include "builtin.h" #include "cache.h" +#include "protocol.h" #include "transport.h" #include "ref-filter.h" #include "remote.h" @@ -80,6 +81,10 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix) memset(&ref_array, 0, sizeof(ref_array)); + register_allowed_protocol_version(protocol_v2); + register_allowed_protocol_version(protocol_v1); + register_allowed_protocol_version(protocol_v0); + argc = parse_options(argc, argv, prefix, options, ls_remote_usage, PARSE_OPT_STOP_AT_NON_OPTION); dest = argv[0]; diff --git a/builtin/pull.c b/builtin/pull.c index 681c127a07..cb64146834 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -9,6 +9,7 @@ #include "config.h" #include "bui
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Stefan Beller writes: > I think we should add these tweaks, such that > color-moved-ws implies color-moved (both config and CLI options) > and --color-moved implies --color (command line only) I am not sure what you mean by "both config and". I'd find it entirely sensible for a user to say "I do not always use the color-moved option, but when I do, I want it to handle the whitespace differences this way by default". So presence of diff.colorMovedWS configuration variable should never trigger the color-moved feature by itself, I would think.
Re: [PATCH v6] log: fix coloring of certain octupus merge shapes
On Tue, 9 Oct 2018 at 21:43, Junio C Hamano wrote: > I had a bit hard time parsing the above, especially with "then", > which probably would make it easier to read if it is not there. Okay, I guess better to separate the explanation from the diagrams, rather than weaving them together: For octopus merges where the first parent edge immediately merges into the next column to the left, the number of columns should be one less than the usual case. First parent to the left case: | *-. | |\ \ |/ / / The usual case: | *-. | |\ \ | | | * > > Also refactor the code to iterate over columns rather than dashes, > > building from an initial patch suggestion by Jeff King. > > s/suggestion/suggested/ perhaps? Ok. > It is unclear to me what "delta of columns" means here. Is this > because I am unfamiliar with the internal of graph.[ch] API (and > 'delta of columns' is used elsewhere in the API internals already)? No, I just meant difference in number of columns from the previous line to the next. Actually, I had kind of wanted to use "new columns", but that would be confusing with the num_new_columns variable actually meaning the total number of columns in the next line. > > + int delta_cols = (graph->num_new_columns - graph->num_columns); > > So in the second picture above, new-columns (which is the columns > used after showing the current line) is narrower (because 'x' reuses > an already allocated column without getting a new one) than columns > (which is the columns for the octopus merge we are showing)? > > I am not sure I follow what is going on around here, sorry. Maybe it's clearer by saying "added columns" (also expanded the comments a bit)? /* * Usually, we add one new column for each parent (like the diagram * above) but sometimes the first parent goes into an existing column, * like this: * * | *---. * | |\ \ \ * |/ / / / * x 0 1 2 * * In which case the number of parents will be one greater than the * number of added columns. */ int added_cols = (graph->num_new_columns - graph->num_columns); int parent_in_old_cols = graph->num_parents - added_cols; From a9c90605c062b30273dad35adbf319905028cacc Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Sat, 1 Sep 2018 20:07:16 -0400 Subject: [PATCH v7] log: fix coloring of certain octupus merge shapes For octopus merges where the first parent edge immediately merges into the next column to the left, the number of columns should be one less than the usual case. First parent to the left case: | *-. | |\ \ |/ / / The usual case: | *-. | |\ \ | | | * Also refactor the code to iterate over columns rather than dashes, building from an initial patch suggested by Jeff King. Signed-off-by: Noam Postavsky Reviewed-by: Jeff King --- graph.c | 58 +--- t/t4214-log-graph-octopus.sh | 102 +++ 2 files changed, 145 insertions(+), 15 deletions(-) create mode 100755 t/t4214-log-graph-octopus.sh diff --git a/graph.c b/graph.c index e1f6d3bddb..f53135485f 100644 --- a/graph.c +++ b/graph.c @@ -842,27 +842,55 @@ static void graph_output_commit_char(struct git_graph *graph, struct strbuf *sb) } /* - * Draw an octopus merge and return the number of characters written. + * Draw the horizontal dashes of an octopus merge and return the number of + * characters written. */ static int graph_draw_octopus_merge(struct git_graph *graph, struct strbuf *sb) { /* - * Here dashless_commits represents the number of parents - * which don't need to have dashes (because their edges fit - * neatly under the commit). - */ - const int dashless_commits = 2; - int col_num, i; - int num_dashes = - ((graph->num_parents - dashless_commits) * 2) - 1; - for (i = 0; i < num_dashes; i++) { - col_num = (i / 2) + dashless_commits + graph->commit_index; - strbuf_write_column(sb, &graph->new_columns[col_num], '-'); + * Here dashless_parents represents the number of parents which don't + * need to have dashes (the edges labeled "0" and "1"). And + * dashful_parents are the remaining ones. + * + * | *---. + * | |\ \ \ + * | | | | | + * x 0 1 2 3 + * + */ + const int dashless_parents = 2; + int dashful_parents = graph->num_parents - dashless_parents; + + /* + * Usually, we add one new column for each parent (like the diagram + * above) but sometimes the first parent goes into an existing column, + * like this: + * + * | *---. + * | |\ \ \ + * |/ / / / + * x 0 1 2 + * + * In which case the number of parents will be one greater than the + * number of added columns. + */ + int added_cols = (graph->num_new_columns - graph->num_columns); + int parent_in_old_cols = graph->num_parents - added_cols; + + /* + * In both cases, commit_index corresponds to the edge labeled "0". + */ + int first_col = graph->commit_index + dashless_parents + - parent_in_o
Re: [RFC PATCH 00/19] Bring more repository handles into our code base
Stefan Beller writes: > Additionally each patch adds a semantic patch, that would port from the old to > the new function. These semantic patches are all applied in the very last > patch, > but we could omit applying the last patch if it causes too many merge > conflicts > and trickl in the semantic patches over time when there are no merge > conflicts. That's an interesting approach ;-) > The original goal of all these refactoring series was to remove > add_submodule_odb > in submodule.c, which was partially reached with this series. Yup, that is a very good goalpost to keep in mind. > remaining calls in another series, but it shows we're close to be done with > these > large refactorings as far as I am concerned. Nice.
Re: [PATCH v3] branch: introduce --show-current display option
On 10/12/18 1:15 AM, Junio C Hamano wrote: > It is a bit curious why you remove the branch but not the tag after > this test. [..] > > So two equally valid choices are to remove "branch -d" and then > either: > > (1) leave both branch and tag after this test in the test > repository > > (2) use test_when_finished [..] Thanks for this explanation! You're right, I removed the branch because it otherwise breaks subsequent tests, while the tag doesn't matter. I'll go take a look at how test_when_finished can be used. > Please do *not* cd around without being in a subshell. Understood, thanks for explaining this as well.
Re: [PATCH v3] branch: introduce --show-current display option
Daniels Umanovskis writes: > +static void print_current_branch_name(void) Thanks for fixing this (I fixed this in the previous round in my tree but forgot to tell you about it). > diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh > index ee6787614..8d2020aea 100755 > --- a/t/t3203-branch-output.sh > +++ b/t/t3203-branch-output.sh > @@ -100,6 +100,47 @@ test_expect_success 'git branch -v pattern does not show > branch summaries' ' > test_must_fail git branch -v branch* > ' > > +test_expect_success 'git branch `--show-current` shows current branch' ' > + cat >expect <<-\EOF && > + branch-two > + EOF > + git checkout branch-two && > + git branch --show-current >actual && > + test_cmp expect actual > +' OK, that's trivial. We checkout a branch and make sure show-current reports the name of that branch. Good. > +test_expect_success 'git branch `--show-current` is silent when detached > HEAD' ' > + git checkout HEAD^0 && > + git branch --show-current >actual && > + test_must_be_empty actual > +' OK, and at the same time we make sure the command exits with success. Good. > +test_expect_success 'git branch `--show-current` works properly when tag > exists' ' > + cat >expect <<-\EOF && > + branch-and-tag-name > + EOF > + git checkout -b branch-and-tag-name && > + git tag branch-and-tag-name && > + git branch --show-current >actual && > + git checkout branch-one && > + git branch -d branch-and-tag-name && > + test_cmp expect actual > +' It is a bit curious why you remove the branch but not the tag after this test. If we are cleaning after ourselves, removing both would be equally good, if not cleaner. If having both absolutely harms later tests but having just one is OK, then any failure in this test between the time branch-and-tag-name tag gets created and the time branch-and-tag-name branch gets removed will leave the repository with both the tag and the branch, which will be the state in which later tests start, so having "branch -d" at this spot in the sequence is not a good idea anyway. So two equally valid choices are to remove "branch -d" and then either: (1) leave both branch and tag after this test in the test repository (2) use test_when_finished, i.e. echo branch-and-tag-name >expect && test_when_finished "git branch -D branch-and-tag-name" && git checkout -b branch-and-tag-name && test_when_finished "git tag -d branch-and-tag-name" && git tag branch-and-tag-name && ... to arrange them to be cleaned once this test is done. (1) is only valid if they do not harm later tests. I guess you remove the branch because you did not want to touch later tests that checks output from "git branch --list", in which case you'd want (2). > +test_expect_success 'git branch `--show-current` works properly with > worktrees' ' > + cat >expect <<-\EOF && > + branch-one > + branch-two > + EOF > + git checkout branch-one && > + git branch --show-current >actual && > + git worktree add worktree branch-two && > + cd worktree && > + git branch --show-current >>../actual && > + cd .. && > + test_cmp expect actual > +' Please do *not* cd around without being in a subshell. If the second --show-current failed for some reason, "cd .." will not be executed, and the next and subsequent test will start inside ./worktree subdirectory, which is likely to break the expectations of them. Perhaps something like git checkout branch-one && git worktree add worktree branch-two && ( git branch --show-current && cd worktree && git branch --show-current ) >actual && test_cmp expect actual or its modern equivalent git checkout branch-one && git worktree add worktree branch-two && ( git branch --show-current && git -C worktree branch --show-current ) >actual && test_cmp expect actual Note that the latter _could_ be written without subshell, i.e. git branch --show-current >actual && git -C worktree branch --show-current >>actual && but I personally tend to prefer with a single redirection into ">actual", as that is easier to later add _more_ commands to redirect into 'actual' to be inspected without having to worry about details like repeating ">>actual" or only the first one must be ">actual" (iow, the preference comes mostly from maintainability concerns). Thanks. > + > test_expect_success 'git branch shows detached HEAD properly' ' > cat >expect < * (HEAD detached at $(git rev-parse --short HEAD^0))
Re: [PATCH 18/19] submodule: don't add submodule as odb for push
> Do you know if pushing of submodules is exercised by any test? t5531-deep-submodule-push.sh (all of them) t5545-push-options.sh (ok 4 - push options and submodules)
[RFC PATCH 3/3] list-objects-filter: teach tree:# how to handle >0
Implement positive values for in the tree: filter. The exact semantics are described in Documentation/rev-list-options.txt. The long-term goal at the end of this is to allow a partial clone to eagerly fetch an entire directory of files by fetching a tree and specifying =1. This, for instance, would make a build operation fast and convenient. It is fast because the partial clone does not need to fetch each file individually, and convenient because the user does not need to supply a sparse-checkout specification. Signed-off-by: Matthew DeVore --- Documentation/rev-list-options.txt | 8 ++- list-objects-filter-options.c | 6 +-- list-objects-filter-options.h | 1 + list-objects-filter.c | 49 + t/t6112-rev-list-filters-objects.sh | 84 + 5 files changed, 132 insertions(+), 16 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index c2c1c40e6..c78985c41 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -734,8 +734,12 @@ specification contained in . + The form '--filter=tree:' omits all blobs and trees whose depth from the root tree is >= (minimum depth if an object is located -at multiple depths in the commits traversed). Currently, only =0 -is supported, which omits all blobs and trees. +at multiple depths in the commits traversed). =0 will not include +any trees or blobs unless included explicitly in . =1 +will include only the tree and blobs which are referenced directly by a +commit reachable from or an object given in . =2 +is like =1 while also including trees and blobs one more level +removed from or a reachable commit. --no-filter:: Turn off any previous `--filter=` argument. diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index e8da2e858..9dc61d6e6 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -50,12 +50,12 @@ static int gently_parse_list_objects_filter( } } else if (skip_prefix(arg, "tree:", &v0)) { - unsigned long depth; - if (!git_parse_ulong(v0, &depth) || depth != 0) { + if (!git_parse_ulong(v0, +&filter_options->tree_depth_limit_value)) { if (errbuf) { strbuf_addstr( errbuf, - _("only 'tree:0' is supported")); + _("expected 'tree:'")); } return 1; } diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index af64e5c66..c1ae70cd8 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -44,6 +44,7 @@ struct list_objects_filter_options { struct object_id *sparse_oid_value; char *sparse_path_value; unsigned long blob_limit_value; + unsigned long tree_depth_limit_value; }; /* Normalized command line arguments */ diff --git a/list-objects-filter.c b/list-objects-filter.c index 37fba456d..e69fb9b82 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -83,53 +83,80 @@ static void *filter_blobs_none__init( * A filter for list-objects to omit ALL trees and blobs from the traversal. * Can OPTIONALLY collect a list of the omitted OIDs. */ -struct filter_trees_none_data { +struct filter_trees_depth_data { struct oidset *omits; + unsigned long max_depth; + unsigned long current_depth; }; -static enum list_objects_filter_result filter_trees_none( +static enum list_objects_filter_result filter_trees_depth( enum list_objects_filter_situation filter_situation, struct object *obj, const char *pathname, const char *filename, void *filter_data_) { - struct filter_trees_none_data *filter_data = filter_data_; + struct filter_trees_depth_data *filter_data = filter_data_; + + int too_deep = filter_data->current_depth >= filter_data->max_depth; + + /* +* Note that we do not use _MARK_SEEN in order to allow re-traversal in +* case we encounter a tree or blob again at a shallower depth. +*/ switch (filter_situation) { default: BUG("unknown filter_situation: %d", filter_situation); - case LOFS_BEGIN_TREE: case LOFS_BLOB: + if (!too_deep) goto include_it; + + if (filter_data->omits) + oidset_insert(filter_data->omits, &obj->oid); + + return LOFR_ZERO; + + case LOFS_BEGIN_TREE: + filter_data->current_depth++; + + if (!too_deep) goto include_it; + if (filter_data->omits) { oidset_insert(filter_data->omits, &obj->oid); - /* _MARK_SEEN
[RFC PATCH 2/3] Documentation/git-rev-list: s///
git-rev-list has a mode where it works on the granularity of trees and blobs, rather than commits only. When discussing this mode in documenation, it can get awkward to refer to the list of arguments that may include blobs and trees as . It is especially awkward in a follow-up patch, so prepare for that patch by renaming the argument. In addition to simply renaming the argument, also reword documentation in some places such that we include non-commit objects in our terminology. In other words, s/commit/object/ in any prose where the context obviously applies to trees and blobs in a non-pathological way. Signed-off-by: Matthew DeVore --- Documentation/git-rev-list.txt | 21 - Documentation/rev-list-options.txt | 16 builtin/rev-list.c | 2 +- 3 files changed, 21 insertions(+), 18 deletions(-) diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt index 88609ff43..b3357932c 100644 --- a/Documentation/git-rev-list.txt +++ b/Documentation/git-rev-list.txt @@ -60,20 +60,23 @@ SYNOPSIS [ --no-walk ] [ --do-walk ] [ --count ] [ --use-bitmap-index ] -... [ \-- ... ] +... [ \-- ... ] DESCRIPTION --- -List commits that are reachable by following the `parent` links from the -given commit(s), but exclude commits that are reachable from the one(s) -given with a '{caret}' in front of them. The output is given in reverse -chronological order by default. +List objects that are reachable by following references from the given +object(s), but exclude objects that are reachable from the one(s) given +with a '{caret}' in front of them. -You can think of this as a set operation. Commits given on the command -line form a set of commits that are reachable from any of them, and then -commits reachable from any of the ones given with '{caret}' in front are -subtracted from that set. The remaining commits are what comes out in the +By default, only commit objects are shown, and the commits are shown in +reverse chronological order. The '--object' flag causes non-commit objects +to also be shown. + +You can think of this as a set operation. Objects given on the command +line form a set of objects that are reachable from any of them, and then +objects reachable from any of the ones given with '{caret}' in front are +subtracted from that set. The remaining objects are what come out in the command's output. Various other options and paths parameters can be used to further limit the result. diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 5f1672913..c2c1c40e6 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -139,29 +139,29 @@ parents) and `--max-parents=-1` (negative numbers denote no upper limit). --all:: Pretend as if all the refs in `refs/`, along with `HEAD`, are - listed on the command line as ''. + listed on the command line as ''. --branches[=]:: Pretend as if all the refs in `refs/heads` are listed - on the command line as ''. If '' is given, limit + on the command line as ''. If '' is given, limit branches to ones matching given shell glob. If pattern lacks '?', '{asterisk}', or '[', '/{asterisk}' at the end is implied. --tags[=]:: Pretend as if all the refs in `refs/tags` are listed - on the command line as ''. If '' is given, limit + on the command line as ''. If '' is given, limit tags to ones matching given shell glob. If pattern lacks '?', '{asterisk}', or '[', '/{asterisk}' at the end is implied. --remotes[=]:: Pretend as if all the refs in `refs/remotes` are listed - on the command line as ''. If '' is given, limit + on the command line as ''. If '' is given, limit remote-tracking branches to ones matching given shell glob. If pattern lacks '?', '{asterisk}', or '[', '/{asterisk}' at the end is implied. --glob=:: Pretend as if all the refs matching shell glob '' - are listed on the command line as ''. Leading 'refs/', + are listed on the command line as ''. Leading 'refs/', is automatically prepended if missing. If pattern lacks '?', '{asterisk}', or '[', '/{asterisk}' at the end is implied. @@ -182,7 +182,7 @@ explicitly. --reflog:: Pretend as if all objects mentioned by reflogs are listed on the - command line as ``. + command line as ``. --single-worktree:: By default, all working trees will be examined by the @@ -205,9 +205,9 @@ ifndef::git-rev-list[] endif::git-rev-list[] --stdin:: - In addition to the '' listed on the command + In addition to the '' listed on the command line, read them from the standard input. If a `--` separator is - seen, stop reading commits and start reading paths to limit the + seen, sto
[RFC PATCH 1/3] list-objects: support for skipping tree traversal
The tree:0 filter does not need to traverse the trees that it has filtered out, so optimize list-objects and list-objects-filter to skip traversing the trees entirely. Before this patch, we iterated over all children of the tree, and did nothing for all of them, which was wasteful. Signed-off-by: Matthew DeVore --- list-objects-filter.c | 11 +-- list-objects-filter.h | 6 ++ list-objects.c | 5 - t/t6112-rev-list-filters-objects.sh | 10 ++ 4 files changed, 29 insertions(+), 3 deletions(-) diff --git a/list-objects-filter.c b/list-objects-filter.c index 09b2b05d5..37fba456d 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -102,9 +102,16 @@ static enum list_objects_filter_result filter_trees_none( case LOFS_BEGIN_TREE: case LOFS_BLOB: - if (filter_data->omits) + if (filter_data->omits) { oidset_insert(filter_data->omits, &obj->oid); - return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */ + /* _MARK_SEEN but not _DO_SHOW (hard omit) */ + return LOFR_MARK_SEEN; + } + else + /* +* Not collecting omits so no need to to traverse tree. +*/ + return LOFR_SKIP_TREE | LOFR_MARK_SEEN; case LOFS_END_TREE: assert(obj->type == OBJ_TREE); diff --git a/list-objects-filter.h b/list-objects-filter.h index a6f6b4990..52b4a84da 100644 --- a/list-objects-filter.h +++ b/list-objects-filter.h @@ -24,6 +24,11 @@ struct oidset; * In general, objects should only be shown once, but * this result DOES NOT imply that we mark it SEEN. * + * _SKIP_TREE : Used in LOFS_BEGIN_TREE situation - indicates that + * the tree's children should not be iterated over. This + * is used as an optimization when all children will + * definitely be ignored. + * * Most of the time, you want the combination (_MARK_SEEN | _DO_SHOW) * but they can be used independently, such as when sparse-checkout * pattern matching is being applied. @@ -45,6 +50,7 @@ enum list_objects_filter_result { LOFR_ZERO = 0, LOFR_MARK_SEEN = 1<<0, LOFR_DO_SHOW = 1<<1, + LOFR_SKIP_TREE = 1<<2, }; enum list_objects_filter_situation { diff --git a/list-objects.c b/list-objects.c index 7a1a0929d..d1e3d217c 100644 --- a/list-objects.c +++ b/list-objects.c @@ -11,6 +11,7 @@ #include "list-objects-filter-options.h" #include "packfile.h" #include "object-store.h" +#include "trace.h" struct traversal_context { struct rev_info *revs; @@ -184,7 +185,9 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - if (!failed_parse) + if (r & LOFR_SKIP_TREE) + trace_printf("Skipping contents of tree %s...\n", base->buf); + else if (!failed_parse) process_tree_contents(ctx, tree, base); if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) { diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index 08e0c7db6..efb1bee2e 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -244,6 +244,16 @@ test_expect_success 'verify tree:0 includes trees in "filtered" output' ' test_cmp expected filtered_types ' +# Make sure tree:0 does not iterate through any trees. + +test_expect_success 'filter a GIANT tree through tree:0' ' + GIT_TRACE=1 git -C r3 rev-list \ + --objects --filter=tree:0 HEAD 2>filter_trace && + grep "Skipping contents of tree [.][.][.]" filter_trace >actual && + # One line for each commit traversed. + test_line_count = 2 actual +' + # Delete some loose objects and use rev-list, but WITHOUT any filtering. # This models previously omitted objects that we did not receive. -- 2.19.1.331.ge82ca0e54c-goog
[RFC PATCH 0/3] support for filtering trees and blobs based on depth
This adds support for depth >0 in the tree: filter. Before this patch, only =0 is supported, which means all trees and blobs are filtered. The purpose of this is to allow fetching of entire directories in a partial clone use case. If I do a partial clone of a repo with no objects and then want to do something like "make" it will be quite slow of we initiate a separate fetch for every file needed. Alternatively, fetching directories at a time - as soon as any file in a directory is accessed - is a reasonable approach. Thank you, Matthew DeVore (3): list-objects: support for skipping tree traversal Documentation/git-rev-list: s/// list-objects-filter: teach tree:# how to handle >0 Documentation/git-rev-list.txt | 21 --- Documentation/rev-list-options.txt | 24 +--- builtin/rev-list.c | 2 +- list-objects-filter-options.c | 6 +- list-objects-filter-options.h | 1 + list-objects-filter.c | 52 +--- list-objects-filter.h | 6 ++ list-objects.c | 5 +- t/t6112-rev-list-filters-objects.sh | 94 + 9 files changed, 178 insertions(+), 33 deletions(-) -- 2.19.1.331.ge82ca0e54c-goog
Re: [RFC PATCH 00/19] Bring more repository handles into our code base
> This series takes another approach as it doesn't change the signature of > functions, but introduces new functions that can deal with arbitrary > repositories, keeping the old function signature around using a shallow > wrapper. > > Additionally each patch adds a semantic patch, that would port from the old to > the new function. These semantic patches are all applied in the very last > patch, > but we could omit applying the last patch if it causes too many merge > conflicts > and trickl in the semantic patches over time when there are no merge > conflicts. Thanks, this looks like a good plan. One concern is that if we leave 2 versions of functions around, it will be difficult to look at a function and see if it's truly multi-repository-compatible (or making a call to a function that internally uses the_repository, and is thus wrong). But with the plan Stefan quoted [1], mentioned in commit e675765235 ("diff.c: remove implicit dependency on the_index", 2018-09-21): The plan is these macros will always be defined for all library files and the macros are only accessible in builtin/ (The macros include NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which disables the single-repository function-like macros.) This mitigates the concern somewhat. [1] https://public-inbox.org/git/20181011211754.31369-1-sbel...@google.com/
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
On Thu, Oct 11, 2018 at 3:41 PM Junio C Hamano wrote: > * After fixing ignore-any to one of the supported option >(e.g. "ignore-all-spaces"), the color-moved feature still did not >trigger. I think the presence of --color-moved-ws by itself is a >hint that the user wants --color-moved to be used. If it turns >out that there are some valid use cases where --color-moved-ws >may have to be set but the color-moved feature should not be >enabled, then > > diff --color-moved-ws=ignore-all-space --no-color-moved > >can be used to countermand this, of course. I had the same idea for --color-moved to imply --color, but there we had some issues with configured settings, as we do not want diff.colorMoved to imply colored patches, but only when given on the command line. I think we should add these tweaks, such that color-moved-ws implies color-moved (both config and CLI options) and --color-moved implies --color (command line only)
Re: [PATCH] diff.c: die on unknown color-moved ws mode
On Thu, Oct 11, 2018 at 3:59 PM Stefan Beller wrote: > - error(_("ignoring unknown color-moved-ws mode '%s'"), > sb.buf); > + die(_("ignoring unknown color-moved-ws mode '%s'"), > sb.buf); s/ignoring// as it was sent in a haste.
Re: [PATCH 18/19] submodule: don't add submodule as odb for push
> The submodule was added as an alternative in eb21c732d6 (push: teach > --recurse-submodules the on-demand option, 2012-03-29), but was > not explained, why. > > In similar code, submodule_has_commits, the submodule is added as an > alternative to perform a quick check if we need to dive into the submodule. > > However in push_submodule > (a) for_each_remote_ref_submodule will also provide the quick check and > (b) after that we don't need to have submodule objects around, as all > further code is to spawn a separate process. After some investigation, I think I understand. I would explain it this way: In push_submodule(), because we do not actually need access to objects in the submodule, do not invoke add_submodule_odb(). (for_each_remote_ref_submodule() does not require access to those objects, and the actual push is done by spawning another process, which handles object access by itself.) I'm not sure if it's worth mentioning the commit in which the call was introduced, since nothing seems to have changed between then and now (the same bug is present when it was introduced, and now). I also checked the users of push_submodule() (transport_push()) and indeed it doesn't seem to make use of the additional objects brought in by add_submodule_odb(). Do you know if pushing of submodules is exercised by any test? Other than that, the code itself looks good.
[PATCH] diff.c: die on unknown color-moved ws mode
Noticed-by: Junio C Hamano Signed-off-by: Stefan Beller --- There is no "ignore-any" supported by the feature---I think that the parser for the option should have noticed and barfed, but it did not. It merely emitted a message to the standard output and let it scroll away with the huge diff before the reader noticed it. Addressed in this patch. Am I missing something [...] ? Note that this parsing is used for both the parsing from command line as well as options, i.e. git config diff.colorMovedWS asdf git format-patch HEAD^ fatal: ignoring unknown color-moved-ws mode 'asdf' git config --unset diff.colorMovedWS (format-patch parses these color specific things, but doesn't apply it) diff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/diff.c b/diff.c index 145cfbae59..bdf4535d69 100644 --- a/diff.c +++ b/diff.c @@ -313,7 +313,7 @@ static int parse_color_moved_ws(const char *arg) else if (!strcmp(sb.buf, "allow-indentation-change")) ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE; else - error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf); + die(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf); strbuf_release(&sb); } -- 2.19.0
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On 10/12/18 12:56 AM, SZEDER Gábor wrote: > Ah, OK, just noticed v3 which has already fixed this. > Yeah - squashed the wrong commits locally for v2. Thanks for pointing this out anyway!
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Fri, Oct 12, 2018 at 12:53:26AM +0200, SZEDER Gábor wrote: > On Wed, Oct 10, 2018 at 10:54:32PM +0200, Daniels Umanovskis wrote: > > [...] > > > diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh > > index ee6787614..e9bc3b05f 100755 > > --- a/t/t3203-branch-output.sh > > +++ b/t/t3203-branch-output.sh > > @@ -100,6 +100,47 @@ test_expect_success 'git branch -v pattern does not > > show branch summaries' ' > > test_must_fail git branch -v branch* > > ' > > > > +test_expect_success 'git branch `--show-current` shows current branch' ' > > + cat >expect <<-\EOF && > > + branch-two > > + EOF > > + git checkout branch-two && > > Up to this point everything talked about '--show-current' ... > > > + git branch --current >actual && > > ... but here and in all the following tests you run > > git branch --current > > which then fails with "error: unknown option `current'" Ah, OK, just noticed v3 which has already fixed this.
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Wed, Oct 10, 2018 at 10:54:32PM +0200, Daniels Umanovskis wrote: [...] > diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh > index ee6787614..e9bc3b05f 100755 > --- a/t/t3203-branch-output.sh > +++ b/t/t3203-branch-output.sh > @@ -100,6 +100,47 @@ test_expect_success 'git branch -v pattern does not show > branch summaries' ' > test_must_fail git branch -v branch* > ' > > +test_expect_success 'git branch `--show-current` shows current branch' ' > + cat >expect <<-\EOF && > + branch-two > + EOF > + git checkout branch-two && Up to this point everything talked about '--show-current' ... > + git branch --current >actual && ... but here and in all the following tests you run git branch --current which then fails with "error: unknown option `current'" > + test_cmp expect actual > +' > + > +test_expect_success 'git branch `--show-current` is silent when detached > HEAD' ' > + git checkout HEAD^0 && > + git branch --current >actual && > + test_must_be_empty actual > +' > + > +test_expect_success 'git branch `--show-current` works properly when tag > exists' ' > + cat >expect <<-\EOF && > + branch-and-tag-name > + EOF > + git checkout -b branch-and-tag-name && > + git tag branch-and-tag-name && > + git branch --current >actual && > + git checkout branch-one && > + git branch -d branch-and-tag-name && > + test_cmp expect actual > +' > + > +test_expect_success 'git branch `--show-current` works properly with > worktrees' ' > + cat >expect <<-\EOF && > + branch-one > + branch-two > + EOF > + git checkout branch-one && > + git branch --current >actual && > + git worktree add worktree branch-two && > + cd worktree && > + git branch --current >>../actual && > + cd .. && > + test_cmp expect actual > +' > + > test_expect_success 'git branch shows detached HEAD properly' ' > cat >expect < * (HEAD detached at $(git rev-parse --short HEAD^0)) > -- > 2.19.1.330.g93276587c.dirty >
Re: [PATCH 2/2] push: add an advice on unqualified push
Ævar Arnfjörð Bjarmason writes: > On Wed, Oct 10 2018, Jeff King wrote: > >> This is much better, and I love the customized behavior based on the >> object type. >> >> I wonder if we could reword the first paragraph to be a little less >> confusing, and spell out what we tried already. E.g., something like: >> ... > > Yeah that makes sense. I was trying to avoid touching the existing > wording to make this more surgical, but you came up with it, and since > you don't like it I'll just change that too. OK, for now I'll mark these two patches "read" in my inbox and forget about them, expecting that a reroll of 2/2 with improved messages would appear not in too distant future. Thanks, both.
Re: [PATCH 17/19] submodule: use submodule repos for object lookup
> +/* > + * Initialize 'out' based on the provided submodule path. > + * > + * Unlike repo_submodule_init, this tolerates submodules not present > + * in .gitmodules. NEEDSWORK: The repo_submodule_init behavior is > + * preferrable. This function exists only to preserve historical behavior. What do you mean by "The repo_submodule_init behavior is preferable"? If we need to preserve historical behavior, then it seems that the most preferable one is something that meets our needs (like open_submodule() in this patch). > +static int open_submodule(struct repository *out, const char *path) > +{ > + struct strbuf sb = STRBUF_INIT; > + > + if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) { > + strbuf_release(&sb); > + return -1; > + } > + > + out->submodule_prefix = xstrdup(path); Do we need to set submodule_prefix? > @@ -507,7 +533,7 @@ static void show_submodule_header(struct diff_options *o, > const char *path, > else if (is_null_oid(two)) > message = "(submodule deleted)"; > > - if (add_submodule_odb(path)) { > + if (open_submodule(sub, path) < 0) { This function, as a side effect, writes the open repository to "sub" for its caller to use. I think it's better if its callers open "sub" and then pass it to show_submodule_header() if successful. If opening the submodule is expected to fail sometimes (like it seems here), then we can allow callers to also pass NULL, and document what happens when NULL is passed. Also, repositories open in this way should also be freed.
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Phillip Wood writes: > On 10/10/2018 06:43, Junio C Hamano wrote: >> Here are the topics that have been cooking. Commits prefixed with >> '-' are only in 'pu' (proposed updates) while commits prefixed with >> '+' are in 'next'. The ones marked with '.' do not appear in any of >> the integration branches, but I am still holding onto them. >> >> * pw/diff-color-moved-ws-fix (2018-10-04) 5 commits >> - diff --color-moved: fix a memory leak >> - diff --color-moved-ws: fix another memory leak >> - diff --color-moved-ws: fix a memory leak >> - diff --color-moved-ws: fix out of bounds string access >> - diff --color-moved-ws: fix double free crash >> >> Various fixes to "diff --color-moved-ws". >> >> What's the status of this topic? > > I think it is ready for next - Stefan was happy with the last iteration. This is not about your fixes, but I was skimming the color-moved support in general as a final sanity check to move this forward and noticed that $ git diff --color-moved-ws=ignore-any master... does not do anything interesting, which is broken at at least two points. * There is no "ignore-any" supported by the feature---I think that the parser for the option should have noticed and barfed, but it did not. It merely emitted a message to the standard output and let it scroll away with the huge diff before the reader noticed it. * After fixing ignore-any to one of the supported option (e.g. "ignore-all-spaces"), the color-moved feature still did not trigger. I think the presence of --color-moved-ws by itself is a hint that the user wants --color-moved to be used. If it turns out that there are some valid use cases where --color-moved-ws may have to be set but the color-moved feature should not be enabled, then diff --color-moved-ws=ignore-all-space --no-color-moved can be used to countermand this, of course. Am I missing something or are these mere small sloppiness in the current code?
URGENT RESPONSE NEEDED
Hello my dear. Did you receive my email message to you? Please, get back to me ASAP as the matter is becoming late. Expecting your urgent response. Sean.
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Thu, Oct 11, 2018 at 04:53:23PM -0400, Jeff King wrote: > Right, I like that part. It's just that putting "HEAD" there already has > a meaning: it would find refs/heads/HEAD. > > Now I'll grant that's a bad name for a branch (and the source of other > confusions, and I think perhaps even something a few commands actively > discourage these days). > Makes sense. My whole premise was based on the fact that refs/heads/HEAD wouldn't be supported. Now it's obvious to me that isn't necessarily true. And now I understand the real issue. Thanks for bearing with me. I also agree with your proposed '--list-head' suggestion. So, ideally, instead of my broken suggestion of: $ git branch --verbose --list HEAD feature1 hotfix-*; The equivalent would be: $ git branch --verbose --list-head --list feature1 hotfix-*; and it would coalesce nicely as long as --list-head conforms with the default formatting for --list. -- Cheers Rafael Ascensão
Re: [PATCH v3 7/7] revision.c: refactor basic topo-order logic
On Fri, Sep 21, 2018 at 10:39 AM Derrick Stolee via GitGitGadget wrote: > > From: Derrick Stolee [...] > For the test above, I specifically selected a path that is changed > frequently, including by merge commits. A less-frequently-changed > path (such as 'README') has similar end-to-end time since we need > to walk the same number of commits (before determining we do not > have 100 hits). However, get get the benefit that the output is "get get"
Re: [PATCH 16/19] pretty: prepare format_commit_message to handle arbitrary repositories
Patches 6-16 are all quite straightforward, and are reviewed-by: me.
[PATCH v3] branch: introduce --show-current display option
When called with --show-current, git branch will print the current branch name and terminate. Only the actual name gets printed, without refs/heads. In detached HEAD state, nothing is output. Intended both for scripting and interactive/informative use. Unlike git branch --list, no filtering is needed to just get the branch name. Signed-off-by: Daniels Umanovskis --- Cleaned up per suggestions, explicitly passing flags to clearly denote intent. If you consider the patch good conceptually, this implementation should hopefully be good enough to include. Documentation/git-branch.txt | 6 +- builtin/branch.c | 25 -- t/t3203-branch-output.sh | 41 3 files changed, 69 insertions(+), 3 deletions(-) diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt index bf5316ffa..0babb9b1b 100644 --- a/Documentation/git-branch.txt +++ b/Documentation/git-branch.txt @@ -9,7 +9,7 @@ SYNOPSIS [verse] 'git branch' [--color[=] | --no-color] [-r | -a] - [--list] [-v [--abbrev= | --no-abbrev]] + [--list] [--show-current] [-v [--abbrev= | --no-abbrev]] [--column[=] | --no-column] [--sort=] [(--merged | --no-merged) []] [--contains []] @@ -160,6 +160,10 @@ This option is only applicable in non-verbose mode. branch --list 'maint-*'`, list only the branches that match the pattern(s). +--show-current:: + Print the name of the current branch. In detached HEAD state, + nothing is printed. + -v:: -vv:: --verbose:: diff --git a/builtin/branch.c b/builtin/branch.c index c396c4153..46f91dc06 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -443,6 +443,21 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin free(to_free); } +static void print_current_branch_name(void) +{ + int flags; + const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, &flags); + const char *shortname; + if (!refname) + die(_("could not resolve HEAD")); + else if (!(flags & REF_ISSYMREF)) + return; + else if (skip_prefix(refname, "refs/heads/", &shortname)) + puts(shortname); + else + die(_("HEAD (%s) points outside of refs/heads/"), refname); +} + static void reject_rebase_or_bisect_branch(const char *target) { struct worktree **worktrees = get_worktrees(0); @@ -581,6 +596,7 @@ static int edit_branch_description(const char *branch_name) int cmd_branch(int argc, const char **argv, const char *prefix) { int delete = 0, rename = 0, copy = 0, force = 0, list = 0; + int show_current = 0; int reflog = 0, edit_description = 0; int quiet = 0, unset_upstream = 0; const char *new_upstream = NULL; @@ -620,6 +636,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) OPT_BIT('c', "copy", ©, N_("copy a branch and its reflog"), 1), OPT_BIT('C', NULL, ©, N_("copy a branch, even if target exists"), 2), OPT_BOOL('l', "list", &list, N_("list branch names")), + OPT_BOOL(0, "show-current", &show_current, N_("show current branch name")), OPT_BOOL(0, "create-reflog", &reflog, N_("create the branch's reflog")), OPT_BOOL(0, "edit-description", &edit_description, N_("edit the description for the branch")), @@ -662,14 +679,15 @@ int cmd_branch(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, options, builtin_branch_usage, 0); - if (!delete && !rename && !copy && !edit_description && !new_upstream && !unset_upstream && argc == 0) + if (!delete && !rename && !copy && !edit_description && !new_upstream && + !show_current && !unset_upstream && argc == 0) list = 1; if (filter.with_commit || filter.merge != REF_FILTER_MERGED_NONE || filter.points_at.nr || filter.no_commit) list = 1; - if (!!delete + !!rename + !!copy + !!new_upstream + + if (!!delete + !!rename + !!copy + !!new_upstream + !!show_current + list + unset_upstream > 1) usage_with_options(builtin_branch_usage, options); @@ -697,6 +715,9 @@ int cmd_branch(int argc, const char **argv, const char *prefix) if (!argc) die(_("branch name required")); return delete_branches(argc, argv, delete > 1, filter.kind, quiet); + } else if (show_current) { + print_current_branch_name(); + return 0; } else if (list) { /* git branch --local also shows HEAD when it is detached */ if ((filter.kind & FILTER_REFS_BRANCHES) && filter.detached) diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh index ee678761
Re: [PATCH 05/19] object: parse_object to honor its repository argument
> In 8e4b0b6047 (object.c: allow parse_object to handle > arbitrary repositories, 2018-06-28), we forgot to pass the > repository down to the read_object_file. [snip] > @@ -270,7 +270,7 @@ struct object *parse_object(struct repository *r, const > struct object_id *oid) > return lookup_object(r, oid->hash); > } > > - buffer = read_object_file(oid, &type, &size); > + buffer = repo_read_object_file(r, oid, &type, &size); There is still the matter of the 2 invocations of has_object_file() earlier in this function. The first one probably can be replaced with oid_object_info_extended() (see the definition of has_sha1_file_with_flags() to see how to do it), and the second one looks redundant to me and can be removed. Apart from that, I don't see any other invocations that need to be converted, so parse_object() will indeed fully honor its repository argument.
Re: [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
On Thu, Oct 11, 2018 at 3:01 PM Jonathan Tan wrote: > > > Introduce repo_read_object_file which takes the repository argument, and > > hide the original read_object_file as a macro behind > > NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in > > e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21) > > That commit didn't seem to plan for anything - it just seems to add a > new function with the name "repo_" preprended and define a macro if > NO_THE_REPOSITORY_COMPATIBILITY_MACROS is not set, just like this patch. > Maybe s/which we planned for in/just like in/. I was reading too much into The plan is these macros will always be defined for all library files and the macros are only accessible in builtin/ of that commit message.
Re: [PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
> Introduce repo_read_object_file which takes the repository argument, and > hide the original read_object_file as a macro behind > NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in > e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21) That commit didn't seem to plan for anything - it just seems to add a new function with the name "repo_" preprended and define a macro if NO_THE_REPOSITORY_COMPATIBILITY_MACROS is not set, just like this patch. Maybe s/which we planned for in/just like in/. The patch itself looks good.
Re: [PATCH 03/19] object-store: allow read_object_file_extended to read from arbitrary repositories
> @@ -1413,10 +1414,10 @@ void *read_object_file_extended(const struct > object_id *oid, > const char *path; > struct stat st; > const struct object_id *repl = lookup_replace ? > - lookup_replace_object(the_repository, oid) : oid; > + lookup_replace_object(r, oid) : oid; Firstly, patches 1 and 2 are obviously correct. This lookup_replace_object() is a bit tricky in that at 'master', register_replace_ref() (indirectly called by lookup_replace_object()) did not handle arbitrary repositories correctly, but 'next' has a patch that solves this, so this shouldn't be an issue. The other function changes look fine. So this patch looks correct too.
[PATCH 19/19] Apply semantic patches from previous patches
Previous commits added some cocci rules, but did not patch the whole tree, as to not dilute the focus for reviewing the previous patches. This patch is generated by 'make coccicheck' and applying the resulting diff, which was white space damaged (>8 spaces after a tab) in blame.c, which has been fixed. Signed-off-by: Stefan Beller --- apply.c | 6 ++-- archive.c| 5 +-- bisect.c | 5 +-- blame.c | 15 + builtin/am.c | 2 +- builtin/blame.c | 4 +-- builtin/cat-file.c | 21 +++- builtin/checkout.c | 4 +-- builtin/commit.c | 13 +--- builtin/describe.c | 4 +-- builtin/difftool.c | 3 +- builtin/fast-export.c| 7 ++-- builtin/fmt-merge-msg.c | 8 +++-- builtin/grep.c | 2 +- builtin/index-pack.c | 8 +++-- builtin/log.c| 4 +-- builtin/merge-base.c | 2 +- builtin/merge-tree.c | 9 -- builtin/mktag.c | 3 +- builtin/name-rev.c | 2 +- builtin/notes.c | 12 --- builtin/pack-objects.c | 22 + builtin/reflog.c | 5 +-- builtin/replace.c| 2 +- builtin/shortlog.c | 5 +-- builtin/show-branch.c| 4 +-- builtin/tag.c| 4 +-- builtin/unpack-file.c| 2 +- builtin/unpack-objects.c | 3 +- builtin/verify-commit.c | 2 +- bundle.c | 2 +- combine-diff.c | 2 +- commit-graph.c | 8 ++--- commit.c | 15 + config.c | 2 +- diff.c | 3 +- dir.c| 2 +- entry.c | 3 +- fast-import.c| 7 ++-- fsck.c | 9 +++--- grep.c | 3 +- http-push.c | 3 +- log-tree.c | 3 +- mailmap.c| 2 +- match-trees.c| 4 +-- merge-blobs.c| 6 ++-- merge-recursive.c| 13 negotiator/default.c | 6 ++-- negotiator/skipping.c| 2 +- notes-cache.c| 5 +-- notes-merge.c| 4 +-- notes-utils.c| 2 +- notes.c | 10 +++--- pretty.c | 5 +-- read-cache.c | 5 +-- remote-testsvn.c | 4 +-- remote.c | 2 +- rerere.c | 5 +-- revision.c | 12 +++ sequencer.c | 55 +++- sha1-file.c | 3 +- sha1-name.c | 9 +++--- shallow.c| 4 +-- submodule-config.c | 3 +- t/helper/test-revision-walking.c | 3 +- tag.c| 5 +-- tree-walk.c | 6 ++-- tree.c | 5 +-- walker.c | 2 +- xdiff-interface.c| 2 +- 70 files changed, 254 insertions(+), 180 deletions(-) diff --git a/apply.c b/apply.c index fdae1d423b..5ac284b7e8 100644 --- a/apply.c +++ b/apply.c @@ -3187,7 +3187,8 @@ static int apply_binary(struct apply_state *state, unsigned long size; char *result; - result = read_object_file(&oid, &type, &size); + result = repo_read_object_file(the_repository, &oid, &type, + &size); if (!result) return error(_("the necessary postimage %s for " "'%s' cannot be read"), @@ -3249,7 +3250,8 @@ static int read_blob_object(struct strbuf *buf, const struct object_id *oid, uns unsigned long sz; char *result; - result = read_object_file(oid, &type, &sz); + result = repo_read_object_file(the_repository, oid, &type, + &sz); if (!result) return -1; /* XXX read_sha1_file NUL-terminates */ diff --git a/archive.c b/archive.c index 994495af05..70e5eed535 100644 --- a/archive.c +++ b/archive.c @@ -55,7 +55,8 @@ static void format_subst(const struct commit *commit, strbuf_add(&fmt, b + 8, c - b - 8); strbuf_add(buf, src, b - src); - format_commit_message(commit, fmt.buf, buf, &ctx); + repo_format_commit_message(the_repository, commit, fmt.buf, +
[PATCH 13/19] commit: prepare in_merge_bases[_many] to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c| 15 +-- commit.h| 8 ++-- contrib/coccinelle/the_repository.cocci | 17 + 3 files changed, 32 insertions(+), 8 deletions(-) diff --git a/commit.c b/commit.c index 31f2ca4c78..eca9a475c7 100644 --- a/commit.c +++ b/commit.c @@ -1143,16 +1143,17 @@ int is_descendant_of(struct commit *commit, struct commit_list *with_commit) /* * Is "commit" an ancestor of one of the "references"? */ -int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit **reference) +int repo_in_merge_bases_many(struct repository *r, struct commit *commit, +int nr_reference, struct commit **reference) { struct commit_list *bases; int ret = 0, i; uint32_t min_generation = GENERATION_NUMBER_INFINITY; - if (parse_commit(commit)) + if (repo_parse_commit(r, commit)) return ret; for (i = 0; i < nr_reference; i++) { - if (parse_commit(reference[i])) + if (repo_parse_commit(r, reference[i])) return ret; if (reference[i]->generation < min_generation) min_generation = reference[i]->generation; @@ -1161,7 +1162,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit * if (commit->generation > min_generation) return ret; - bases = paint_down_to_common(the_repository, commit, + bases = paint_down_to_common(r, commit, nr_reference, reference, commit->generation); if (commit->object.flags & PARENT2) @@ -1175,9 +1176,11 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit * /* * Is "commit" an ancestor of (i.e. reachable from) the "reference"? */ -int in_merge_bases(struct commit *commit, struct commit *reference) +int repo_in_merge_bases(struct repository *r, + struct commit *commit, + struct commit *reference) { - return in_merge_bases_many(commit, 1, &reference); + return repo_in_merge_bases_many(r, commit, 1, &reference); } struct commit_list *reduce_heads(struct commit_list *heads) diff --git a/commit.h b/commit.h index 89b245be03..fead381651 100644 --- a/commit.h +++ b/commit.h @@ -283,8 +283,12 @@ extern void prune_shallow(int show_only); extern struct trace_key trace_shallow; int is_descendant_of(struct commit *, struct commit_list *); -int in_merge_bases(struct commit *, struct commit *); -int in_merge_bases_many(struct commit *, int, struct commit **); +int repo_in_merge_bases(struct repository *r, struct commit *, struct commit *); +int repo_in_merge_bases_many(struct repository *r, struct commit *, int, struct commit **); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define in_merge_bases(c1, c2) repo_in_merge_bases(the_repository, c1, c2) +#define in_merge_bases_many(c1, n, cs) repo_in_merge_bases_many(the_repository, c1, n, cs) +#endif extern int interactive_add(int argc, const char **argv, const char *prefix, int patch); extern int run_add_interactive(const char *revision, const char *patch_mode, diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index 6dad83f17b..ec579682f6 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -68,3 +68,20 @@ expression F; - get_commit_buffer( + repo_get_commit_buffer(the_repository, E, F); + +@@ +expression E; +expression F; +@@ +- in_merge_bases( ++ repo_in_merge_bases(the_repository, + E, F); + +@@ +expression E; +expression F; +expression G; +@@ +- in_merge_bases_many( ++ repo_in_merge_bases_many(the_repository, + E, F, G); -- 2.19.0
[PATCH 09/19] commit.c: allow remove_redundant to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/commit.c b/commit.c index 5e8791f0c1..f8a8844a72 100644 --- a/commit.c +++ b/commit.c @@ -995,7 +995,7 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in) return ret; } -static int remove_redundant(struct commit **array, int cnt) +static int remove_redundant(struct repository *r, struct commit **array, int cnt) { /* * Some commit in the array may be an ancestor of @@ -1013,7 +1013,7 @@ static int remove_redundant(struct commit **array, int cnt) ALLOC_ARRAY(filled_index, cnt - 1); for (i = 0; i < cnt; i++) - parse_commit(array[i]); + repo_parse_commit(r, array[i]); for (i = 0; i < cnt; i++) { struct commit_list *common; uint32_t min_generation = array[i]->generation; @@ -1029,7 +1029,7 @@ static int remove_redundant(struct commit **array, int cnt) if (array[j]->generation < min_generation) min_generation = array[j]->generation; } - common = paint_down_to_common(the_repository, array[i], filled, + common = paint_down_to_common(r, array[i], filled, work, min_generation); if (array[i]->object.flags & PARENT2) redundant[i] = 1; @@ -1088,7 +1088,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one, clear_commit_marks(one, all_flags); clear_commit_marks_many(n, twos, all_flags); - cnt = remove_redundant(rslt, cnt); + cnt = remove_redundant(the_repository, rslt, cnt); result = NULL; for (i = 0; i < cnt; i++) commit_list_insert_by_date(rslt[i], &result); @@ -1199,7 +1199,7 @@ struct commit_list *reduce_heads(struct commit_list *heads) p->item->object.flags &= ~STALE; } } - num_head = remove_redundant(array, num_head); + num_head = remove_redundant(the_repository, array, num_head); for (i = 0; i < num_head; i++) tail = &commit_list_insert(array[i], tail)->next; free(array); -- 2.19.0
[PATCH 17/19] submodule: use submodule repos for object lookup
This converts the 'show_submodule_header' function to use the repository API properly, such that the submodule objects are not added to the main object store. Signed-off-by: Stefan Beller --- submodule.c | 48 ++-- 1 file changed, 38 insertions(+), 10 deletions(-) diff --git a/submodule.c b/submodule.c index 442229bb49..5e1a6c0b7c 100644 --- a/submodule.c +++ b/submodule.c @@ -443,7 +443,7 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path, return prepare_revision_walk(rev); } -static void print_submodule_summary(struct rev_info *rev, struct diff_options *o) +static void print_submodule_summary(struct repository *r, struct rev_info *rev, struct diff_options *o) { static const char format[] = " %m %s"; struct strbuf sb = STRBUF_INIT; @@ -454,7 +454,8 @@ static void print_submodule_summary(struct rev_info *rev, struct diff_options *o ctx.date_mode = rev->date_mode; ctx.output_encoding = get_log_output_encoding(); strbuf_setlen(&sb, 0); - format_commit_message(commit, format, &sb, &ctx); + repo_format_commit_message(r, commit, format, &sb, + &ctx); strbuf_addch(&sb, '\n'); if (commit->object.flags & SYMMETRIC_LEFT) diff_emit_submodule_del(o, sb.buf); @@ -481,12 +482,37 @@ void prepare_submodule_repo_env(struct argv_array *out) DEFAULT_GIT_DIR_ENVIRONMENT); } +/* + * Initialize 'out' based on the provided submodule path. + * + * Unlike repo_submodule_init, this tolerates submodules not present + * in .gitmodules. NEEDSWORK: The repo_submodule_init behavior is + * preferrable. This function exists only to preserve historical behavior. + * + * Returns 0 on success, -1 when the submodule is not present. + */ +static int open_submodule(struct repository *out, const char *path) +{ + struct strbuf sb = STRBUF_INIT; + + if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) { + strbuf_release(&sb); + return -1; + } + + out->submodule_prefix = xstrdup(path); + + strbuf_release(&sb); + return 0; +} + /* Helper function to display the submodule header line prior to the full * summary output. If it can locate the submodule objects directory it will * attempt to lookup both the left and right commits and put them into the * left and right pointers. */ -static void show_submodule_header(struct diff_options *o, const char *path, +static void show_submodule_header(struct diff_options *o, struct repository *sub, + const char *path, struct object_id *one, struct object_id *two, unsigned dirty_submodule, struct commit **left, struct commit **right, @@ -507,7 +533,7 @@ static void show_submodule_header(struct diff_options *o, const char *path, else if (is_null_oid(two)) message = "(submodule deleted)"; - if (add_submodule_odb(path)) { + if (open_submodule(sub, path) < 0) { if (!message) message = "(commits not present)"; goto output_header; @@ -517,8 +543,8 @@ static void show_submodule_header(struct diff_options *o, const char *path, * Attempt to lookup the commit references, and determine if this is * a fast forward or fast backwards update. */ - *left = lookup_commit_reference(the_repository, one); - *right = lookup_commit_reference(the_repository, two); + *left = lookup_commit_reference(sub, one); + *right = lookup_commit_reference(sub, two); /* * Warn about missing commits in the submodule project, but only if @@ -528,7 +554,7 @@ static void show_submodule_header(struct diff_options *o, const char *path, (!is_null_oid(two) && !*right)) message = "(commits not present)"; - *merge_bases = get_merge_bases(*left, *right); + *merge_bases = repo_get_merge_bases(sub, *left, *right); if (*merge_bases) { if ((*merge_bases)->item == *left) fast_forward = 1; @@ -562,8 +588,9 @@ void show_submodule_summary(struct diff_options *o, const char *path, struct rev_info rev; struct commit *left = NULL, *right = NULL; struct commit_list *merge_bases = NULL; + struct repository sub; - show_submodule_header(o, path, one, two, dirty_submodule, + show_submodule_header(o, &sub, path, one, two, dirty_submodule, &left, &right, &merge_bases); /* @@ -580,7 +607,7 @@ void show_submodule_summary(struct diff_options *o, const char *path, goto out; } - print_submodule_summary(&rev, o); + print_submodule_summary(&sub, &rev, o); out:
[PATCH 16/19] pretty: prepare format_commit_message to handle arbitrary repositories
Signed-off-by: Stefan Beller --- contrib/coccinelle/the_repository.cocci | 10 ++ pretty.c| 15 --- pretty.h| 7 ++- 3 files changed, 24 insertions(+), 8 deletions(-) diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index c81708bb73..c86decd418 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -102,3 +102,13 @@ expression G; - logmsg_reencode( + repo_logmsg_reencode(the_repository, E, F, G); + +@@ +expression E; +expression F; +expression G; +expression H; +@@ +- format_commit_message( ++ repo_format_commit_message(the_repository, + E, F, G, H); diff --git a/pretty.c b/pretty.c index 26e44472bb..948f5346cf 100644 --- a/pretty.c +++ b/pretty.c @@ -1505,9 +1505,10 @@ void userformat_find_requirements(const char *fmt, struct userformat_want *w) strbuf_release(&dummy); } -void format_commit_message(const struct commit *commit, - const char *format, struct strbuf *sb, - const struct pretty_print_context *pretty_ctx) +void repo_format_commit_message(struct repository *r, + const struct commit *commit, + const char *format, struct strbuf *sb, + const struct pretty_print_context *pretty_ctx) { struct format_commit_context context; const char *output_enc = pretty_ctx->output_encoding; @@ -1521,9 +1522,9 @@ void format_commit_message(const struct commit *commit, * convert a commit message to UTF-8 first * as far as 'format_commit_item' assumes it in UTF-8 */ - context.message = logmsg_reencode(commit, - &context.commit_encoding, - utf8); + context.message = repo_logmsg_reencode(r, commit, + &context.commit_encoding, + utf8); strbuf_expand(sb, format, format_commit_item, &context); rewrap_message_tail(sb, &context, 0, 0, 0); @@ -1547,7 +1548,7 @@ void format_commit_message(const struct commit *commit, } free(context.commit_encoding); - unuse_commit_buffer(commit, context.message); + repo_unuse_commit_buffer(r, commit, context.message); } static void pp_header(struct pretty_print_context *pp, diff --git a/pretty.h b/pretty.h index 7359d318a9..e6625269cf 100644 --- a/pretty.h +++ b/pretty.h @@ -103,9 +103,14 @@ void pp_remainder(struct pretty_print_context *pp, const char **msg_p, * Put the result to "sb". * Please use this function for custom formats. */ -void format_commit_message(const struct commit *commit, +void repo_format_commit_message(struct repository *r, + const struct commit *commit, const char *format, struct strbuf *sb, const struct pretty_print_context *context); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define format_commit_message(c, f, s, con) \ + repo_format_commit_message(the_repository, c, f, s, con) +#endif /* * Parse given arguments from "arg", check it for correctness and -- 2.19.0
[PATCH 14/19] commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c| 6 -- commit.h| 7 ++- contrib/coccinelle/the_repository.cocci | 8 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/commit.c b/commit.c index eca9a475c7..526b33758d 100644 --- a/commit.c +++ b/commit.c @@ -318,10 +318,12 @@ const void *repo_get_commit_buffer(struct repository *r, return ret; } -void unuse_commit_buffer(const struct commit *commit, const void *buffer) +void repo_unuse_commit_buffer(struct repository *r, + const struct commit *commit, + const void *buffer) { struct commit_buffer *v = buffer_slab_peek( - the_repository->parsed_objects->buffer_slab, commit); + r->parsed_objects->buffer_slab, commit); if (!(v && v->buffer == buffer)) free((void *)buffer); } diff --git a/commit.h b/commit.h index fead381651..0976bf2562 100644 --- a/commit.h +++ b/commit.h @@ -130,7 +130,12 @@ const void *repo_get_commit_buffer(struct repository *r, * from an earlier call to get_commit_buffer. The buffer may or may not be * freed by this call; callers should not access the memory afterwards. */ -void unuse_commit_buffer(const struct commit *, const void *buffer); +void repo_unuse_commit_buffer(struct repository *r, + const struct commit *, + const void *buffer); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define unuse_commit_buffer(c, b) repo_unuse_commit_buffer(the_repository, c, b) +#endif /* * Free any cached object buffer associated with the commit. diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index ec579682f6..8c07185195 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -69,6 +69,14 @@ expression F; + repo_get_commit_buffer(the_repository, E, F); +@@ +expression E; +expression F; +@@ +- unuse_commit_buffer( ++ repo_unuse_commit_buffer(the_repository, + E, F); + @@ expression E; expression F; -- 2.19.0
[PATCH 15/19] commit: prepare logmsg_reencode to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.h| 8 contrib/coccinelle/the_repository.cocci | 9 + pretty.c| 13 +++-- 3 files changed, 24 insertions(+), 6 deletions(-) diff --git a/commit.h b/commit.h index 0976bf2562..61b05ddd91 100644 --- a/commit.h +++ b/commit.h @@ -180,6 +180,14 @@ extern int has_non_ascii(const char *text); extern const char *logmsg_reencode(const struct commit *commit, char **commit_encoding, const char *output_encoding); +const char *repo_logmsg_reencode(struct repository *r, +const struct commit *commit, +char **commit_encoding, +const char *output_encoding); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define logmsg_reencode(c, enc, out) repo_logmsg_reencode(the_repository, c, enc, out) +#endif + extern const char *skip_blank_lines(const char *msg); /** Removes the first commit from a list sorted by date, and adds all diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index 8c07185195..c81708bb73 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -93,3 +93,12 @@ expression G; - in_merge_bases_many( + repo_in_merge_bases_many(the_repository, E, F, G); + +@@ +expression E; +expression F; +expression G; +@@ +- logmsg_reencode( ++ repo_logmsg_reencode(the_repository, + E, F, G); diff --git a/pretty.c b/pretty.c index 98cf5228f9..26e44472bb 100644 --- a/pretty.c +++ b/pretty.c @@ -595,14 +595,15 @@ static char *replace_encoding_header(char *buf, const char *encoding) return strbuf_detach(&tmp, NULL); } -const char *logmsg_reencode(const struct commit *commit, - char **commit_encoding, - const char *output_encoding) +const char *repo_logmsg_reencode(struct repository *r, +const struct commit *commit, +char **commit_encoding, +const char *output_encoding) { static const char *utf8 = "UTF-8"; const char *use_encoding; char *encoding; - const char *msg = get_commit_buffer(commit, NULL); + const char *msg = repo_get_commit_buffer(r, commit, NULL); char *out; if (!output_encoding || !*output_encoding) { @@ -630,7 +631,7 @@ const char *logmsg_reencode(const struct commit *commit, * the cached copy from get_commit_buffer, we need to duplicate it * to avoid munging the cached copy. */ - if (msg == get_cached_commit_buffer(the_repository, commit, NULL)) + if (msg == get_cached_commit_buffer(r, commit, NULL)) out = xstrdup(msg); else out = (char *)msg; @@ -644,7 +645,7 @@ const char *logmsg_reencode(const struct commit *commit, */ out = reencode_string(msg, output_encoding, use_encoding); if (out) - unuse_commit_buffer(commit, msg); + repo_unuse_commit_buffer(r, commit, msg); } /* -- 2.19.0
[PATCH 18/19] submodule: don't add submodule as odb for push
The submodule was added as an alternative in eb21c732d6 (push: teach --recurse-submodules the on-demand option, 2012-03-29), but was not explained, why. In similar code, submodule_has_commits, the submodule is added as an alternative to perform a quick check if we need to dive into the submodule. However in push_submodule (a) for_each_remote_ref_submodule will also provide the quick check and (b) after that we don't need to have submodule objects around, as all further code is to spawn a separate process. Signed-off-by: Stefan Beller --- submodule.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/submodule.c b/submodule.c index 5e1a6c0b7c..f70d75ef45 100644 --- a/submodule.c +++ b/submodule.c @@ -1006,9 +1006,6 @@ static int push_submodule(const char *path, const struct string_list *push_options, int dry_run) { - if (add_submodule_odb(path)) - return 1; - if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) { struct child_process cp = CHILD_PROCESS_INIT; argv_array_push(&cp.args, "push"); -- 2.19.0
[PATCH 06/19] commit: allow parse_commit* to handle arbitrary repositories
Just like the previous commit, parse_commit and friends are used a lot and are found in new patches, so we cannot change their signature easily. Re-introduce these function prefixed with 'repo_' that take a repository argument and keep the original as a shallow macro. Signed-off-by: Stefan Beller --- commit.c| 18 +++--- commit.h| 17 + contrib/coccinelle/the_repository.cocci | 24 3 files changed, 48 insertions(+), 11 deletions(-) diff --git a/commit.c b/commit.c index 449c1f4920..c6aeedc3d8 100644 --- a/commit.c +++ b/commit.c @@ -443,7 +443,10 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b return 0; } -int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph) +int repo_parse_commit_internal(struct repository *r, + struct commit *item, + int quiet_on_missing, + int use_commit_graph) { enum object_type type; void *buffer; @@ -454,9 +457,9 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com return -1; if (item->object.parsed) return 0; - if (use_commit_graph && parse_commit_in_graph(the_repository, item)) + if (use_commit_graph && parse_commit_in_graph(r, item)) return 0; - buffer = read_object_file(&item->object.oid, &type, &size); + buffer = repo_read_object_file(r, &item->object.oid, &type, &size); if (!buffer) return quiet_on_missing ? -1 : error("Could not read %s", @@ -467,18 +470,19 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com oid_to_hex(&item->object.oid)); } - ret = parse_commit_buffer(the_repository, item, buffer, size, 0); + ret = parse_commit_buffer(r, item, buffer, size, 0); if (save_commit_buffer && !ret) { - set_commit_buffer(the_repository, item, buffer, size); + set_commit_buffer(r, item, buffer, size); return 0; } free(buffer); return ret; } -int parse_commit_gently(struct commit *item, int quiet_on_missing) +int repo_parse_commit_gently(struct repository *r, +struct commit *item, int quiet_on_missing) { - return parse_commit_internal(item, quiet_on_missing, 1); + return repo_parse_commit_internal(r, item, quiet_on_missing, 1); } void parse_commit_or_die(struct commit *item) diff --git a/commit.h b/commit.h index da0db36eba..b8d1f6728f 100644 --- a/commit.h +++ b/commit.h @@ -79,12 +79,21 @@ struct commit *lookup_commit_reference_by_name(const char *name); struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref_name); int parse_commit_buffer(struct repository *r, struct commit *item, const void *buffer, unsigned long size, int check_graph); -int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_commit_graph); -int parse_commit_gently(struct commit *item, int quiet_on_missing); -static inline int parse_commit(struct commit *item) +int repo_parse_commit_internal(struct repository *r, struct commit *item, + int quiet_on_missing, int use_commit_graph); +int repo_parse_commit_gently(struct repository *r, +struct commit *item, +int quiet_on_missing); +static inline int repo_parse_commit(struct repository *r, struct commit *item) { - return parse_commit_gently(item, 0); + return repo_parse_commit_gently(r, item, 0); } +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use) +#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet) +#define parse_commit(item) repo_parse_commit(the_repository, item) +#endif + void parse_commit_or_die(struct commit *item); struct buffer_slab; diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index 3c7fa70502..7189a7293a 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -11,3 +11,27 @@ expression G; + repo_read_object_file(the_repository, E, F, G) +@@ +expression E; +expression F; +expression G; +@@ +- parse_commit_internal( ++ repo_parse_commit_internal(the_repository, + E, F, G) + +@@ +expression E; +expression F; +@@ +- parse_commit_gently( ++ repo_parse_commit_gently(the_repository, + E, F) + +@@ +expression E; +@@ +- parse_commit( ++ repo_parse_commit(the_repository, + E) + -- 2.19.0
[PATCH 12/19] commit: prepare get_commit_buffer to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c| 8 +--- commit.h| 7 ++- contrib/coccinelle/the_repository.cocci | 7 +++ 3 files changed, 18 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index 2733bef019..31f2ca4c78 100644 --- a/commit.c +++ b/commit.c @@ -297,13 +297,15 @@ const void *get_cached_commit_buffer(struct repository *r, const struct commit * return v->buffer; } -const void *get_commit_buffer(const struct commit *commit, unsigned long *sizep) +const void *repo_get_commit_buffer(struct repository *r, + const struct commit *commit, + unsigned long *sizep) { - const void *ret = get_cached_commit_buffer(the_repository, commit, sizep); + const void *ret = get_cached_commit_buffer(r, commit, sizep); if (!ret) { enum object_type type; unsigned long size; - ret = read_object_file(&commit->object.oid, &type, &size); + ret = repo_read_object_file(r, &commit->object.oid, &type, &size); if (!ret) die("cannot read commit object %s", oid_to_hex(&commit->object.oid)); diff --git a/commit.h b/commit.h index f311911785..89b245be03 100644 --- a/commit.h +++ b/commit.h @@ -117,7 +117,12 @@ const void *get_cached_commit_buffer(struct repository *, const struct commit *, * from disk. The resulting memory should not be modified, and must be given * to unuse_commit_buffer when the caller is done. */ -const void *get_commit_buffer(const struct commit *, unsigned long *size); +const void *repo_get_commit_buffer(struct repository *r, + const struct commit *, + unsigned long *size); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define get_commit_buffer(c, s) repo_get_commit_buffer(the_repository, c, s) +#endif /* * Tell the commit subsytem that we are done with a particular commit buffer. diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index 7814f8fa1c..6dad83f17b 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -61,3 +61,10 @@ expression G; + repo_get_merge_bases_many_dirty(the_repository, E, F, G); +@@ +expression E; +expression F; +@@ +- get_commit_buffer( ++ repo_get_commit_buffer(the_repository, + E, F); -- 2.19.0
[PATCH 10/19] commit: allow get_merge_bases_many_0 to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/commit.c b/commit.c index f8a8844a72..b36c2aa0bf 100644 --- a/commit.c +++ b/commit.c @@ -1055,7 +1055,8 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt return filled; } -static struct commit_list *get_merge_bases_many_0(struct commit *one, +static struct commit_list *get_merge_bases_many_0(struct repository *r, + struct commit *one, int n, struct commit **twos, int cleanup) @@ -1065,7 +1066,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one, struct commit_list *result; int cnt, i; - result = merge_bases_many(the_repository, one, n, twos); + result = merge_bases_many(r, one, n, twos); for (i = 0; i < n; i++) { if (one == twos[i]) return result; @@ -1088,7 +1089,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one, clear_commit_marks(one, all_flags); clear_commit_marks_many(n, twos, all_flags); - cnt = remove_redundant(the_repository, rslt, cnt); + cnt = remove_redundant(r, rslt, cnt); result = NULL; for (i = 0; i < cnt; i++) commit_list_insert_by_date(rslt[i], &result); @@ -1100,19 +1101,19 @@ struct commit_list *get_merge_bases_many(struct commit *one, int n, struct commit **twos) { - return get_merge_bases_many_0(one, n, twos, 1); + return get_merge_bases_many_0(the_repository, one, n, twos, 1); } struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos) { - return get_merge_bases_many_0(one, n, twos, 0); + return get_merge_bases_many_0(the_repository, one, n, twos, 0); } struct commit_list *get_merge_bases(struct commit *one, struct commit *two) { - return get_merge_bases_many_0(one, 1, &two, 1); + return get_merge_bases_many_0(the_repository, one, 1, &two, 1); } /* -- 2.19.0
[PATCH 02/19] packfile: allow has_packed_and_bad to handle arbitrary repositories
has_packed_and_bad is not widely used, so just migrate it all at once. Signed-off-by: Stefan Beller --- packfile.c | 5 +++-- packfile.h | 2 +- sha1-file.c | 2 +- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/packfile.c b/packfile.c index ebcb5742ec..40085fe160 100644 --- a/packfile.c +++ b/packfile.c @@ -1024,12 +1024,13 @@ void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1) p->num_bad_objects++; } -const struct packed_git *has_packed_and_bad(const unsigned char *sha1) +const struct packed_git *has_packed_and_bad(struct repository *r, + const unsigned char *sha1) { struct packed_git *p; unsigned i; - for (p = the_repository->objects->packed_git; p; p = p->next) + for (p = r->objects->packed_git; p; p = p->next) for (i = 0; i < p->num_bad_objects; i++) if (!hashcmp(sha1, p->bad_object_sha1 + the_hash_algo->rawsz * i)) diff --git a/packfile.h b/packfile.h index 630f35cf31..1d2d170bf8 100644 --- a/packfile.h +++ b/packfile.h @@ -136,7 +136,7 @@ extern int packed_object_info(struct repository *r, off_t offset, struct object_info *); extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); -extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1); +extern const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1); /* * Iff a pack file in the given repository contains the object named by sha1, diff --git a/sha1-file.c b/sha1-file.c index 647068a836..13b3c5cb79 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1432,7 +1432,7 @@ void *read_object_file_extended(const struct object_id *oid, die(_("loose object %s (stored in %s) is corrupt"), oid_to_hex(repl), path); - if ((p = has_packed_and_bad(repl->hash)) != NULL) + if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL) die(_("packed object %s (stored in %s) is corrupt"), oid_to_hex(repl), p->pack_name); -- 2.19.0
[PATCH 08/19] commit.c: allow merge_bases_many to handle arbitrary repositories
Signed-off-by: Stefan Beller --- commit.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/commit.c b/commit.c index f493a82f72..5e8791f0c1 100644 --- a/commit.c +++ b/commit.c @@ -934,7 +934,9 @@ static struct commit_list *paint_down_to_common(struct repository *r, return result; } -static struct commit_list *merge_bases_many(struct commit *one, int n, struct commit **twos) +static struct commit_list *merge_bases_many(struct repository *r, + struct commit *one, int n, + struct commit **twos) { struct commit_list *list = NULL; struct commit_list *result = NULL; @@ -949,14 +951,14 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co return commit_list_insert(one, &result); } - if (parse_commit(one)) + if (repo_parse_commit(r, one)) return NULL; for (i = 0; i < n; i++) { - if (parse_commit(twos[i])) + if (repo_parse_commit(r, twos[i])) return NULL; } - list = paint_down_to_common(the_repository, one, n, twos, 0); + list = paint_down_to_common(r, one, n, twos, 0); while (list) { struct commit *commit = pop_commit(&list); @@ -1063,7 +1065,7 @@ static struct commit_list *get_merge_bases_many_0(struct commit *one, struct commit_list *result; int cnt, i; - result = merge_bases_many(one, n, twos); + result = merge_bases_many(the_repository, one, n, twos); for (i = 0; i < n; i++) { if (one == twos[i]) return result; -- 2.19.0
[PATCH 11/19] commit: prepare get_merge_bases to handle arbitrary repositories
Similarly to previous patches, the get_merge_base functions are used often in the code base, which makes migrating them hard. Implement the new functions, prefixed with 'repo_' and hide the old functions behind a wrapper macro. Signed-off-by: Stefan Beller --- commit.c| 24 +-- commit.h| 20 ++- contrib/coccinelle/the_repository.cocci | 26 + 3 files changed, 55 insertions(+), 15 deletions(-) diff --git a/commit.c b/commit.c index b36c2aa0bf..2733bef019 100644 --- a/commit.c +++ b/commit.c @@ -1097,23 +1097,27 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r, return result; } -struct commit_list *get_merge_bases_many(struct commit *one, -int n, -struct commit **twos) +struct commit_list *repo_get_merge_bases_many(struct repository *r, + struct commit *one, + int n, + struct commit **twos) { - return get_merge_bases_many_0(the_repository, one, n, twos, 1); + return get_merge_bases_many_0(r, one, n, twos, 1); } -struct commit_list *get_merge_bases_many_dirty(struct commit *one, - int n, - struct commit **twos) +struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r, + struct commit *one, + int n, + struct commit **twos) { - return get_merge_bases_many_0(the_repository, one, n, twos, 0); + return get_merge_bases_many_0(r, one, n, twos, 0); } -struct commit_list *get_merge_bases(struct commit *one, struct commit *two) +struct commit_list *repo_get_merge_bases(struct repository *r, +struct commit *one, +struct commit *two) { - return get_merge_bases_many_0(the_repository, one, 1, &two, 1); + return get_merge_bases_many_0(r, one, 1, &two, 1); } /* diff --git a/commit.h b/commit.h index b8d1f6728f..f311911785 100644 --- a/commit.h +++ b/commit.h @@ -213,12 +213,22 @@ struct commit_graft *read_graft_line(struct strbuf *line); int register_commit_graft(struct repository *r, struct commit_graft *, int); struct commit_graft *lookup_commit_graft(struct repository *r, const struct object_id *oid); -extern struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2); -extern struct commit_list *get_merge_bases_many(struct commit *one, int n, struct commit **twos); -extern struct commit_list *get_octopus_merge_bases(struct commit_list *in); - +struct commit_list *repo_get_merge_bases(struct repository *r, +struct commit *rev1, +struct commit *rev2); +struct commit_list *repo_get_merge_bases_many(struct repository *r, + struct commit *one, int n, + struct commit **twos); /* To be used only when object flags after this call no longer matter */ -extern struct commit_list *get_merge_bases_many_dirty(struct commit *one, int n, struct commit **twos); +struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r, + struct commit *one, int n, + struct commit **twos); +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define get_merge_bases(r1, r2) repo_get_merge_bases(the_repository, r1, r2) +#define get_merge_bases_many(one, n, two) repo_get_merge_bases_many(the_repository, one, n, two) +#define get_merge_bases_many_dirty(one, n, twos) repo_get_merge_bases_many_dirty(the_repository, one, n, twos) +#endif +extern struct commit_list *get_octopus_merge_bases(struct commit_list *in); /* largest positive number a signed 32-bit integer can contain */ #define INFINITE_DEPTH 0x7fff diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci index 7189a7293a..7814f8fa1c 100644 --- a/contrib/coccinelle/the_repository.cocci +++ b/contrib/coccinelle/the_repository.cocci @@ -35,3 +35,29 @@ expression E; + repo_parse_commit(the_repository, E) +@@ +expression E; +expression F; +@@ +- get_merge_bases( ++ repo_get_merge_bases(the_repository, + E, F); + +@@ +expression E; +expression F; +expression G; +@@ +- get_merge_bases_many( ++ repo_get_merge_bases_many(the_repository, + E, F, G); + +@@ +expression E; +expression F; +expression G; +@@ +- get_merge_bases_many_dirty( ++ repo_get_merge_bases_many_dirty(the_repository, + E, F, G);
[PATCH 07/19] commit.c: allow paint_down_to_common to handle arbitrary repositories
As the function is file local and not widely used, migrate it all at once. Signed-off-by: Stefan Beller --- commit.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/commit.c b/commit.c index c6aeedc3d8..f493a82f72 100644 --- a/commit.c +++ b/commit.c @@ -869,7 +869,8 @@ static int queue_has_nonstale(struct prio_queue *queue) } /* all input commits in one and twos[] must have been parsed! */ -static struct commit_list *paint_down_to_common(struct commit *one, int n, +static struct commit_list *paint_down_to_common(struct repository *r, + struct commit *one, int n, struct commit **twos, int min_generation) { @@ -922,7 +923,7 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, parents = parents->next; if ((p->object.flags & flags) == flags) continue; - if (parse_commit(p)) + if (repo_parse_commit(r, p)) return NULL; p->object.flags |= flags; prio_queue_put(&queue, p); @@ -955,7 +956,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co return NULL; } - list = paint_down_to_common(one, n, twos, 0); + list = paint_down_to_common(the_repository, one, n, twos, 0); while (list) { struct commit *commit = pop_commit(&list); @@ -1026,8 +1027,8 @@ static int remove_redundant(struct commit **array, int cnt) if (array[j]->generation < min_generation) min_generation = array[j]->generation; } - common = paint_down_to_common(array[i], filled, work, - min_generation); + common = paint_down_to_common(the_repository, array[i], filled, + work, min_generation); if (array[i]->object.flags & PARENT2) redundant[i] = 1; for (j = 0; j < filled; j++) @@ -1151,7 +1152,9 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit * if (commit->generation > min_generation) return ret; - bases = paint_down_to_common(commit, nr_reference, reference, commit->generation); + bases = paint_down_to_common(the_repository, commit, +nr_reference, reference, +commit->generation); if (commit->object.flags & PARENT2) ret = 1; clear_commit_marks(commit, all_flags); -- 2.19.0
[PATCH 05/19] object: parse_object to honor its repository argument
In 8e4b0b6047 (object.c: allow parse_object to handle arbitrary repositories, 2018-06-28), we forgot to pass the repository down to the read_object_file. Signed-off-by: Stefan Beller --- object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/object.c b/object.c index 51c4594515..61f49d8b99 100644 --- a/object.c +++ b/object.c @@ -270,7 +270,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid) return lookup_object(r, oid->hash); } - buffer = read_object_file(oid, &type, &size); + buffer = repo_read_object_file(r, oid, &type, &size); if (buffer) { if (check_object_signature(repl, buffer, size, type_name(type)) < 0) { free(buffer); -- 2.19.0
[RFC PATCH 00/19] Bring more repository handles into our code base
This applies on nd/the-index (b3c7eef9b05) and is the logical continuation of the object store series, which I sent over the last year. The previous series did take a very slow and pedantic approach, using a #define trick, see cfc62fc98c for details, but it turns out, that it doesn't work: When changing the signature of widely used functions, it burdens the maintainer in resolving the semantic conflicts. In the orginal approach this was called a feature, as then we can ensure that not bugs creep into the code base during the merge window (while such a refactoring series wanders from pu to master). It turns out this was not well received and was just burdensome. The #define trick doesn't buy us much to begin with when dealing with non-merge-conflicts. For example, see deref_tag at tag.c:68, which got the repository argument in 286d258d4f (tag.c: allow deref_tag to handle arbitrary repositories, 2018-06-28) but lost its property of working on any repository while 8c4cc32689 (tag: don't warn if target is missing but promised, 2018-07-12) was in flight simultaneously. Another example of failure of this approach is seen in patch 5, which shows that the pedantry was missed. This series takes another approach as it doesn't change the signature of functions, but introduces new functions that can deal with arbitrary repositories, keeping the old function signature around using a shallow wrapper. Additionally each patch adds a semantic patch, that would port from the old to the new function. These semantic patches are all applied in the very last patch, but we could omit applying the last patch if it causes too many merge conflicts and trickl in the semantic patches over time when there are no merge conflicts. The original goal of all these refactoring series was to remove add_submodule_odb in submodule.c, which was partially reached with this series. I'll investigate the remaining calls in another series, but it shows we're close to be done with these large refactorings as far as I am concerned. Thanks, Stefan Stefan Beller (19): sha1_file: allow read_object to read objects in arbitrary repositories packfile: allow has_packed_and_bad to handle arbitrary repositories object-store: allow read_object_file_extended to read from arbitrary repositories object-store: prepare read_object_file to deal with arbitrary repositories object: parse_object to honor its repository argument commit: allow parse_commit* to handle arbitrary repositories commit.c: allow paint_down_to_common to handle arbitrary repositories commit.c: allow merge_bases_many to handle arbitrary repositories commit.c: allow remove_redundant to handle arbitrary repositories commit: allow get_merge_bases_many_0 to handle arbitrary repositories commit: prepare get_merge_bases to handle arbitrary repositories commit: prepare get_commit_buffer to handle arbitrary repositories commit: prepare in_merge_bases[_many] to handle arbitrary repositories commit: prepare repo_unuse_commit_buffer to handle arbitrary repositories commit: prepare logmsg_reencode to handle arbitrary repositories pretty: prepare format_commit_message to handle arbitrary repositories submodule: use submodule repos for object lookup submodule: don't add submodule as odb for push Apply semantic patches from previous patches apply.c | 6 +- archive.c | 5 +- bisect.c| 5 +- blame.c | 15 +-- builtin/am.c| 2 +- builtin/blame.c | 4 +- builtin/cat-file.c | 21 +++-- builtin/checkout.c | 4 +- builtin/commit.c| 13 ++- builtin/describe.c | 4 +- builtin/difftool.c | 3 +- builtin/fast-export.c | 7 +- builtin/fmt-merge-msg.c | 8 +- builtin/grep.c | 2 +- builtin/index-pack.c| 8 +- builtin/log.c | 4 +- builtin/merge-base.c| 2 +- builtin/merge-tree.c| 9 +- builtin/mktag.c | 3 +- builtin/name-rev.c | 2 +- builtin/notes.c | 12 ++- builtin/pack-objects.c | 22 +++-- builtin/reflog.c| 5 +- builtin/replace.c | 2 +- builtin/shortlog.c | 5 +- builtin/show-branch.c | 4 +- builtin/tag.c | 4 +- builtin/unpack-file.c | 2 +- builtin/unpack-objects.c| 3 +- builtin/verify-commit.c | 2 +- bundle.c| 2 +- combine-diff.c
[PATCH 03/19] object-store: allow read_object_file_extended to read from arbitrary repositories
read_object_file_extended is not widely used, so migrate it all at once. Signed-off-by: Stefan Beller --- object-store.h | 5 +++-- sha1-file.c| 11 ++- streaming.c| 2 +- 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/object-store.h b/object-store.h index 67e66227d9..6bb0ccbf05 100644 --- a/object-store.h +++ b/object-store.h @@ -146,12 +146,13 @@ void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned cha void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned long *size); -extern void *read_object_file_extended(const struct object_id *oid, +extern void *read_object_file_extended(struct repository *r, + const struct object_id *oid, enum object_type *type, unsigned long *size, int lookup_replace); static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size) { - return read_object_file_extended(oid, type, size, 1); + return read_object_file_extended(the_repository, oid, type, size, 1); } /* Read and unpack an object file into memory, write memory to an object file */ diff --git a/sha1-file.c b/sha1-file.c index 13b3c5cb79..ce47524679 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1403,7 +1403,8 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type, * deal with them should arrange to call read_object() and give error * messages themselves. */ -void *read_object_file_extended(const struct object_id *oid, +void *read_object_file_extended(struct repository *r, + const struct object_id *oid, enum object_type *type, unsigned long *size, int lookup_replace) @@ -1413,10 +1414,10 @@ void *read_object_file_extended(const struct object_id *oid, const char *path; struct stat st; const struct object_id *repl = lookup_replace ? - lookup_replace_object(the_repository, oid) : oid; + lookup_replace_object(r, oid) : oid; errno = 0; - data = read_object(the_repository, repl->hash, type, size); + data = read_object(r, repl->hash, type, size); if (data) return data; @@ -1428,11 +1429,11 @@ void *read_object_file_extended(const struct object_id *oid, die(_("replacement %s not found for %s"), oid_to_hex(repl), oid_to_hex(oid)); - if (!stat_sha1_file(the_repository, repl->hash, &st, &path)) + if (!stat_sha1_file(r, repl->hash, &st, &path)) die(_("loose object %s (stored in %s) is corrupt"), oid_to_hex(repl), path); - if ((p = has_packed_and_bad(the_repository, repl->hash)) != NULL) + if ((p = has_packed_and_bad(r, repl->hash)) != NULL) die(_("packed object %s (stored in %s) is corrupt"), oid_to_hex(repl), p->pack_name); diff --git a/streaming.c b/streaming.c index d1e6b2dce6..c843a1230f 100644 --- a/streaming.c +++ b/streaming.c @@ -490,7 +490,7 @@ static struct stream_vtbl incore_vtbl = { static open_method_decl(incore) { - st->u.incore.buf = read_object_file_extended(oid, type, &st->size, 0); + st->u.incore.buf = read_object_file_extended(the_repository, oid, type, &st->size, 0); st->u.incore.read_ptr = 0; st->vtbl = &incore_vtbl; -- 2.19.0
[PATCH 04/19] object-store: prepare read_object_file to deal with arbitrary repositories
As read_object_file is a widely used function (which is also regularly used in new code in flight between master..pu), changing its signature is painful is hard, as other series in flight rely on the original signature. It would burden the maintainer if we'd just change the signature. Introduce repo_read_object_file which takes the repository argument, and hide the original read_object_file as a macro behind NO_THE_REPOSITORY_COMPATIBILITY_MACROS, which we planned for in e675765235 (diff.c: remove implicit dependency on the_index, 2018-09-21) Add a coccinelle patch to convert existing callers, but do not apply the resulting patch from 'make coccicheck' to keep the diff of this patch small. Signed-off-by: Stefan Beller --- contrib/coccinelle/the_repository.cocci | 13 + object-store.h | 10 -- 2 files changed, 21 insertions(+), 2 deletions(-) create mode 100644 contrib/coccinelle/the_repository.cocci diff --git a/contrib/coccinelle/the_repository.cocci b/contrib/coccinelle/the_repository.cocci new file mode 100644 index 00..3c7fa70502 --- /dev/null +++ b/contrib/coccinelle/the_repository.cocci @@ -0,0 +1,13 @@ +// This file is used for the ongoing refactoring of +// bringing the index or repository struct in all of +// our code base. + +@@ +expression E; +expression F; +expression G; +@@ +- read_object_file( ++ repo_read_object_file(the_repository, + E, F, G) + diff --git a/object-store.h b/object-store.h index 6bb0ccbf05..41ceebca48 100644 --- a/object-store.h +++ b/object-store.h @@ -150,10 +150,16 @@ extern void *read_object_file_extended(struct repository *r, const struct object_id *oid, enum object_type *type, unsigned long *size, int lookup_replace); -static inline void *read_object_file(const struct object_id *oid, enum object_type *type, unsigned long *size) +static inline void *repo_read_object_file(struct repository *r, + const struct object_id *oid, + enum object_type *type, + unsigned long *size) { - return read_object_file_extended(the_repository, oid, type, size, 1); + return read_object_file_extended(r, oid, type, size, 1); } +#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS +#define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size) +#endif /* Read and unpack an object file into memory, write memory to an object file */ int oid_object_info(struct repository *r, const struct object_id *, unsigned long *); -- 2.19.0
[PATCH 01/19] sha1_file: allow read_object to read objects in arbitrary repositories
Allow read_object (a file local functon in sha1_file) to handle arbitrary repositories by passing the repository down to oid_object_info_extended. Signed-off-by: Stefan Beller --- sha1-file.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/sha1-file.c b/sha1-file.c index 308d5e20e2..647068a836 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1361,7 +1361,9 @@ int oid_object_info(struct repository *r, return type; } -static void *read_object(const unsigned char *sha1, enum object_type *type, +static void *read_object(struct repository *r, +const unsigned char *sha1, +enum object_type *type, unsigned long *size) { struct object_id oid; @@ -1373,7 +1375,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type, hashcpy(oid.hash, sha1); - if (oid_object_info_extended(the_repository, &oid, &oi, 0) < 0) + if (oid_object_info_extended(r, &oid, &oi, 0) < 0) return NULL; return content; } @@ -1414,7 +1416,7 @@ void *read_object_file_extended(const struct object_id *oid, lookup_replace_object(the_repository, oid) : oid; errno = 0; - data = read_object(repl->hash, type, size); + data = read_object(the_repository, repl->hash, type, size); if (data) return data; @@ -1755,7 +1757,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime) if (has_loose_object(oid)) return 0; - buf = read_object(oid->hash, &type, &len); + buf = read_object(the_repository, oid->hash, &type, &len); if (!buf) return error(_("cannot read sha1_file for %s"), oid_to_hex(oid)); hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", type_name(type), len) + 1; -- 2.19.0
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Thu, Oct 11, 2018 at 09:35:28PM +0100, Rafael Ascensão wrote: > On Thu, Oct 11, 2018 at 01:51:36PM -0400, Jeff King wrote: > > Yeah, I agree. > > Not sure which parts you meant, so I'll assume you didn't agree > with me. Correct. ;) I like your general idea, but I agree with Daniel that it introduces an inconsistency in the interface. > I doesn't seem far fetched to ask for an overview of my current branch, > feature1, feature2 and all hotfixes with something like: > > $ git branch --verbose --list HEAD feature1 feature2 hotfix-*; > > The 'what's my current branch' could be just a particular case of this > form. Right, I like that part. It's just that putting "HEAD" there already has a meaning: it would find refs/heads/HEAD. Now I'll grant that's a bad name for a branch (and the source of other confusions, and I think perhaps even something a few commands actively discourage these days). > My defense to treat HEAD specially comes in the form that from the user > perspective, HEAD is already being resolved to a commit when HEAD is > detached (Showing the detached at message.) > > Is there a strong reason to *not* "resolve" HEAD when it is attached? > Would it be that bad to have some DWIM behaviour here? After all, as > HEAD is an invalid name for a branch, nothing would ever match it > anyways. I don't think this is about resolving HEAD, or showing it. It's about the fact that arguments to "branch" are currently always branch-names, not full refs. -Peff
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On 10/11/18 10:35 PM, Rafael Ascensão wrote: > The output of the proposed command is also a bit inconsistent with the > usual output given by git branch, specifically the space alignment on > the left, color and * marker. The proposed command therefore takes a new switch. It's definitely not perfect, but doesn't give the existing --list new and different behavior. > At this stage it's closer to what I would expect from > $git rev-parse --abbrev-ref HEAD; The proposal is largely to have similar output to that command, yes. I expect that "show current branch" is something that's available in the branch command, even completely disregarding questions of whether it's API stable, etc.
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Thu, Oct 11, 2018 at 06:36:02PM +0200, Daniels Umanovskis wrote: > I am not a fan because it would be yet another inconsistency in the Git > command interface. The output of the proposed command is also a bit inconsistent with the usual output given by git branch, specifically the space alignment on the left, color and * marker. In addition to not respecting --color, it also ignores --verbose and --format. At this stage it's closer to what I would expect from $git rev-parse --abbrev-ref HEAD; than something coming out of $git branch; Resolving HEAD makes it consistent with rest. On Thu, Oct 11, 2018 at 01:51:36PM -0400, Jeff King wrote: > Yeah, I agree. Not sure which parts you meant, so I'll assume you didn't agree with me. I doesn't seem far fetched to ask for an overview of my current branch, feature1, feature2 and all hotfixes with something like: $ git branch --verbose --list HEAD feature1 feature2 hotfix-*; The 'what's my current branch' could be just a particular case of this form. My defense to treat HEAD specially comes in the form that from the user perspective, HEAD is already being resolved to a commit when HEAD is detached (Showing the detached at message.) Is there a strong reason to *not* "resolve" HEAD when it is attached? Would it be that bad to have some DWIM behaviour here? After all, as HEAD is an invalid name for a branch, nothing would ever match it anyways. Thanks for the input. :) -- Cheers Rafael Ascensão
[PATCH v2 3/4] subtree: use commits before rejoins for splits
From: "Strain, Roger L" Adds recursive evaluation of parent commits which were not part of the initial commit list when performing a split. Split expects all relevant commits to be reachable from the target commit but not reachable from any previous rejoins. However, a branch could be based on a commit prior to a rejoin, then later merged back into the current code. In this case, a parent to the commit will not be present in the initial list of commits, trigging an "incorrect order" warning. Previous behavior was to consider that commit to have no parent, creating an original commit containing all subtree content. This commit is not present in an existing subtree commit graph, changing commit hashes and making pushing to a subtree repo impossible. New behavior will recursively check these unexpected parent commits to track them back to either an earlier rejoin, or a true original commit. The generated synthetic commits will properly match previously-generated commits, allowing successful pushing to a prior subtree repo. Signed-off-by: Strain, Roger L --- contrib/subtree/git-subtree.sh | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index d8861f306..eef4199ae 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -231,12 +231,14 @@ cache_miss () { } check_parents () { - missed=$(cache_miss "$@") + missed=$(cache_miss "$1") + local indent=$(($2 + 1)) for miss in $missed do if ! test -r "$cachedir/notree/$miss" then debug " incorrect order: $miss" + process_split_commit "$miss" "" "$indent" fi done } @@ -606,8 +608,20 @@ ensure_valid_ref_format () { process_split_commit () { local rev="$1" local parents="$2" - revcount=$(($revcount + 1)) - progress "$revcount/$revmax ($createcount)" + local indent=$3 + + if test $indent -eq 0 + then + revcount=$(($revcount + 1)) + else + # processing commit without normal parent information; + # fetch from repo + parents=$(git log --pretty=%P -n 1 "$rev") + extracount=$(($extracount + 1)) + fi + + progress "$revcount/$revmax ($createcount) [$extracount]" + debug "Processing commit: $rev" exists=$(cache_get "$rev") if test -n "$exists" @@ -617,14 +631,13 @@ process_split_commit () { fi createcount=$(($createcount + 1)) debug " parents: $parents" + check_parents "$parents" "$indent" newparents=$(cache_get $parents) debug " newparents: $newparents" tree=$(subtree_for_commit "$rev" "$dir") debug " tree is: $tree" - check_parents $parents - # ugly. is there no better way to tell if this is a subtree # vs. a mainline commit? Does it matter? if test -z "$tree" @@ -744,10 +757,11 @@ cmd_split () { revmax=$(eval "$grl" | wc -l) revcount=0 createcount=0 + extracount=0 eval "$grl" | while read rev parents do - process_split_commit "$rev" "$parents" + process_split_commit "$rev" "$parents" 0 done || exit $? latest_new=$(cache_get latest_new) -- 2.19.1
[PATCH v2 0/4] Multiple subtree split fixes regarding complex repos
After doing some testing at scale, determined that one call was taking too long; replaced that with an alternate call which returns the same data significantly faster. Also, if anyone has any other feedback on these I'd really love to hear it. It's working better for us (as in, it actually generates a compatible tree version to version) but still isn't perfect, and I'm not sure perfect is achievable, but want to make sure this doesn't things for anyone else. Changes since v1: diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index 1c157dbd9..7dd643998 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -633,7 +633,7 @@ process_split_commit () { else # processing commit without normal parent information; # fetch from repo - parents=$(git show -s --pretty=%P "$rev") + parents=$(git log --pretty=%P -n 1 "$rev") extracount=$(($extracount + 1)) fi Strain, Roger L (4): subtree: refactor split of a commit into standalone method subtree: make --ignore-joins pay attention to adds subtree: use commits before rejoins for splits subtree: improve decision on merges kept in split contrib/subtree/git-subtree.sh | 129 + 1 file changed, 83 insertions(+), 46 deletions(-) -- 2.19.1
[PATCH v2 1/4] subtree: refactor split of a commit into standalone method
From: "Strain, Roger L" In a particularly complex repo, subtree split was not creating compatible splits for pushing back to a separate repo. Addressing one of the issues requires recursive handling of parent commits that were not initially considered by the algorithm. This commit makes no functional changes, but relocates the code to be called recursively into a new method to simply comparisons of later commits. Signed-off-by: Strain, Roger L --- contrib/subtree/git-subtree.sh | 78 ++ 1 file changed, 42 insertions(+), 36 deletions(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index d3f39a862..2cd7b345b 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -598,6 +598,47 @@ ensure_valid_ref_format () { die "'$1' does not look like a ref" } +process_split_commit () { + local rev="$1" + local parents="$2" + revcount=$(($revcount + 1)) + progress "$revcount/$revmax ($createcount)" + debug "Processing commit: $rev" + exists=$(cache_get "$rev") + if test -n "$exists" + then + debug " prior: $exists" + return + fi + createcount=$(($createcount + 1)) + debug " parents: $parents" + newparents=$(cache_get $parents) + debug " newparents: $newparents" + + tree=$(subtree_for_commit "$rev" "$dir") + debug " tree is: $tree" + + check_parents $parents + + # ugly. is there no better way to tell if this is a subtree + # vs. a mainline commit? Does it matter? + if test -z "$tree" + then + set_notree "$rev" + if test -n "$newparents" + then + cache_set "$rev" "$rev" + fi + return + fi + + newrev=$(copy_or_skip "$rev" "$tree" "$newparents") || exit $? + debug " newrev is: $newrev" + cache_set "$rev" "$newrev" + cache_set latest_new "$newrev" + cache_set latest_old "$rev" +} + cmd_add () { if test -e "$dir" then @@ -706,42 +747,7 @@ cmd_split () { eval "$grl" | while read rev parents do - revcount=$(($revcount + 1)) - progress "$revcount/$revmax ($createcount)" - debug "Processing commit: $rev" - exists=$(cache_get "$rev") - if test -n "$exists" - then - debug " prior: $exists" - continue - fi - createcount=$(($createcount + 1)) - debug " parents: $parents" - newparents=$(cache_get $parents) - debug " newparents: $newparents" - - tree=$(subtree_for_commit "$rev" "$dir") - debug " tree is: $tree" - - check_parents $parents - - # ugly. is there no better way to tell if this is a subtree - # vs. a mainline commit? Does it matter? - if test -z "$tree" - then - set_notree "$rev" - if test -n "$newparents" - then - cache_set "$rev" "$rev" - fi - continue - fi - - newrev=$(copy_or_skip "$rev" "$tree" "$newparents") || exit $? - debug " newrev is: $newrev" - cache_set "$rev" "$newrev" - cache_set latest_new "$newrev" - cache_set latest_old "$rev" + process_split_commit "$rev" "$parents" done || exit $? latest_new=$(cache_get latest_new) -- 2.19.1
[PATCH v2 2/4] subtree: make --ignore-joins pay attention to adds
From: "Strain, Roger L" Changes the behavior of --ignore-joins to always consider a subtree add commit, and ignore only splits and squashes. The --ignore-joins option is documented to ignore prior --rejoin commits. However, it additionally ignored subtree add commits generated when a subtree was initially added to a repo. Due to the logic which determines whether a commit is a mainline commit or a subtree commit (namely, the presence or absence of content in the subtree prefix) this causes commits before the initial add to appear to be part of the subtree. An --ignore-joins split would therefore consider those commits part of the subtree history and include them at the beginning of the synthetic history, causing the resulting hashes to be incorrect for all later commits. Signed-off-by: Strain, Roger L --- contrib/subtree/git-subtree.sh | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index 2cd7b345b..d8861f306 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -340,7 +340,12 @@ find_existing_splits () { revs="$2" main= sub= - git log --grep="^git-subtree-dir: $dir/*\$" \ + local grep_format="^git-subtree-dir: $dir/*\$" + if test -n "$ignore_joins" + then + grep_format="^Add '$dir/' from commit '" + fi + git log --grep="$grep_format" \ --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs | while read a b junk do @@ -730,12 +735,7 @@ cmd_split () { done fi - if test -n "$ignore_joins" - then - unrevs= - else - unrevs="$(find_existing_splits "$dir" "$revs")" - fi + unrevs="$(find_existing_splits "$dir" "$revs")" # We can't restrict rev-list to only $dir here, because some of our # parents have the $dir contents the root, and those won't match. -- 2.19.1
[PATCH v2 4/4] subtree: improve decision on merges kept in split
From: "Strain, Roger L" When multiple identical parents are detected for a commit being considered for copying, explicitly check whether one is the common merge base between the commits. If so, the other commit can be used as the identical parent; if not, a merge must be performed to maintain history. In some situations two parents of a merge commit may appear to both have identical subtree content with each other and the current commit. However, those parents can potentially come from different commit graphs. Previous behavior would simply select one of the identical parents to serve as the replacement for this commit, based on the order in which they were processed. New behavior compares the merge base between the commits to determine if a new merge commit is necessary to maintain history despite the identical content. Signed-off-by: Strain, Roger L --- contrib/subtree/git-subtree.sh | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index eef4199ae..7dd643998 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -541,6 +541,7 @@ copy_or_skip () { nonidentical= p= gotparents= + copycommit= for parent in $newparents do ptree=$(toptree_for_commit $parent) || exit $? @@ -548,7 +549,24 @@ copy_or_skip () { if test "$ptree" = "$tree" then # an identical parent could be used in place of this rev. - identical="$parent" + if test -n "$identical" + then + # if a previous identical parent was found, check whether + # one is already an ancestor of the other + mergebase=$(git merge-base $identical $parent) + if test "$identical" = "$mergebase" + then + # current identical commit is an ancestor of parent + identical="$parent" + elif test "$parent" != "$mergebase" + then + # no common history; commit must be copied + copycommit=1 + fi + else + # first identical parent detected + identical="$parent" + fi else nonidentical="$parent" fi @@ -571,7 +589,6 @@ copy_or_skip () { fi done - copycommit= if test -n "$identical" && test -n "$nonidentical" then extras=$(git rev-list --count $identical..$nonidentical) -- 2.19.1
Re: [PATCH 10/15] rebase-interactive: use todo_list_transform() in edit_todo_list()
Le 11/10/2018 à 17:16, Phillip Wood a écrit : > On 07/10/2018 20:54, Alban Gruin wrote: >> Just like complete_action(), edit_todo_list() used a >> function (transform_todo_file()) that read the todo-list from the disk >> and wrote it back, resulting in useless disk accesses. >> >> This changes edit_todo_list() to call directly todo_list_transform() >> instead. >> >> Signed-off-by: Alban Gruin >> --- >> rebase-interactive.c | 40 +++- >> 1 file changed, 19 insertions(+), 21 deletions(-) >> >> diff --git a/rebase-interactive.c b/rebase-interactive.c >> index 7c7f720a3d..f42d48e192 100644 >> --- a/rebase-interactive.c >> +++ b/rebase-interactive.c >> @@ -78,39 +78,37 @@ void append_todo_help(unsigned edit_todo, unsigned >> keep_empty, >> int edit_todo_list(unsigned flags) >> { >> - struct strbuf buf = STRBUF_INIT; >> const char *todo_file = rebase_path_todo(); >> + struct todo_list todo_list = TODO_LIST_INIT; >> + int res = 0; >> - if (strbuf_read_file(&buf, todo_file, 0) < 0) >> + if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0) >> return error_errno(_("could not read '%s'."), todo_file); >> - strbuf_stripspace(&buf, 1); >> - if (write_message(buf.buf, buf.len, todo_file, 0)) { >> - strbuf_release(&buf); >> - return -1; >> - } >> - >> - strbuf_release(&buf); >> + strbuf_stripspace(&todo_list.buf, 1); >> + if (!todo_list_parse_insn_buffer(todo_list.buf.buf, &todo_list)) >> + todo_list_transform(&todo_list, flags | TODO_LIST_SHORTEN_IDS); >> - transform_todo_file(flags | TODO_LIST_SHORTEN_IDS); >> - >> - if (strbuf_read_file(&buf, todo_file, 0) < 0) >> - return error_errno(_("could not read '%s'."), todo_file); >> + append_todo_help(1, 0, &todo_list.buf); >> - append_todo_help(1, 0, &buf); > > I think this patch is fine, I was just wondering if you meant to move > the call to append_todo_help() above the blank line? > No > Best Wishes > > Phillip > Cheers, Alban
[PATCH] Per-commit and per-parent filters for 2 parents
Signed-off-by: Jonathan Tan --- > One main benefit of storing on Bloom filter per commit is to avoid > recomputing hashes at every commit. Currently, this patch only improves > locality when checking membership at the cost of taking up more space. > Drop the dependence on the parent oid and then we can save the time > spent hashing during history queries. I've removed the hashing of the parent OID here and tried having per-parent and per-commit hashes for the first 2 parents of each commit instead of only 1, thus doubling the filter size. The results are not much of an improvement though: bloom filter total queries: 66409 definitely not: 56424 maybe: 9985 false positives: 9099 fp ratio: 0.137015 0:01.17 --- bloom-filter.c | 25 - bloom-filter.h | 4 ++-- commit-graph.c | 13 - revision.c | 11 +-- 4 files changed, 27 insertions(+), 26 deletions(-) diff --git a/bloom-filter.c b/bloom-filter.c index 39b453908f..10c73c45ae 100644 --- a/bloom-filter.c +++ b/bloom-filter.c @@ -11,7 +11,7 @@ void bloom_filter_init(struct bloom_filter *bf, uint32_t commit_nr, uint32_t bit bf->nr_entries = 0; bf->commit_nr = commit_nr; bf->bit_size = bit_size; - bf->bits = xcalloc(1, commit_nr * bit_size / CHAR_BIT); + bf->bits = xcalloc(1, 2 * commit_nr * bit_size / CHAR_BIT); } void bloom_filter_free(struct bloom_filter *bf) @@ -22,24 +22,24 @@ void bloom_filter_free(struct bloom_filter *bf) } -static void bloom_filter_set_bits(struct bloom_filter *bf, uint32_t graph_pos, const uint32_t *offsets, +static void bloom_filter_set_bits(struct bloom_filter *bf, uint32_t graph_pos, int parent_index, const uint32_t *offsets, int nr_offsets, int nr_entries) { int i; for (i = 0; i < nr_offsets; i++) { - uint32_t byte_offset = (offsets[i] % bf->bit_size + graph_pos * bf->bit_size) / CHAR_BIT; + uint32_t byte_offset = (offsets[i] % bf->bit_size + (2 * graph_pos + parent_index) * bf->bit_size) / CHAR_BIT; unsigned char mask = 1 << offsets[i] % CHAR_BIT; bf->bits[byte_offset] |= mask; } bf->nr_entries += nr_entries; } -static int bloom_filter_check_bits(struct bloom_filter *bf, uint32_t graph_pos, const uint32_t *offsets, +static int bloom_filter_check_bits(struct bloom_filter *bf, uint32_t graph_pos, int parent_index, const uint32_t *offsets, int nr) { int i; for (i = 0; i < nr; i++) { - uint32_t byte_offset = (offsets[i] % bf->bit_size + graph_pos * bf->bit_size) / CHAR_BIT; + uint32_t byte_offset = (offsets[i] % bf->bit_size + (2 * graph_pos + parent_index) * bf->bit_size) / CHAR_BIT; unsigned char mask = 1 << offsets[i] % CHAR_BIT; if (!(bf->bits[byte_offset] & mask)) return 0; @@ -48,19 +48,18 @@ static int bloom_filter_check_bits(struct bloom_filter *bf, uint32_t graph_pos, } -void bloom_filter_add_hash(struct bloom_filter *bf, uint32_t graph_pos, const unsigned char *hash) +void bloom_filter_add_hash(struct bloom_filter *bf, uint32_t graph_pos, int parent_index, const unsigned char *hash) { uint32_t offsets[GIT_MAX_RAWSZ / sizeof(uint32_t)]; hashcpy((unsigned char*)offsets, hash); - bloom_filter_set_bits(bf, graph_pos, offsets, -the_hash_algo->rawsz / sizeof(*offsets), 1); + bloom_filter_set_bits(bf, graph_pos, parent_index, offsets, the_hash_algo->rawsz / sizeof(*offsets), 1); } -int bloom_filter_check_hash(struct bloom_filter *bf, uint32_t graph_pos, const unsigned char *hash) +int bloom_filter_check_hash(struct bloom_filter *bf, uint32_t graph_pos, int parent_index, const unsigned char *hash) { uint32_t offsets[GIT_MAX_RAWSZ / sizeof(uint32_t)]; hashcpy((unsigned char*)offsets, hash); - return bloom_filter_check_bits(bf, graph_pos, offsets, + return bloom_filter_check_bits(bf, graph_pos, parent_index, offsets, the_hash_algo->rawsz / sizeof(*offsets)); } @@ -87,8 +86,8 @@ int bloom_filter_load(struct bloom_filter *bf) read_in_full(fd, &bf->bit_size, sizeof(bf->bit_size)); if (bf->bit_size % CHAR_BIT) BUG("invalid size for bloom filter"); - bf->bits = xmalloc(bf->commit_nr * bf->bit_size / CHAR_BIT); - read_in_full(fd, bf->bits, bf->commit_nr * bf->bit_size / CHAR_BIT); + bf->bits = xmalloc(2 * bf->commit_nr * bf->bit_size / CHAR_BIT); + read_in_full(fd, bf->bits, 2 * bf->commit_nr * bf->bit_size / CHAR_BIT); close(fd); @@ -102,7 +101,7 @@ void bloom_filter_write(struct bloom_filter *bf) write_in_full(fd, &bf->nr_entries, sizeof(bf->nr_entries)); write_in_full(fd, &bf->commit_nr, sizeof(bf->commit_nr)); write_in_full(fd, &bf->bit_size, sizeof(bf->bit_size)); -
[PATCH] doc: move git-get-tar-commit-id to plumbing
This is definitely a low-level command, it's hard to argue against it belonging in plumbing. Signed-off-by: Daniels Umanovskis --- command-list.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/command-list.txt b/command-list.txt index c36ea3c18..966705358 100644 --- a/command-list.txt +++ b/command-list.txt @@ -96,7 +96,7 @@ git-for-each-refplumbinginterrogators git-format-patchmainporcelain git-fsckancillaryinterrogators complete git-gc mainporcelain -git-get-tar-commit-id ancillaryinterrogators +git-get-tar-commit-id plumbinginterrogators git-grepmainporcelain info git-gui mainporcelain git-hash-object plumbingmanipulators -- 2.19.1.330.g93276587c.dirty
[PATCH] doc: move git-cherry to plumbing
Also remove git-cherry from Bash completion because plumbing commands do not belong there. Signed-off-by: Daniels Umanovskis --- Up to discussion whether cherry should be considered plumbing. I lean towards considering it a rarely-used porcelain command, but a case could be made either way so let's see what the list thinks. command-list.txt | 2 +- contrib/completion/git-completion.bash | 11 --- 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/command-list.txt b/command-list.txt index c36ea3c18..bdca6e3d3 100644 --- a/command-list.txt +++ b/command-list.txt @@ -62,7 +62,7 @@ git-check-mailmap purehelpers git-checkoutmainporcelain history git-checkout-index plumbingmanipulators git-check-ref-formatpurehelpers -git-cherry ancillaryinterrogators complete +git-cherry plumbinginterrogators complete git-cherry-pick mainporcelain git-citool mainporcelain git-clean mainporcelain diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index d63d2dffd..12f7ce0c5 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1340,17 +1340,6 @@ _git_checkout () esac } -_git_cherry () -{ - case "$cur" in - --*) - __gitcomp_builtin cherry - return - esac - - __git_complete_refs -} - __git_cherry_pick_inprogress_options="--continue --quit --abort" _git_cherry_pick () -- 2.19.1.330.g93276587c.dirty
Re: [PATCH v3 1/2] commit-graph write: add progress output
On Wed, Oct 10 2018, Ævar Arnfjörð Bjarmason wrote: > On Wed, Oct 10 2018, SZEDER Gábor wrote: > >> On Wed, Oct 10, 2018 at 11:56:45PM +0200, Ævar Arnfjörð Bjarmason wrote: >>> On Wed, Oct 10 2018, SZEDER Gábor wrote: >> >>> >> for (i = 0; i < oids->nr; i++) { >>> >> +display_progress(progress, ++j); >>> >>> [...] >>> >>> > This display_progress() call, however, doesn't seem to be necessary. >>> > First, it counts all commits for a second time, resulting in the ~2x >>> > difference compared to the actual number of commits, and then causing >>> > my confusion. Second, all what this loop is doing is setting a flag >>> > in commits that were already looked up and parsed in the above loops. >>> > IOW this loop is very fast, and the progress indicator jumps from >>> > ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a >>> > progress indicator at all. >>> >>> You're right, I tried this patch on top: >> >> [...] >> >>> And on a large repo with around 3 million commits the 3rd progress bar >>> doesn't kick in. >>> >>> But if I apply this on top: >>> >> [...] >>> >>> I.e. start the timer after 1/4 of a second instead of 1 second, I get >>> that progress bar. >>> >>> So I'm inclined to keep it. It just needs to be 4x the size before it's >>> noticeably hanging for 1 second. >> >> Just to clarify: are you worried about a 1 second hang in an approx. 12 >> million commit repository? If so, then I'm unconvinced, that's not >> even a blip on the radar, and the misleading numbers are far worse. > > It's not a blip on the runtime, but the point of these progress bars in > general is so we don't have a UI where there's no UI differnce between > git hanging and just doing work in some tight loop in the background, > and even 1 second when you're watching something is noticeable if it > stalls. > > Also it's 1 second on a server where I had 128G of RAM. I think even a > "trivial" flag change like this would very much change if e.g. the > system was under memory pressure or was swapping. > > And as noted code like this tends to change over time, that loop might > get more expensive, so let's future proof by having all the loops over N > call the progress code. > > When I wrote this the intent was just "report progress". So that it's > counting anything is just an implementation detail of how progress.c > works now. > > This was the reference to Duy's patch, i.e. instead of spewing numbers > at the user here let's just render a spinner. Then we no longer need to > make judgement calls about which loop over N is expensive right now, and > which one isn't, and if any of them will result in reporting a 2N number > while the user might be more familiar with or expecting N. > >>> That repo isn't all that big compared to what we've heard about out >>> there, and inner loops like this have a tendency to accumulate some more >>> code over time without a re-visit of why we weren't monitoring progress >>> there. >>> >>> But maybe we can fix the message. We say "Annotating commits in commit >>> grap", not "Counting" or whatever. I was trying to find something that >>> didn't imply that we were doing this once. One can annotate a thing more >>> than once, but maybe ther's a better way to explain this... >> >> IMO just remove it. Hrm, actually reading this again your initial post says we end up with a 2x difference v.s. the number of commits, but it's actually 3x. The loop that has a rather trivial runtime comparatively is the 3x, but the 2x loop takes a notable amount of time. So e.g. on git.git: $ git rev-list --all | wc -l; ~/g/git/git commit-graph write 166678 Annotating commits in commit graph: 518463, done. Computing commit graph generation numbers: 100% (172685/172685), done.
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Thu, Oct 11, 2018 at 07:29:58PM +0200, Daniels Umanovskis wrote: > > Without passing the &flag argument, I do not think there is a > > reliable way to ask resolve_ref_unsafe() if "HEAD" is a symbolic > > ref. > > If I'm reading the code correctly, resolve_ref_unsafe() will return > "HEAD" or NULL if there's no symbolic reference, so anything else would > indicate a symref, but even in that case checking the flag explicitly is > definitely better to clearly show intent. Yes, that matches my understanding, too. -Peff
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Thu, Oct 11, 2018 at 06:36:02PM +0200, Daniels Umanovskis wrote: > On 10/11/18 5:43 PM, Rafael Ascensão wrote: > > I agree it feels a bit out of place, and still think that > > > > $ git branch --list HEAD > > > > would be a good candidate to be taught how to print the current branch. > > I am not a fan because it would be yet another inconsistency in the Git > command interface. An argument given after git branch --list means a > pattern for the branches to list. Making HEAD print the current branch > would be an exception to what an argument in that place means. Yes, HEAD > itself is a very special string in git, but I'm not a fan of syntax > where a specific argument value does something very different from any > other value in that place. Yeah, I agree. If we were to go this route, it should probably be: git branch --list-head Which sounds a lot like what you are proposing, but I think the name implies more strongly "show --list, but only for the HEAD". I.e., for the detached case, show the "HEAD detached at..." text. -Peff
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On 10/11/18 8:54 AM, Junio C Hamano wrote: > Is it a normal situation to have refname==NULL, or is it something > worth reporting as an error? Looks like that would be in the case of looping symrefs or file backend failure, so seems a good idea to die() in that case. > Without passing the &flag argument, I do not think there is a > reliable way to ask resolve_ref_unsafe() if "HEAD" is a symbolic > ref. If I'm reading the code correctly, resolve_ref_unsafe() will return "HEAD" or NULL if there's no symbolic reference, so anything else would indicate a symref, but even in that case checking the flag explicitly is definitely better to clearly show intent. Will soon reply with v3 cleaning up the suggested patch accordingly.
Re: [PATCH 08/15] sequencer: change complete_action() to use the refactored functions
Le 11/10/2018 à 15:51, Phillip Wood a écrit : > On 07/10/2018 20:54, Alban Gruin wrote: >> + if (rewrite_file(todo_file, new_todo.buf.buf, new_todo.buf.len) < >> 0) { >> + todo_list_release(&new_todo); >> + return error_errno(_("could not write '%s'"), todo_file); >> + } > > rewrite_file() can truncate the old version of the file if there is an > error when writing the new version, I think it would be better to use > write_message() instead as that atomically updates the file. The same > applies to patch 5 (refactor rearrange_squash()) after which I think > there will be no callers to rewrite_file() so it can be deleted. You’re right, I didn’t notice that. > > Best Wishes > > Phillip > Cheers, Alban
Re: [PATCH 04/15] sequencer: refactor sequencer_add_exec_commands() to work on a todo_list
Hi Phillip, thanks for taking the time to review my patches. Le 11/10/2018 à 13:25, Phillip Wood a écrit : > On 07/10/2018 20:54, Alban Gruin wrote: >> @@ -4419,15 +4406,38 @@ int sequencer_add_exec_commands(const char >> *commands) >> } >> /* insert or append final */ >> - if (insert >= 0 && insert < todo_list.nr) >> - strbuf_insert(buf, todo_list.items[insert].offset_in_buf + >> + if (insert >= 0 && insert < todo_list->nr) >> + strbuf_insert(buf, todo_list->items[insert].offset_in_buf + >> offset, commands, commands_len); >> else if (insert >= 0 || !offset) >> strbuf_add(buf, commands, commands_len); >> - i = write_message(buf->buf, buf->len, todo_file, 0); >> + if (todo_list_parse_insn_buffer(buf->buf, todo_list)) >> + BUG("unusable todo list");} > > It is a shame to have to re-parse the todo list, I wonder how difficult > it would be to adjust the todo_list item array as the exec commands are > inserted. The same applies to the next couple of patches > Good question. This function inserts an `exec' command after every `pick' command. These commands are stored in a dynamically allocated list, grew with ALLOW_GROW(). If we want to keep the current structure, we would have to grow the size of the list by 1 and move several element to the end every time we want to add an `exec' command. It would not be very effective. Perhaps I should use a linked list here, instead. It may also work well with rearrange_squash() and skip_unnecessary_picks(). Maybe we could even get rid of the strbuf at some point. > Best Wishes > > Phillip > Cheers, Alban
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On 10/11/18 5:43 PM, Rafael Ascensão wrote: > I agree it feels a bit out of place, and still think that > > $ git branch --list HEAD > > would be a good candidate to be taught how to print the current branch. I am not a fan because it would be yet another inconsistency in the Git command interface. An argument given after git branch --list means a pattern for the branches to list. Making HEAD print the current branch would be an exception to what an argument in that place means. Yes, HEAD itself is a very special string in git, but I'm not a fan of syntax where a specific argument value does something very different from any other value in that place.
Re: [PATCH v3 7/7] revision.c: refactor basic topo-order logic
On 10/11/2018 11:35 AM, Jeff King wrote: On Fri, Sep 21, 2018 at 10:39:36AM -0700, Derrick Stolee via GitGitGadget wrote: From: Derrick Stolee When running a command like 'git rev-list --topo-order HEAD', Git performed the following steps: [...] In the new algorithm, these three steps correspond to three different commit walks. We run these walks simultaneously, A minor nit, but this commit message doesn't mention the most basic thing up front: that its main purpose is to introduce a new algorithm for topo-order. ;) It's obvious in the context of reviewing the series, but somebody reading "git log" later may want a bit more. Perhaps: revision.c: implement generation-based topo-order algorithm as a subject, and/or an introductory paragraph like: The current --topo-order algorithm requires walking all commits we are going to output up front, topo-sorting them, all before outputting the first value. This patch introduces a new algorithm which uses stored generation numbers to incrementally walk in topo-order, outputting commits as we go. Other than that, I find this to be a wonderfully explanatory commit message. :) Good idea. I'll make that change. The walks are as follows: 1. EXPLORE: using the explore_queue priority queue (ordered by maximizing the generation number), parse each reachable commit until all commits in the queue have generation number strictly lower than needed. During this walk, update the UNINTERESTING flags as necessary. OK, this makes sense. If we know that everybody else in our queue is at generation X, then it is safe to output a commit at generation greater than X. I think this by itself would allow us to implement "show no parents before all of its children are shown", right? But --topo-order promises a bit more: "avoid showing commits no multiple lines of history intermixed". I guess also INFINITY generation numbers need more. For a real generation number, we know that "gen(A) == gen(B)" implies that there is no ancestry relationship between the two. But not so for INFINITY. Yeah, to deal with INFINITY (and ZERO, but that won't happen if generation_numbers_enabled() returns true), we treat gen(A) == gen(B) as a "no information" state. So, to output a commit at generation X, we need to have our maximum generation number in the unexplored area to be at most X - 1. You'll see strict inequality when checking generations. 2. INDEGREE: using the indegree_queue priority queue (ordered by maximizing the generation number), add one to the in- degree of each parent for each commit that is walked. Since we walk in order of decreasing generation number, we know that discovering an in-degree value of 0 means the value for that commit was not initialized, so should be initialized to two. (Recall that in-degree value "1" is what we use to say a commit is ready for output.) As we iterate the parents of a commit during this walk, ensure the EXPLORE walk has walked beyond their generation numbers. I wondered how this would work for INFINITY. We can't know the order of a bunch of INFINITY nodes at all, so we never know when their in-degree values are "done". But if I understand the EXPLORE walk, we'd basically walk all of INFINITY down to something with a real generation number. Is that right? But after that, I'm not totally clear on why we need this INDEGREE walk. The INDEGREE walk is an important element for Kahn's algorithm. The final output order is dictated by peeling commits of "indegree zero" to ensure all children are output before their parents. (Note: since we use literal 0 to mean "uninitialized", we peel commits when the indegree slab has value 1.) This walk replaces the indegree logic from sort_in_topological_order(). That method performs one walk that fills the indegree slab, then another walk that peels the commits with indegree 0 and inserts them into a list. 3. TOPO: using the topo_queue priority queue (ordered based on the sort_order given, which could be commit-date, author- date, or typical topo-order which treats the queue as a LIFO stack), remove a commit from the queue and decrement the in-degree of each parent. If a parent has an in-degree of one, then we add it to the topo_queue. Before we decrement the in-degree, however, ensure the INDEGREE walk has walked beyond that generation number. OK, this makes sense to make --author-date-order, etc, work. Potentially those numbers might have no relationship at all with the graph structure, but we promise "no parent before its children are shown", so this is really just a tie-breaker after the topo-sort anyway. As long as steps 1 and 2 are correct and produce a complete set of commits for one "layer", this should be OK. I guess I'm not 100% convinced that we don't have a case where we haven't yet parsed or considered some commit that we know cannot have an ancestry relationship with commits
Re: [PATCH v2 1/1] branch: introduce --show-current display option
On Wed, Oct 10, 2018 at 08:34:40PM -0400, Jeff King wrote: > It just seems like in its current form it might be in an uncanny valley > where it is not quite scriptable plumbing, but not as informative as > other porcelain. I agree it feels a bit out of place, and still think that $ git branch --list HEAD would be a good candidate to be taught how to print the current branch. I suggested this in the previous iteration but either got lost in the noise or was uninteresting. If the latter, I would love to receive feedback on it. https://public-inbox.org/git/20181010142423.GA3390@rigel/ Something like the following (not meant as a real patch), would show the current branch when attached, (HEAD detached at hash) when detached, and nothing if unborn branch. This will also keep the current formatting git branch uses (which is sliglty harder to parse). I view it as a plus. Otherwise people will eventually start parsing it instead of using the recommended plumbing. -- Cheers Rafael Ascensão -- >8 -- diff --git a/builtin/branch.c b/builtin/branch.c index b67593288c..78a3de526c 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -684,6 +684,17 @@ int cmd_branch(int argc, const char **argv, const char *prefix) if ((filter.kind & FILTER_REFS_BRANCHES) && filter.detached) filter.kind |= FILTER_REFS_DETACHED_HEAD; filter.name_patterns = argv; + + while (*argv) { + if (!strcmp(*argv, "HEAD")) { + const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, NULL); + skip_prefix(refname, "refs/heads/", &refname); + filter.name_patterns[argv - filter.name_patterns] = refname; + break; + } + argv++; + } + /* * If no sorting parameter is given then we default to sorting * by 'refname'. This would give us an alphabetically sorted
Re: [PATCH v3 7/7] revision.c: refactor basic topo-order logic
On Fri, Sep 21, 2018 at 10:39:36AM -0700, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > When running a command like 'git rev-list --topo-order HEAD', > Git performed the following steps: > [...] > In the new algorithm, these three steps correspond to three > different commit walks. We run these walks simultaneously, A minor nit, but this commit message doesn't mention the most basic thing up front: that its main purpose is to introduce a new algorithm for topo-order. ;) It's obvious in the context of reviewing the series, but somebody reading "git log" later may want a bit more. Perhaps: revision.c: implement generation-based topo-order algorithm as a subject, and/or an introductory paragraph like: The current --topo-order algorithm requires walking all commits we are going to output up front, topo-sorting them, all before outputting the first value. This patch introduces a new algorithm which uses stored generation numbers to incrementally walk in topo-order, outputting commits as we go. Other than that, I find this to be a wonderfully explanatory commit message. :) > The walks are as follows: > > 1. EXPLORE: using the explore_queue priority queue (ordered by >maximizing the generation number), parse each reachable >commit until all commits in the queue have generation >number strictly lower than needed. During this walk, update >the UNINTERESTING flags as necessary. OK, this makes sense. If we know that everybody else in our queue is at generation X, then it is safe to output a commit at generation greater than X. I think this by itself would allow us to implement "show no parents before all of its children are shown", right? But --topo-order promises a bit more: "avoid showing commits no multiple lines of history intermixed". I guess also INFINITY generation numbers need more. For a real generation number, we know that "gen(A) == gen(B)" implies that there is no ancestry relationship between the two. But not so for INFINITY. > 2. INDEGREE: using the indegree_queue priority queue (ordered >by maximizing the generation number), add one to the in- >degree of each parent for each commit that is walked. Since >we walk in order of decreasing generation number, we know >that discovering an in-degree value of 0 means the value for >that commit was not initialized, so should be initialized to >two. (Recall that in-degree value "1" is what we use to say a >commit is ready for output.) As we iterate the parents of a >commit during this walk, ensure the EXPLORE walk has walked >beyond their generation numbers. I wondered how this would work for INFINITY. We can't know the order of a bunch of INFINITY nodes at all, so we never know when their in-degree values are "done". But if I understand the EXPLORE walk, we'd basically walk all of INFINITY down to something with a real generation number. Is that right? But after that, I'm not totally clear on why we need this INDEGREE walk. > 3. TOPO: using the topo_queue priority queue (ordered based on >the sort_order given, which could be commit-date, author- >date, or typical topo-order which treats the queue as a LIFO >stack), remove a commit from the queue and decrement the >in-degree of each parent. If a parent has an in-degree of >one, then we add it to the topo_queue. Before we decrement >the in-degree, however, ensure the INDEGREE walk has walked >beyond that generation number. OK, this makes sense to make --author-date-order, etc, work. Potentially those numbers might have no relationship at all with the graph structure, but we promise "no parent before its children are shown", so this is really just a tie-breaker after the topo-sort anyway. As long as steps 1 and 2 are correct and produce a complete set of commits for one "layer", this should be OK. I guess I'm not 100% convinced that we don't have a case where we haven't yet parsed or considered some commit that we know cannot have an ancestry relationship with commits we are outputting. But it may have an author-date-order relationship. (I'm not at all convinced that this _is_ a problem, and I suspect it isn't; I'm only suggesting I haven't fully grokked the proof). > --- > object.h | 4 +- > revision.c | 196 +++-- > revision.h | 2 + > 3 files changed, 194 insertions(+), 8 deletions(-) I'll pause here on evaluating the actual code. It looks sane from a cursory read, but there's no point in digging further until I'm sure I fully understand the algorithm. I think that needs a little more brain power from me, and hopefully discussion around my comments above will help trigger that. -Peff
Re: [PATCH 10/15] rebase-interactive: use todo_list_transform() in edit_todo_list()
On 07/10/2018 20:54, Alban Gruin wrote: Just like complete_action(), edit_todo_list() used a function (transform_todo_file()) that read the todo-list from the disk and wrote it back, resulting in useless disk accesses. This changes edit_todo_list() to call directly todo_list_transform() instead. Signed-off-by: Alban Gruin --- rebase-interactive.c | 40 +++- 1 file changed, 19 insertions(+), 21 deletions(-) diff --git a/rebase-interactive.c b/rebase-interactive.c index 7c7f720a3d..f42d48e192 100644 --- a/rebase-interactive.c +++ b/rebase-interactive.c @@ -78,39 +78,37 @@ void append_todo_help(unsigned edit_todo, unsigned keep_empty, int edit_todo_list(unsigned flags) { - struct strbuf buf = STRBUF_INIT; const char *todo_file = rebase_path_todo(); + struct todo_list todo_list = TODO_LIST_INIT; + int res = 0; - if (strbuf_read_file(&buf, todo_file, 0) < 0) + if (strbuf_read_file(&todo_list.buf, todo_file, 0) < 0) return error_errno(_("could not read '%s'."), todo_file); - strbuf_stripspace(&buf, 1); - if (write_message(buf.buf, buf.len, todo_file, 0)) { - strbuf_release(&buf); - return -1; - } - - strbuf_release(&buf); + strbuf_stripspace(&todo_list.buf, 1); + if (!todo_list_parse_insn_buffer(todo_list.buf.buf, &todo_list)) + todo_list_transform(&todo_list, flags | TODO_LIST_SHORTEN_IDS); - transform_todo_file(flags | TODO_LIST_SHORTEN_IDS); - - if (strbuf_read_file(&buf, todo_file, 0) < 0) - return error_errno(_("could not read '%s'."), todo_file); + append_todo_help(1, 0, &todo_list.buf); - append_todo_help(1, 0, &buf); I think this patch is fine, I was just wondering if you meant to move the call to append_todo_help() above the blank line? Best Wishes Phillip - if (write_message(buf.buf, buf.len, todo_file, 0)) { - strbuf_release(&buf); + if (write_message(todo_list.buf.buf, todo_list.buf.len, todo_file, 0)) { + todo_list_release(&todo_list); return -1; } - strbuf_release(&buf); - - if (launch_sequence_editor(todo_file, NULL, NULL)) + strbuf_reset(&todo_list.buf); + if (launch_sequence_editor(todo_file, &todo_list.buf, NULL)) { + todo_list_release(&todo_list); return -1; + } - transform_todo_file(flags & ~(TODO_LIST_SHORTEN_IDS)); + if (!todo_list_parse_insn_buffer(todo_list.buf.buf, &todo_list)) { + todo_list_transform(&todo_list, flags & ~(TODO_LIST_SHORTEN_IDS)); + res = write_message(todo_list.buf.buf, todo_list.buf.len, todo_file, 0); + } - return 0; + todo_list_release(&todo_list); + return res; } define_commit_slab(commit_seen, unsigned char);
Re: [PATCH v3 5/7] commit/revisions: bookkeeping before refactoring
On Fri, Sep 21, 2018 at 10:39:33AM -0700, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > There are a few things that need to move around a little before > making a big refactoring in the topo-order logic: > > 1. We need access to record_author_date() and >compare_commits_by_author_date() in revision.c. These are used >currently by sort_in_topological_order() in commit.c. > > 2. Moving these methods to commit.h requires adding the author_slab >definition to commit.h. The overall goal makes sense. Do we really need to define the whole slab in the header file? We're going to end up with multiple copies of the functions, since they're declared static in each file that includes commit.h. >From what's here, I think you could get away with just: struct author_date_slab; void record_author_date(struct author_date_slab *author_date, struct commit *commit); in the header file. But presumably callers would eventually want to allocate their own author dates. If that's all we need, then these days you can do: declare_commit_slab(author_date, timestamp_t); to get the type declaration. If they really do need the functions accessible outside of commit.c, then perhaps: define_shared_commit_slab(author_date, timestamp_t); in commit.h, and: implement_shared_commit_slab(author_date, timestamp_t); in commit.c (the type repetition is not too bad, as the compiler would catch any mistakes). The only downside of this approach is that we're less likely to be able to inline element access (though "peek" is big enough that I'm not sure it ends up inlined anyway). > 3. The add_parents_to_list() method in revision.c performs logic >around the UNINTERESTING flag and other special cases depending >on the struct rev_info. Allow this method to ignore a NULL 'list' >parameter, as we will not be populating the list for our walk. So now you can add_parents_to_list() without a list? That sounds confusing. :) Is it possible to split the function into two? Some handle_uninteresting_parents() logic, and then an add_parents_to_list() that calls that, but also adds to the list? A cursory look at the function suggests it's actually kind of tricky. Perhaps as an alternative, add_parents_to_list() could just get a more descriptive name? > --- > commit.c | 11 --- > commit.h | 8 > revision.c | 6 -- > 3 files changed, 16 insertions(+), 9 deletions(-) The patch itself seems straight-forward based on those explanations. -Peff
Re: [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic
On Fri, Sep 21, 2018 at 10:39:32AM -0700, Derrick Stolee via GitGitGadget wrote: > [..] > When setting revs->limited only because revs->topo_order is true, > only do so if generation numbers are not available. There is no > reason to use the new logic as it will behave similarly when all > generation numbers are INFINITY or ZERO. > > In prepare_revision_walk(), if we have revs->topo_order but not > revs->limited, then we trigger the new logic. It breaks the logic > into three pieces, to fit with the existing framework: Nicely explained. Your abstracted init/next/expand API seems sane, but of course the real test will be reading the later patches that make use of it. :) The patch matches my understanding of your explanation. -Peff
Re: [PATCH v3 3/7] test-reach: add rev-list tests
On Fri, Sep 21, 2018 at 10:39:30AM -0700, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > The rev-list command is critical to Git's functionality. Ensure it > works in the three commit-graph environments constructed in > t6600-test-reach.sh. Here are a few important types of rev-list > operations: > > * Basic: git rev-list --topo-order HEAD > * Range: git rev-list --topo-order compare..HEAD > * Ancestry: git rev-list --topo-order --ancestry-path compare..HEAD > * Symmetric Difference: git rev-list --topo-order compare...HEAD Makes sense. I'll assume you filled out all those "expect" blocks correctly. ;) -Peff
Re: [PATCH v3 2/7] test-reach: add run_three_modes method
On Fri, Sep 21, 2018 at 10:39:29AM -0700, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > The 'test_three_modes' method assumes we are using the 'test-tool > reach' command for our test. However, we may want to use the data > shape of our commit graph and the three modes (no commit-graph, > full commit-graph, partial commit-graph) for other git commands. > > Split test_three_modes to be a simple translation on a more general > run_three_modes method that executes the given command and tests > the actual output to the expected output. > > [...] > +test_three_modes () { > + run_three_modes test-tool reach "$@" > +} Makes sense. Sometimes in the test suite we want to be able to pass a whole shell snippet to eval, but unless we specifically need that for this series, running "$@" directly is simpler. -Peff
Re: [PATCH v3 1/7] prio-queue: add 'peek' operation
On Fri, Sep 21, 2018 at 10:39:27AM -0700, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee > > When consuming a priority queue, it can be convenient to inspect > the next object that will be dequeued without actually dequeueing > it. Our existing library did not have such a 'peek' operation, so > add it as prio_queue_peek(). Makes sense. > +void *prio_queue_peek(struct prio_queue *queue) > +{ > + if (!queue->nr) > + return NULL; > + if (!queue->compare) > + return queue->array[queue->nr - 1].data; > + return queue->array[0].data; The non-compare version of get() treats this like a LIFO, and you do the same here. Looks good. In theory get() could be implemented in terms of peek(), but the result is not actually shorter because we have to check those same conditions to decide how to remove the item anyway. > diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c > index 9807b649b1..e817bbf464 100644 > --- a/t/helper/test-prio-queue.c > +++ b/t/helper/test-prio-queue.c > @@ -22,9 +22,13 @@ int cmd__prio_queue(int argc, const char **argv) > struct prio_queue pq = { intcmp }; > > while (*++argv) { > - if (!strcmp(*argv, "get")) > - show(prio_queue_get(&pq)); > - else if (!strcmp(*argv, "dump")) { > + if (!strcmp(*argv, "get")) { > + void *peek = prio_queue_peek(&pq); > + void *get = prio_queue_get(&pq); > + if (peek != get) > + BUG("peek and get results do not match"); > + show(get); > + } else if (!strcmp(*argv, "dump")) { This is a nice cheap way of piggy-backing on the existing get tests. -Peff
Re: [PATCH 08/15] sequencer: change complete_action() to use the refactored functions
On 07/10/2018 20:54, Alban Gruin wrote: complete_action() used functions that read the todo-list file, made some changes to it, and wrote it back to the disk. The previous commits were dedicated to separate the part that deals with the file from the actual logic of these functions. Now that this is done, we can call directly the "logic" functions to avoid useless file access. Signed-off-by: Alban Gruin --- builtin/rebase--interactive.c | 13 +- sequencer.c | 76 +-- sequencer.h | 2 +- 3 files changed, 38 insertions(+), 53 deletions(-) diff --git a/builtin/rebase--interactive.c b/builtin/rebase--interactive.c index eef1ff2e83..0700339f90 100644 --- a/builtin/rebase--interactive.c +++ b/builtin/rebase--interactive.c @@ -71,7 +71,6 @@ static int do_interactive_rebase(struct replay_opts *opts, unsigned flags, const char *head_hash = NULL; char *revisions = NULL, *shortrevisions = NULL; struct argv_array make_script_args = ARGV_ARRAY_INIT; - FILE *todo_list_file; struct todo_list todo_list = TODO_LIST_INIT; if (prepare_branch_to_be_rebased(opts, switch_to)) @@ -94,14 +93,6 @@ static int do_interactive_rebase(struct replay_opts *opts, unsigned flags, if (!upstream && squash_onto) write_file(path_squash_onto(), "%s\n", squash_onto); - todo_list_file = fopen(rebase_path_todo(), "w"); - if (!todo_list_file) { - free(revisions); - free(shortrevisions); - - return error_errno(_("could not open %s"), rebase_path_todo()); - } - argv_array_pushl(&make_script_args, "", revisions, NULL); if (restrict_revision) argv_array_push(&make_script_args, restrict_revision); @@ -109,15 +100,13 @@ static int do_interactive_rebase(struct replay_opts *opts, unsigned flags, ret = sequencer_make_script(&todo_list.buf, make_script_args.argc, make_script_args.argv, flags); I think it would be clearer to parse the todo list here explicitly rather than doing it implicitly in complete_action() - fputs(todo_list.buf.buf, todo_list_file); - fclose(todo_list_file); if (ret) error(_("could not generate todo list")); else { discard_cache() >ret = complete_action(opts, flags, shortrevisions, onto_name, onto, - head_hash, cmd, autosquash); + head_hash, cmd, autosquash, &todo_list); } free(revisions); diff --git a/sequencer.c b/sequencer.c index dfb8d1c974..b37935e5ab 100644 --- a/sequencer.c +++ b/sequencer.c @@ -4624,93 +4624,89 @@ static int skip_unnecessary_picks(struct object_id *output_oid) return 0; } +static int todo_list_rearrange_squash(struct todo_list *todo_list); + int complete_action(struct replay_opts *opts, unsigned flags, const char *shortrevisions, const char *onto_name, const char *onto, const char *orig_head, const char *cmd, - unsigned autosquash) + unsigned autosquash, struct todo_list *todo_list) { const char *shortonto, *todo_file = rebase_path_todo(); - struct todo_list todo_list = TODO_LIST_INIT; - struct strbuf *buf = &(todo_list.buf); + struct todo_list new_todo = TODO_LIST_INIT; + struct strbuf *buf = &todo_list->buf; struct object_id oid; - struct stat st; + int command_count; get_oid(onto, &oid); shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV); - if (!lstat(todo_file, &st) && st.st_size == 0 && - write_message("noop\n", 5, todo_file, 0)) - return -1; + if (buf->len == 0) + strbuf_add(buf, "noop\n", 5); + + if (todo_list_parse_insn_buffer(buf->buf, todo_list)) + BUG("unusable todo list"); - if (autosquash && rearrange_squash_in_todo_file()) + if (autosquash && todo_list_rearrange_squash(todo_list)) return -1; if (cmd && *cmd) - sequencer_add_exec_commands(cmd); + todo_list_add_exec_commands(todo_list, cmd); - if (strbuf_read_file(buf, todo_file, 0) < 0) - return error_errno(_("could not read '%s'."), todo_file); - - if (todo_list_parse_insn_buffer(buf->buf, &todo_list)) { - todo_list_release(&todo_list); - return error(_("unusable todo list: '%s'"), todo_file); - } - - if (count_commands(&todo_list) == 0) { + command_count = count_commands(todo_list); + if (command_count == 0) { apply_autostash(opts); sequencer_remove_state(opts); - todo_list_release(&todo_list); return error(_("nothing to do")); } + todo_li
Re: Bloom Filters
On Thu, Oct 11, 2018 at 08:33:58AM -0400, Derrick Stolee wrote: > > I don't know if this is a fruitful path at all or not. I was mostly just > > satisfying my own curiosity on the bitmap encoding question. But I'll > > post the patches, just to show my work. The first one is the same > > initial proof of concept I showed earlier. > > > >[1/3]: initial tree-bitmap proof of concept > >[2/3]: test-tree-bitmap: add "dump" mode > >[3/3]: test-tree-bitmap: replace ewah with custom rle encoding > > > > Makefile| 1 + > > t/helper/test-tree-bitmap.c | 344 > > 2 files changed, 345 insertions(+) > > create mode 100644 t/helper/test-tree-bitmap.c > I'm trying to test this out myself, and am having trouble reverse > engineering how I'm supposed to test it. > > Looks like running "t/helper/test-tree-bitmap gen" will output a lot of > binary data. Where should I store that? Does any file work? Yeah, you can do: # optionally run with GIT_TRACE=1 to see some per-bitmap stats test-tree-bitmap gen >out # this should be roughly the same as: # git rev-list --all | # git diff-tree --stdin -t --name-only test-tree-bitmap dump Is this series just for the storage costs, assuming that we would replace > all TREESAME checks with a query into this database? Or do you have a way to > test how much this would improve a "git log -- " query? Right, I was just looking at storage cost here. It's not integrated with the diff code at all. I left some hypothetical numbers elsewhere in the thread. -Peff
Re: [PATCH 1/2] One filter per commit
On 10/10/2018 9:21 PM, Jonathan Tan wrote: diff --git a/commit-graph.c b/commit-graph.c index f415d3b41f..90b0b3df90 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -715,13 +715,11 @@ static int add_ref_to_list(const char *refname, static void add_changes_to_bloom_filter(struct bloom_filter *bf, struct commit *parent, struct commit *commit, + int index, struct diff_options *diffopt) { - unsigned char p_c_hash[GIT_MAX_RAWSZ]; int i; - hashxor(parent->object.oid.hash, commit->object.oid.hash, p_c_hash); - diff_tree_oid(&parent->object.oid, &commit->object.oid, "", diffopt); diffcore_std(diffopt); @@ -756,8 +754,8 @@ static void add_changes_to_bloom_filter(struct bloom_filter *bf, the_hash_algo->update_fn(&ctx, path, p - path); the_hash_algo->final_fn(name_hash, &ctx); - hashxor(name_hash, p_c_hash, hash); - bloom_filter_add_hash(bf, hash); + hashxor(name_hash, parent->object.oid.hash, hash); + bloom_filter_add_hash(bf, index, hash); } while (*p); diff_free_filepair(diff_queued_diff.queue[i]); [snip] @@ -768,11 +766,10 @@ static void add_changes_to_bloom_filter(struct bloom_filter *bf, } static void fill_bloom_filter(struct bloom_filter *bf, - struct progress *progress) + struct progress *progress, struct commit **commits, int commit_nr) { struct rev_info revs; const char *revs_argv[] = {NULL, "--all", NULL}; - struct commit *commit; int i = 0; /* We (re-)create the bloom filter from scratch every time for now. */ @@ -783,18 +780,19 @@ static void fill_bloom_filter(struct bloom_filter *bf, if (prepare_revision_walk(&revs)) die("revision walk setup failed while preparing bloom filter"); - while ((commit = get_revision(&revs))) { + for (i = 0; i < commit_nr; i++) { + struct commit *commit = commits[i]; struct commit_list *parent; for (parent = commit->parents; parent; parent = parent->next) - add_changes_to_bloom_filter(bf, parent->item, commit, + add_changes_to_bloom_filter(bf, parent->item, commit, i, &revs.diffopt); [snip] - hashxor(pi->name_hash, p_c_hash, hash); - if (bloom_filter_check_hash(&bf, hash)) { + hashxor(pi->name_hash, parent->object.oid.hash, hash); + if (bloom_filter_check_hash(&bf, commit->graph_pos, hash)) { /* * At least one of the interesting pathspecs differs, * so we can return early and let the diff machinery One main benefit of storing on Bloom filter per commit is to avoid recomputing hashes at every commit. Currently, this patch only improves locality when checking membership at the cost of taking up more space. Drop the dependence on the parent oid and then we can save the time spent hashing during history queries. -Stolee
Re: Bloom Filters
On 10/9/2018 7:12 PM, Jeff King wrote: On Tue, Oct 09, 2018 at 05:14:50PM -0400, Jeff King wrote: Hmph. It really sounds like we could do better with a custom RLE solution. But that makes me feel like I'm missing something, because surely I can't invent something better than the state of the art in a simple thought experiment, right? I know what I'm proposing would be quite bad for random access, but my impression is that EWAH is the same. For the scale of bitmaps we're talking about, I think linear/streaming access through the bitmap would be OK. Thinking on it more, what I was missing is that for truly dense random bitmaps, this will perform much worse. Because it will use a byte to say "there's one 1", rather than a bit. But I think it does OK in practice for the very sparse bitmaps we tend to see in this application. I was able to generate a complete output that can reproduce "log --name-status -t" for linux.git in 32MB. But: - 15MB of that is commit sha1s, which will be stored elsewhere in a "real" system - 5MB of that is path list (which should shrink by a factor of 10 with prefix compression, and is really a function of a tree size less than history depth) So the per-commit cost is not too bad. That's still not counting merges, though, which would add another 10-15% (or maybe more; their bitmaps are less sparse). I don't know if this is a fruitful path at all or not. I was mostly just satisfying my own curiosity on the bitmap encoding question. But I'll post the patches, just to show my work. The first one is the same initial proof of concept I showed earlier. [1/3]: initial tree-bitmap proof of concept [2/3]: test-tree-bitmap: add "dump" mode [3/3]: test-tree-bitmap: replace ewah with custom rle encoding Makefile| 1 + t/helper/test-tree-bitmap.c | 344 2 files changed, 345 insertions(+) create mode 100644 t/helper/test-tree-bitmap.c I'm trying to test this out myself, and am having trouble reverse engineering how I'm supposed to test it. Looks like running "t/helper/test-tree-bitmap gen" will output a lot of binary data. Where should I store that? Does any file work? Is this series just for the storage costs, assuming that we would replace all TREESAME checks with a query into this database? Or do you have a way to test how much this would improve a "git log -- " query? Thanks, -Stolee
Re: [PATCH v4 0/6] Fix the racy split index problem
On Thu, Oct 11, 2018 at 12:36:47PM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Oct 11 2018, SZEDER Gábor wrote: > > > Fourth and hopefully final round of fixing occasional test failures when > > run with 'GIT_TEST_SPLIT_INDEX=yes'. The only code change is the > > extraction of a helper function to compare two cache entries' content, > > and then a couple of minor log message clarifications. The range-diff > > below is rather clear on that. > > Looks good. I'm not going to run the stress test I did on v5 on this > since the changes are just moving existing code into a fuction, unless > you'd like me to or think there's a reason to that is. FWIW, I intend to carry this patch below and use it in tests both locally and on Travis CI. I could never trigger any of those three conditions by repeated test runs with 'GIT_TEST_SPLIT_INDEX=yes', or by deliberately constructing tricky scenarios where they might be triggered. Perhaps with enough time I'll get lucky eventually :) If it's not too much trouble, then perhaps you could pick it up as well? While testing previous versions of this series it turned out that your setup has much more "luck" in finding problematic timings than mine. diff --git a/split-index.c b/split-index.c index 5820412dc5..4af535e236 100644 --- a/split-index.c +++ b/split-index.c @@ -254,8 +254,8 @@ void prepare_to_write_split_index(struct index_state *istate) continue; } if (ce->index > si->base->cache_nr) { - BUG("ce refers to a shared ce at %d, which is beyond the shared index size %d", - ce->index, si->base->cache_nr); + BUG("ce of '%s' refers to a shared ce at %d, which is beyond the shared index size %d", + ce->name, ce->index, si->base->cache_nr); } ce->ce_flags |= CE_MATCHED; /* or "shared" */ base = si->base->cache[ce->index - 1]; @@ -293,10 +293,9 @@ void prepare_to_write_split_index(struct index_state *istate) continue; } if (ce->ce_namelen != base->ce_namelen || - strcmp(ce->name, base->name)) { - ce->index = 0; - continue; - } + strcmp(ce->name, base->name)) + BUG("ce of '%s' refers to the shared ce of a different file '%s'", + ce->name, base->name); /* * This is the copy of a cache entry that is present * in the shared index, created by unpack_trees() @@ -332,7 +331,8 @@ void prepare_to_write_split_index(struct index_state *istate) * set CE_UPDATE_IN_BASE as well. */ if (compare_ce_content(ce, base)) - ce->ce_flags |= CE_UPDATE_IN_BASE; + BUG("ce of '%s' differs from its shared ce, but the CE_UPDATE_IN_BASE flag was not set", + ce->name); } discard_cache_entry(base); si->base->cache[ce->index - 1] = ce;