Re: [RFC PATCH] checkout: Force matching mtime between files
On Fri, Apr 13, 2018 at 07:01:29PM +0200, Michał Górny wrote: > --- a/entry.c > +++ b/entry.c > @@ -411,6 +411,7 @@ int checkout_entry(struct cache_entry *ce, > { > static struct strbuf path = STRBUF_INIT; > struct stat st; > + int ret; > > if (topath) > return write_entry(ce, topath, state, 1); mgorny: Should the topath case trigger utime as well? Other questions: - Would there be be any value in hoisting the utime change into write_entry's finish block rather than having it in checkout_entry? - Should mtimes on directories be set if the directory is explicitly created? - Maybe using futimens on supported platforms? -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 signature.asc Description: Digital signature
Re: GSoC students and mentors in 2018
Hi Paul, On Wed, 25 Apr 2018, Paul-Sebastian Ungureanu wrote: > It is a pleasure and an honor for me to take part in Google Summer of > Code. I am sure it will be a exciting summer and I will definitely give > 100% to successfully fulfill 'git stash' project! I am excited, too, and really glad to have you on board. Here's to a splendid summer! Dscho
Re: [PATCH] git: add -N as a short option for --no-pager
Am 25.04.2018 um 02:05 schrieb Junio C Hamano: Johannes Sixt writes: It is not uncommon to request that the output remains visible in the terminal. For this, the option --no-pager can be used. But it is a bit cumbersome to type, even when command completion is available. Provide a short option, -N, to make the option easier accessible. Signed-off-by: Johannes Sixt --- Heh, I used to append "|cat", which is four keystrokes that is a bit shorter than " --no-pager", but that is only acceptable when you do not care about colored output ;-) I am not absolutely certain about the choice of a single letter. I already checked we do not use "git -N cmd" for anything else right now, so I am certain about the availability, but I am not sure if capital 'N' is the best choice, when the other side is lowercase 'p' (and more importantly, the other side 'p' has mneomonic value for 'pagination', but 'N' merely stands for 'no' and could be negating anything, not related to pagination). But I agree that a short-hand would be welcome. I considered -P, but thought that it would better be reserved for something related to "paths". We have --{html,man,info}-path, which would be served better with -[HMI]. That leaves --exec-path, which we would probably abbreviate as -x or -X. So, -P is perhaps not that bad after all. We could also choose +p, for which there is some precedent in other POSIX tools to mean negated -p, but not in git, I think. Implementationwise, it is not that trivial, either, because the section handling options is guarded by a check that the word begins with a dash. Perhaps --no-pager means also "unpaginated" (-u, -U), "linear" (-l, -L), "streamed" (-s, -S). Other ideas, anyone? diff --git a/Documentation/git.txt b/Documentation/git.txt index 4767860e72..17b50b0dc6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -11,7 +11,7 @@ SYNOPSIS [verse] 'git' [--version] [--help] [-C ] [-c =] [--exec-path[=]] [--html-path] [--man-path] [--info-path] -[-p|--paginate|--no-pager] [--no-replace-objects] [--bare] +[-p|--paginate|-N|--no-pager] [--no-replace-objects] [--bare] [--git-dir=] [--work-tree=] [--namespace=] [--super-prefix=] []
Re: [PATCH] git: add -N as a short option for --no-pager
On Wed, Apr 25, 2018 at 09:05:56AM +0900, Junio C Hamano wrote: > Johannes Sixt writes: > > > In modern setups, less, the pager, uses alternate screen to show > > the content. When it is closed, it switches back to the original > > screen, and all content is gone. > > > > It is not uncommon to request that the output remains visible in > > the terminal. For this, the option --no-pager can be used. But > > it is a bit cumbersome to type, even when command completion is > > available. Provide a short option, -N, to make the option easier > > accessible. > > > > Signed-off-by: Johannes Sixt > > --- > > Heh, I used to append "|cat", which is four keystrokes that is a bit > shorter than " --no-pager", but that is only acceptable when you do > not care about colored output ;-) > > I am not absolutely certain about the choice of a single letter. I > already checked we do not use "git -N cmd" for anything else right > now, so I am certain about the availability, but I am not sure if > capital 'N' is the best choice, when the other side is lowercase 'p' > (and more importantly, the other side 'p' has mneomonic value for > 'pagination', but 'N' merely stands for 'no' and could be negating > anything, not related to pagination). But I agree that a short-hand > would be welcome. > I'm quite fond of the notation "-p-", but that would set a precedent for all other "--no-" options. Maybe the option parser could be enhanced to allow for both? Thanks, Beat > > diff --git a/Documentation/git.txt b/Documentation/git.txt > > index 4767860e72..17b50b0dc6 100644 > > --- a/Documentation/git.txt > > +++ b/Documentation/git.txt > > @@ -11,7 +11,7 @@ SYNOPSIS > > [verse] > > 'git' [--version] [--help] [-C ] [-c =] > > [--exec-path[=]] [--html-path] [--man-path] [--info-path] > > -[-p|--paginate|--no-pager] [--no-replace-objects] [--bare] > > +[-p|--paginate|-N|--no-pager] [--no-replace-objects] [--bare] > > [--git-dir=] [--work-tree=] [--namespace=] > > [--super-prefix=] > > [] > > @@ -103,6 +103,7 @@ foo.bar= ...`) sets `foo.bar` to the empty string which > > `git config > > configuration options (see the "Configuration Mechanism" section > > below). > > > > +-N:: > > --no-pager:: > > Do not pipe Git output into a pager. signature.asc Description: PGP signature
Re: [PATCH 25/41] builtin/receive-pack: avoid hard-coded constants for push certs
On 25 April 2018 at 04:00, brian m. carlson wrote: > On Tue, Apr 24, 2018 at 11:58:17AM +0200, Martin Ågren wrote: >> On 24 April 2018 at 01:39, brian m. carlson >> > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c >> > index c4272fbc96..5f35596c14 100644 >> > --- a/builtin/receive-pack.c >> > +++ b/builtin/receive-pack.c >> > @@ -454,21 +454,21 @@ static void hmac_sha1(unsigned char *out, >> > /* RFC 2104 2. (6) & (7) */ >> > git_SHA1_Init(&ctx); >> > git_SHA1_Update(&ctx, k_opad, sizeof(k_opad)); >> > - git_SHA1_Update(&ctx, out, 20); >> > + git_SHA1_Update(&ctx, out, GIT_SHA1_RAWSZ); >> > git_SHA1_Final(out, &ctx); >> > } >> >> Since we do HMAC with SHA-1, we use the functions `git_SHA1_foo()`. Ok. >> But then why not just use "20"? Isn't GIT_SHA1_RAWSZ coupled to the >> whole hash transition thing? This use of "20" is not, IMHO, the "length >> in bytes [...] of an object name" (quoting cache.h). > > Originally, GIT_SHA1_RAWSZ was a good stand-in for the hard-coded uses > of 20 (and GIT_SHA1_HEXSZ for 40) for object IDs. Recently, we've > started moving toward using the_hash_algo for the object ID-specific > hash values, so I've started using those constants only to identify > SHA-1 specific items. > > In this case, using the constant makes it more obvious that what we're > passing is indeed an SHA-1 hash. It also makes it easier to find all > the remaining instances of "20" in the codebase and analyze them > accordingly. > > I agree that this isn't an object name strictly, but it's essentially > equivalent. If you feel strongly, I can leave this the way it is. I see. So one could say that in the ideal end-game, GIT_SHA1_RAWSZ would be gone when the oid-hash-transition is over. Except since we also use SHA-1 for other stuff than object IDs, the real-world ideal end-game is that we only have a few users lingering in places that have nothing to do with oid, but only with SHA-1 (and maybe in the gluing for calculating SHA-1 oids..). I do not feel strongly about this. I was just surprised to see it. Thank you for explaining this. Martin
Re: [PATCH 25/41] builtin/receive-pack: avoid hard-coded constants for push certs
On Tue, Apr 24, 2018 at 11:58:17AM +0200, Martin Ågren wrote: > On 24 April 2018 at 01:39, brian m. carlson > > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c > > index c4272fbc96..5f35596c14 100644 > > --- a/builtin/receive-pack.c > > +++ b/builtin/receive-pack.c > > @@ -454,21 +454,21 @@ static void hmac_sha1(unsigned char *out, > > /* RFC 2104 2. (6) & (7) */ > > git_SHA1_Init(&ctx); > > git_SHA1_Update(&ctx, k_opad, sizeof(k_opad)); > > - git_SHA1_Update(&ctx, out, 20); > > + git_SHA1_Update(&ctx, out, GIT_SHA1_RAWSZ); > > git_SHA1_Final(out, &ctx); > > } > > Since we do HMAC with SHA-1, we use the functions `git_SHA1_foo()`. Ok. > But then why not just use "20"? Isn't GIT_SHA1_RAWSZ coupled to the > whole hash transition thing? This use of "20" is not, IMHO, the "length > in bytes [...] of an object name" (quoting cache.h). Originally, GIT_SHA1_RAWSZ was a good stand-in for the hard-coded uses of 20 (and GIT_SHA1_HEXSZ for 40) for object IDs. Recently, we've started moving toward using the_hash_algo for the object ID-specific hash values, so I've started using those constants only to identify SHA-1 specific items. In this case, using the constant makes it more obvious that what we're passing is indeed an SHA-1 hash. It also makes it easier to find all the remaining instances of "20" in the codebase and analyze them accordingly. I agree that this isn't an object name strictly, but it's essentially equivalent. If you feel strongly, I can leave this the way it is. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: Fetching tags overwrites existing tags
On Tue, Apr 24, 2018 at 5:52 PM, Junio C Hamano wrote: > Wink Saville writes: > >> Ideally I would have liked the tags fetched from gbenchmark to have a prefix >> of gbenchmark/, like the branches have, maybe something like: >> >> $ git fetch --tags gbenchmark >> ... >> * [new branch] v2 -> gbenchmark/v2 >> * [new tag] v0.0.9 -> gbenchmark/v0.0.9 >> * [new tag] v0.1.0 -> gbenchmark/v0.1.0 >> * [new tag] v1.0.0 -> gbenchmark/v1.0.0 >> * [new tag] v1.1.0 -> gbenchmark/v1.1.0 >> * [new tag] v1.2.0 -> gbenchmark/v1.2.0 >> * [new tag] v1.3.0 -> gbenchmark/v1.3.0 >> * [new tag] v1.4.0 -> gbenchmark/v1.4.0 > > The tag namespace (refs/tags/) is considered a shared resource (I am > not saying that that is the only valid world model---I am merely > explaining why things are like they are), hence the auto-following > tags will bring them to refs/tags/ (and I do not think there is no > way to configure auto-following to place them elsewhere). > > But you could configure things yourself. > > $ git init victim && cd victim > $ git remote add origin ../git.git > $ git config --add remote.origin.fetch \ > "+refs/tags/*:refs/remote-tags/origin/*" > $ tail -n 4 .git/config > [remote "origin"] > url = ../git.git/ > fetch = +refs/heads/*:refs/remotes/origin/* > fetch = +refs/tags/*:refs/remote-tags/origin/* > $ git fetch --no-tags > > The "--no-tags" option serves to decline the auto-following tags to > refs/tags/ hierarchy; once your repository is configured this way, > your initial and subsequent "git fetch" will copy refs it finds in > refs/tags/ hierarchy over there to your refs/remote-tags/origin/ > hierarchy locally. Interesting that kinda works, what about teaching git-remote to understand a "--prefix-tags" option which would create the "fetch = refs/tags/*:refs/remote-tags/origin" entry. And teach git-fetch to use that entry if it exists and not require the "--no-tags"? Of course I'm sure there are "lots" of other things to change, doing a search for "remotes/" gives the following: $ find . -type f -name '*.c' | xargs grep '\bremotes/' | sort -uk1,1 ./builtin/branch.c: fmt = "refs/remotes/%s"; ./builtin/checkout.c: skip_prefix(argv0, "remotes/", &argv0); ./builtin/clone.c: strbuf_addf(&branch_top, "refs/remotes/%s/", option_origin); ./builtin/describe.c: !skip_prefix(path, "refs/remotes/", &path_to_match)) { ./builtin/fast-export.c: "refs/remotes/", ./builtin/fetch.c: else if (starts_with(rm->name, "refs/remotes/")) { ./builtin/merge.c: if (starts_with(found_ref, "refs/remotes/")) { ./builtin/pull.c: * refs/heads/ to refs/remotes//. ./builtin/remote.c: strbuf_addf(tmp, "refs/heads/%s:refs/remotes/%s/%s", ./builtin/rev-parse.c: handle_ref_opt(arg, "refs/remotes/"); ./builtin/show-branch.c: if (!starts_with(refname, "refs/remotes/")) ./contrib/examples/builtin-fetch--tool.c: else if (!strncmp(remote_name, "refs/remotes/", 13)) { ./help.c: if (skip_prefix(refname, "refs/remotes/", &remote) && ./log-tree.c: else if (starts_with(refname, "refs/remotes/")) ./ref-filter.c:skip_prefix(refname, "refs/remotes/", &refname) || ./refs.c: return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data); ./remote.c: FILE *f = fopen_or_warn(git_path("remotes/%s", remote->name), "r"); ./revision.c: for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb); ./sha1_name.c: starts_with(refname, "refs/remotes/")) ./wt-status.c: skip_prefix(from, "refs/remotes/", &from); -- wink
Re: Fetching tags overwrites existing tags
On Tue, Apr 24, 2018 at 5:52 PM, Junio C Hamano wrote: > Wink Saville writes: > >> Ideally I would have liked the tags fetched from gbenchmark to have a prefix >> of gbenchmark/, like the branches have, maybe something like: >> >> $ git fetch --tags gbenchmark >> ... >> * [new branch] v2 -> gbenchmark/v2 >> * [new tag] v0.0.9 -> gbenchmark/v0.0.9 >> * [new tag] v0.1.0 -> gbenchmark/v0.1.0 >> * [new tag] v1.0.0 -> gbenchmark/v1.0.0 >> * [new tag] v1.1.0 -> gbenchmark/v1.1.0 >> * [new tag] v1.2.0 -> gbenchmark/v1.2.0 >> * [new tag] v1.3.0 -> gbenchmark/v1.3.0 >> * [new tag] v1.4.0 -> gbenchmark/v1.4.0 > > The tag namespace (refs/tags/) is considered a shared resource (I am > not saying that that is the only valid world model---I am merely > explaining why things are like they are), hence the auto-following > tags will bring them to refs/tags/ (and I do not think there is no > way to configure auto-following to place them elsewhere). > > But you could configure things yourself. > > $ git init victim && cd victim > $ git remote add origin ../git.git > $ git config --add remote.origin.fetch \ > "+refs/tags/*:refs/remote-tags/origin/*" > $ tail -n 4 .git/config > [remote "origin"] > url = ../git.git/ > fetch = +refs/heads/*:refs/remotes/origin/* > fetch = +refs/tags/*:refs/remote-tags/origin/* > $ git fetch --no-tags > > The "--no-tags" option serves to decline the auto-following tags to > refs/tags/ hierarchy; once your repository is configured this way, > your initial and subsequent "git fetch" will copy refs it finds in > refs/tags/ hierarchy over there to your refs/remote-tags/origin/ > hierarchy locally. It should be noted, that remote-tags would not be integrated into "git tag" or many other places in git commands, so it may be significantly less visible. Thanks, Jake
Re: [PATCH 21/41] http: eliminate hard-coded constants
On Wed, Apr 25, 2018 at 08:44:19AM +0900, Junio C Hamano wrote: > Martin Ågren writes: > > >> switch (data[i]) { > >> case 'P': > >> i++; > >> - if (i + 52 <= buf.len && > >> + if (i + hexsz + 12 <= buf.len && > >> starts_with(data + i, " pack-") && > >> - starts_with(data + i + 46, ".pack\n")) { > >> - get_sha1_hex(data + i + 6, sha1); > >> - fetch_and_setup_pack_index(packs_head, > >> sha1, > >> + starts_with(data + i + hexsz + 6, ".pack\n")) { > >> + get_sha1_hex(data + i + 6, hash); > >> + fetch_and_setup_pack_index(packs_head, > >> hash, > >> base_url); > >> i += 51; > > > > s/51/hexsz + 11/ ? > > Quite right. Good point. Will fix. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH v2 0/3] Some add-on patches on top of dj/runtime-prefix
Johannes Schindelin writes: > Thank you, and sorry for the trouble. I am just too used to a Continuous > Integration setting with exactly one integration branch. Fixing problems close to the source (i.e. picking an appropriately aged base) and making sure everthing works near the tip ala CI style are not opposing goals. It just takes an extra step (i.e. trial merge and testing the result) and discipline. Until one gets used to do it so much that one can do in one's sleep, that is ;-) > I will make an effort in the future to figure out the best base branch for > patches that do not apply cleanly on `master` but require more stuff from > `next`/`pu`. The easiest is to leave that to the maintainer most of the time, as that is what maintainers do. Thanks. I really want to see that the runtime prefix stuff mature enough during this cycle, so these follow-up patches are all very much appreciated.
Re: Fetching tags overwrites existing tags
Wink Saville writes: > Ideally I would have liked the tags fetched from gbenchmark to have a prefix > of gbenchmark/, like the branches have, maybe something like: > > $ git fetch --tags gbenchmark > ... > * [new branch] v2 -> gbenchmark/v2 > * [new tag] v0.0.9 -> gbenchmark/v0.0.9 > * [new tag] v0.1.0 -> gbenchmark/v0.1.0 > * [new tag] v1.0.0 -> gbenchmark/v1.0.0 > * [new tag] v1.1.0 -> gbenchmark/v1.1.0 > * [new tag] v1.2.0 -> gbenchmark/v1.2.0 > * [new tag] v1.3.0 -> gbenchmark/v1.3.0 > * [new tag] v1.4.0 -> gbenchmark/v1.4.0 The tag namespace (refs/tags/) is considered a shared resource (I am not saying that that is the only valid world model---I am merely explaining why things are like they are), hence the auto-following tags will bring them to refs/tags/ (and I do not think there is no way to configure auto-following to place them elsewhere). But you could configure things yourself. $ git init victim && cd victim $ git remote add origin ../git.git $ git config --add remote.origin.fetch \ "+refs/tags/*:refs/remote-tags/origin/*" $ tail -n 4 .git/config [remote "origin"] url = ../git.git/ fetch = +refs/heads/*:refs/remotes/origin/* fetch = +refs/tags/*:refs/remote-tags/origin/* $ git fetch --no-tags The "--no-tags" option serves to decline the auto-following tags to refs/tags/ hierarchy; once your repository is configured this way, your initial and subsequent "git fetch" will copy refs it finds in refs/tags/ hierarchy over there to your refs/remote-tags/origin/ hierarchy locally.
Re: [PATCH v3 09/11] technical/shallow: describe the relationship with replace refs
"Philip Oakley" writes: > Perhaps something like: > +$GIT_DIR/shallow, and handle its contents similar to replace > +refs (with the difference that shallow does not actually > +create those replace refs) with the difference that shallow commits will > +always have their parents not present. I am not sure if there is enough similarity to replace mechanism to mention that. It has lines of text, each of which records a commit object for which Git is told to pretend that it has no parent. To those who are familiar with "graft" format, it is possible to explain the format as "it is similar to graft", as a line with a single commit object name tells Git to pretend that it has no parent in the "graft" format, but because we are getting rid of graft, it probably makes sense to just explain it without reference to replace (which may serve a similar purpose, but is certainly very far from "similar" as a mechanism when you explain how the contents of shallow is handled). $GIT_DIR/shallow lists commit object names and tells Git to pretend as if they are root commits (e.g. "git log" traversal stops after showing them; "git fsck" does not complain saying the commits listed on their "parent" lines do not exist).
git https and github
Hi list, I'm struggling with git connecting to Github. The problem might be SSL/TLS related. https://githubengineering.com/crypto-removal-notice/ I suspect that my setup still uses tlsv1 or tlsv1.1. I've tried to explicitly set git to use tlsv1.2 in my .gitconfig file like this: [http] sslVersion = tlsv1.2 I've tried to re-compile git with OpenSSL and GnuTLS. All give the same error. git clone https://github.com/OnionIoT/source.git Cloning into 'source'... * Couldn't find host github.com in the .netrc file; using defaults * Trying 192.30.253.112... * TCP_NODELAY set * Connected to github.com (192.30.253.112) port 443 (#0) * ALPN, offering http/1.1 * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH * successfully set certificate verify locations: * CAfile: /etc/ssl/certs/ca-certificates.crt CApath: /etc/ssl/certs * error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version * Curl_http_done: called premature == 1 * stopped the pause stream! * Closing connection 0 fatal: unable to access 'https://github.com/OnionIoT/source.git/': error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version lev@jive:~/git$ unset GIT_SSL_VERSION lev@jive:~/git$ git clone https://github.com/OnionIoT/source.git Cloning into 'source'... * Couldn't find host github.com in the .netrc file; using defaults * Trying 192.30.253.112... * TCP_NODELAY set * Connected to github.com (192.30.253.112) port 443 (#0) * ALPN, offering http/1.1 * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH * successfully set certificate verify locations: * CAfile: /etc/ssl/certs/ca-certificates.crt CApath: /etc/ssl/certs * error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version * Curl_http_done: called premature == 1 * stopped the pause stream! * Closing connection 0 fatal: unable to access 'https://github.com/OnionIoT/source.git/': error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version I can connect to other git servers without any error. This is a debian stable system with the following components: git version 2.11.0 libcurl 7.52.1 OpenSSL 1.0.2l Is there any way to know what is the exact protocol used? Are there any workaround, fix for this issue? Any help welcome. Thank you, Levente -- Levente Kovacs Senior Electronic Engineer W: http://levente.logonex.eu
Re: [PATCH 5/9] packfile: add repository argument to packed_object_info
Jonathan Tan writes: > On Mon, 23 Apr 2018 16:43:23 -0700 > Stefan Beller wrote: > >> diff --git a/sha1_file.c b/sha1_file.c >> index 93f25c6c6a..b292e04fd3 100644 >> --- a/sha1_file.c >> +++ b/sha1_file.c >> @@ -1307,7 +1307,8 @@ int oid_object_info_extended_the_repository(const >> struct object_id *oid, struct >> * information below, so return early. >> */ >> return 0; >> -rtype = packed_object_info(e.p, e.offset, oi); >> + >> +rtype = packed_object_info(the_repository, e.p, e.offset, oi); > > Extra blank line introduced. Yes, but from the way this function is formatted, I think the new paragraph break there makes sort of sense.
Re: [PATCH v2 0/2] add additional config settings for merge
Ben Peart writes: > diff.renameLimit:: > The number of files to consider when performing the copy/rename > - detection; equivalent to the 'git diff' option `-l`. > + detection; equivalent to the 'git diff' option `-l`. This setting > + has no effect if rename detection is turned off. You mean "turned off via diff.renames"? This is not meant as a suggestion to rewrite this paragraph further---but if the answer is "no", then that might be an indication that the sentence is inviting a misunderstanding. > diff.renames:: > Whether and how Git detects renames. If set to "false", > diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt > index 5a9ab969db..38492bcb98 100644 > --- a/Documentation/merge-config.txt > +++ b/Documentation/merge-config.txt > @@ -39,7 +39,8 @@ include::fmt-merge-msg-config.txt[] > merge.renameLimit:: > The number of files to consider when performing rename detection > during a merge; if not specified, defaults to the value of > - diff.renameLimit. > + diff.renameLimit. This setting has no effect if rename detection > + is turned off. Ditto. If your design is to make the merge machinery completely ignore diff.renames and only pay attention to merge.renames [*1*], then it probably is a good idea to be more specific here, by saying "... is turned off via ...", though. > merge.renames:: > Whether and how Git detects renames. If set to "false", [Footnote] *1* ...which I do not think is such a good idea, by the way. I'd personally expect merge.renames to allow overriding and falling back to diff.renames, just like the {merge,diff}.renameLimit pair does.
Re: [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option
On Tue, 24 Apr 2018 16:23:28 -0700 Stefan Beller wrote: > >> s/missmatch/mismatch/ > >> Also, what is this used for? > > > > The mismatch should be used for (thanks for catching!) > > checking if the remainder of a line is the same, although a boolean > > may be not the correct choice. We know that the two strings > > passed into compute_ws_delta come from a complete white space > > agnostic comparison, so consider: > > > > + SP SP more TAB more > > + SP SP text TAB text > > > > - SP more TAB more > > - SP text TAB text > > > > which would mark this as a moved block. This is the feature > > working as intended, but what about > > > > + SP SP more TAB more > > + SP SP text TAB text > > > > - SP more SP more > > - SP text TAB text > > > > Note how the length of the strings is the same, hence the current > > code of > > > > compute_ws_delta(...) { > > int d = longer->len - shorter->len; > > out->string = xmemdupz(longer->line, d); > > } > > > > would work fine and not notice the "change in the remainder". > > That ought to be caught by the mismatch variable, that > > is assigned, but not used. > > > > The compare_ws_delta(a, b) needs to be extended to > > > > !a->mismatch && !b->mismatch && existing_condition > > > > Ideally the example from above would be different depending > > on whether the other white space flags are given or not. Thanks - this gives me food for thought. I'm starting to think that it is impossible to avoid creating our own string comparison function that: - seeks to the first non-whitespace character in both strings - checks that both strings, from that first non-whitespace characters, are equal for some definition of equal (whether through strcmp or xdiff_compare_lines) - walks backwards from that first non-whitespace characters to look for the first non-matching whitespace character between the 2 strings The existing diff whitespace modes (to be passed to xdiff_compare_lines) do not give the exact result we want. For example, if XDF_IGNORE_WHITESPACE is used (as is in this patch), lines like "a b" and "ab " would match even though they shouldn't. This comparison function can be used both in moved_entry_cmp() and when constructing the ws_delta (in which case it should be made to output whatever information is needed as out parameters or similar). > >>> + if (diffopt->color_moved_ws_handling & > >>> COLOR_MOVED_DELTA_WHITESPACES) > >>> + /* > >>> + * As there is not specific white space config given, > >>> + * we'd need to check for a new block, so ignore all > >>> + * white space. The setup of the white space > >>> + * configuration for the next block is done else where > >>> + */ > >>> + flags |= XDF_IGNORE_WHITESPACE; > >>> + > >>> return !xdiff_compare_lines(a->es->line, a->es->len, > >>> b->es->line, b->es->len, > >>> flags); We can't add XDF_IGNORE_WHITESPACE here - as far as I can tell, this means that more lines will be treated as moved than the user wants (if the user did not set --color-moved-ignore-all-space). > >> It seems like pmb and wsd are parallel arrays - could each wsd be > >> embedded into the corresponding entry of pmb instead? > > > > I'll explore that. It sounds like a good idea for code hygiene. > > Although if you do not intend to use this feature, then keeping it separate > > would allow for a smaller footprint in memory. If you're worried about memory, wsd can be embedded as a pointer. > > The command is missing a --color-moved, as the > > --color-moved-whitespace-settings > > do not imply --color-moved, yet(?) Maybe one should imply the other (or at least a warning).
Re: [PATCH] git: add -N as a short option for --no-pager
Johannes Sixt writes: > In modern setups, less, the pager, uses alternate screen to show > the content. When it is closed, it switches back to the original > screen, and all content is gone. > > It is not uncommon to request that the output remains visible in > the terminal. For this, the option --no-pager can be used. But > it is a bit cumbersome to type, even when command completion is > available. Provide a short option, -N, to make the option easier > accessible. > > Signed-off-by: Johannes Sixt > --- Heh, I used to append "|cat", which is four keystrokes that is a bit shorter than " --no-pager", but that is only acceptable when you do not care about colored output ;-) I am not absolutely certain about the choice of a single letter. I already checked we do not use "git -N cmd" for anything else right now, so I am certain about the availability, but I am not sure if capital 'N' is the best choice, when the other side is lowercase 'p' (and more importantly, the other side 'p' has mneomonic value for 'pagination', but 'N' merely stands for 'no' and could be negating anything, not related to pagination). But I agree that a short-hand would be welcome. > diff --git a/Documentation/git.txt b/Documentation/git.txt > index 4767860e72..17b50b0dc6 100644 > --- a/Documentation/git.txt > +++ b/Documentation/git.txt > @@ -11,7 +11,7 @@ SYNOPSIS > [verse] > 'git' [--version] [--help] [-C ] [-c =] > [--exec-path[=]] [--html-path] [--man-path] [--info-path] > -[-p|--paginate|--no-pager] [--no-replace-objects] [--bare] > +[-p|--paginate|-N|--no-pager] [--no-replace-objects] [--bare] > [--git-dir=] [--work-tree=] [--namespace=] > [--super-prefix=] > [] > @@ -103,6 +103,7 @@ foo.bar= ...`) sets `foo.bar` to the empty string which > `git config > configuration options (see the "Configuration Mechanism" section > below). > > +-N:: > --no-pager:: > Do not pipe Git output into a pager.
Re: [PATCH v1 2/2] merge: Add merge.aggressive config setting
Ben Peart writes: > That said, it makes sense to me to do >> this when rename detection is turned off. In fact, I think you'd >> automatically want to set aggressive to true whenever rename detection >> is turned off (whether by your merge.renames option or the >> -Xno-renames flag). >> ... > > While combining them would work for our specific use scenario (since > we turn both on already along with turning off merge.stat), I really > hesitate to tie these two different flags and code paths together with > a single config setting. The cases that non-agressive variant leaves unmerged are not auto-resolved only because marking them as merged will rob the chance from the rename detection logic to notice which ones are "new" paths that could be matched with "deleted" ones to turn into renames. If rename deteciton is not done, there is no reason to leave it non aggressive, as "#1 = missing, #2 = something and #3 = missing" entry (just one example that is not auto-resolved by non-agressive, but the principle is the same) left unmerged in the index will get resolved to keep the current entry by the post processing logic anyway. In fact, checking git-merge-resolve would tell us that we already use "aggresive" variant there unconditionally. So, I think Elijah is correct---there is no reason not to enable this setting when the other one to refuse rename detection is in effect.
Re: [PATCH 18/41] index-pack: abstract away hash function constant
On Tue, Apr 24, 2018 at 11:50:16AM +0200, Martin Ågren wrote: > On 24 April 2018 at 01:39, brian m. carlson > wrote: > > The code for reading certain pack v2 offsets had a hard-coded 5 > > representing the number of uint32_t words that we needed to skip over. > > Specify this value in terms of a value from the_hash_algo. > > > > Signed-off-by: brian m. carlson > > --- > > builtin/index-pack.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > > index d81473e722..c1f94a7da6 100644 > > --- a/builtin/index-pack.c > > +++ b/builtin/index-pack.c > > @@ -1543,12 +1543,13 @@ static void read_v2_anomalous_offsets(struct > > packed_git *p, > > { > > const uint32_t *idx1, *idx2; > > uint32_t i; > > + const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t); > > Should we round up? Or just what should we do if a length is not > divisible by 4? (I am not aware of any such hash functions, but one > could exist for all I know.) Another question is whether such an > index-pack v2 will ever contain non-SHA-1 oids to begin with. I can't > find anything suggesting that it could, but this is unfamiliar code to > me. I opted not to simply because I know that our current hash is 20 bytes and the new one will be 32, and I know those are both divisible by 4. I feel confident that any future hash we choose will also be divisible by 4, and the code is going to be complicated if it isn't. I agree that pack v2 is not going to have anything but SHA-1. However, writing all the code such that it's algorithm agnostic means that we can do testing of new algorithms by wholesale replacing the algorithm with a new one, which simplifies things considerably. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: Fetching tags overwrites existing tags
On Tue, Apr 24, 2018 at 12:57 PM, Wink Saville wrote: > If have a repository with a tag "v1.0.0" and I add a remote repository > which also has a tag "v1.0.0" tag is overwritten. > > Google found [1] from 2011 and option 3 is what I'd like to see. Has it been > implemented and I just don't see it? > > [1]: https://groups.google.com/forum/#!topic/git-version-control/0l_rJFyTE60 > > > Here is an example demonstrating what I see: > > $ echo abc > abc.txt > $ git init . > Initialized empty Git repository in > /home/wink/prgs/git/investigate-fetch-tags/.git/ > $ git add * > $ git commit -m "Initial commit" > [master (root-commit) 1116fdc] Initial commit > 1 file changed, 1 insertion(+) > create mode 100644 abc.txt > $ git tag v1.0.0 > $ git remote add gbenchmark g...@github.com:google/benchmark > $ git log --graph --format="%h %s %d" > * 1116fdc Initial commit (HEAD -> master, tag: v1.0.0) > $ git fetch --tags gbenchmark > warning: no common commits > remote: Counting objects: 4400, done. > remote: Compressing objects: 100% (15/15), done. > remote: Total 4400 (delta 5), reused 5 (delta 3), pack-reused 4382 > Receiving objects: 100% (4400/4400), 1.33 MiB | 2.81 MiB/s, done. > Resolving deltas: 100% (2863/2863), done. > From github.com:google/benchmark > * [new branch] clangtidy -> gbenchmark/clangtidy > * [new branch] iter_report -> gbenchmark/iter_report > * [new branch] master -> gbenchmark/master > * [new branch] releasing -> gbenchmark/releasing > * [new branch] reportercleanup -> gbenchmark/reportercleanup > * [new branch] rmheaders -> gbenchmark/rmheaders > * [new branch] v2 -> gbenchmark/v2 > * [new tag] v0.0.9 -> v0.0.9 > * [new tag] v0.1.0 -> v0.1.0 > t [tag update] v1.0.0 -> v1.0.0 > * [new tag] v1.1.0 -> v1.1.0 > * [new tag] v1.2.0 -> v1.2.0 > * [new tag] v1.3.0 -> v1.3.0 > * [new tag] v1.4.0 -> v1.4.0 > $ git log --graph --format="%h %s %d" > * 1116fdc Initial commit (HEAD -> master) > > As you can see the tag on 1116fdc is gone, v1.0.0 tag has been updated > and now its pointing to the tag in gbenchmark: > > $ git log -5 --graph --format="%h %s %d" v1.0.0 > * cd525ae Merge pull request #171 from eliben/update-doc-userealtime > (tag: v1.0.0) > |\ > | * c7ab1b9 Update README to mention UseRealTime for wallclock time > measurements. > |/ > * f662e8b Rename OS_MACOSX macro to new name BENCHMARK_OS_MACOSX. Fix #169 > * 0a1f484 Merge pull request #166 from disconnect3d/master > |\ > | * d2917bc Fixes #165: CustomArguments ret type in README > |/ > > Ideally I would have liked the tags fetched from gbenchmark to have a prefix > of gbenchmark/, like the branches have, maybe something like: > That would require a complete redesign of how we handle remotes. I've proposed ideas in the past but never had time and they didn't gain much traction. It's a known limitation that the tags namespace can only hold a single tag name (even if remotes differ in what that tag is). I *thought* that the tags should not be updated after you fetch it once, but it seems this is not the behavior we get now? My basic idea was to fetch *all* remote refs into a refs///* such that *every* ref in a remote can be determined by something like "refs/tracking/origin/tags/name" instead of "refs/remotes/origin/name", and then tags would have to be updated to check for tags in each remote as well as locally. Additionally, you could update the tool to warn when two remotes have the same tag at different refs, and allow disambiguation. Ideally, "origin/branch" should still DWIM, same for "tag" should work unless there are conflicts. Unfortunately, it's a pretty big change in how remotes are handled, and I never had time to actually work towards a POC or implementation. Mostly, we ended up on bikeshedding what the name should be now that we can't use "refs/remotes" due to backwards compatibility. I don't really like "tracking" as a name, but it was the best I could come up with. (Note, the impetus for this proposal was actually to allow easy sharing of notes and other specialized refs). Thanks, Jake > $ git fetch --tags gbenchmark > ... > * [new branch] v2 -> gbenchmark/v2 > * [new tag] v0.0.9 -> gbenchmark/v0.0.9 > * [new tag] v0.1.0 -> gbenchmark/v0.1.0 > * [new tag] v1.0.0 -> gbenchmark/v1.0.0 > * [new tag] v1.1.0 -> gbenchmark/v1.1.0 > * [new tag] v1.2.0 -> gbenchmark/v1.2.0 > * [new tag] v1.3.0 -> gbenchmark/v1.3.0 > * [new tag] v1.4.0 -> gbenchmark/v1.4.0 > > > -- Wink
[PATCH v3] Make running git under other debugger-like programs easy
This allows us to run git, when using the script from bin-wrappers, under other programs. A few examples for usage within testsuite scripts: debug git checkout master debug --debugger=nemiver git $ARGS debug -d "valgrind --tool-memcheck --track-origins=yes" git $ARGS Or, if someone has bin-wrappers/ in their $PATH and is executing git outside the testsuite: GIT_DEBUGGER="gdb --args" git $ARGS GIT_DEBUGGER=nemiver git $ARGS GIT_DEBUGGER="valgrind --tool=memcheck --track-origins=yes" git $ARGS There is also a handy shortcut of GIT_DEBUGGER=1 meaning the same as GIT_DEBUGGER="gdb --args" Original-patch-by: Johannes Schindelin Signed-off-by: Elijah Newren --- Finally getting back to this now that I have tied up all loose ends with the directory rename detection series (or at least I hope they are all tied up). There is only one change since v2: s/DBG_FLAGS/GIT_DEBUGGER/, as suggested by Dscho t/test-lib-functions.sh | 24 wrap-for-bin.sh | 19 +-- 2 files changed, 33 insertions(+), 10 deletions(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index b895366fee..a407b09b48 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -145,12 +145,28 @@ test_pause () { "$SHELL_PATH" <&6 >&5 2>&7 } -# Wrap git in gdb. Adding this to a command can make it easier to -# understand what is going on in a failing test. +# Wrap git with a debugger. Adding this to a command can make it easier +# to understand what is going on in a failing test. # -# Example: "debug git checkout master". +# Examples: +# debug git checkout master +# debug --debugger=nemiver git $ARGS +# debug -d "valgrind --tool=memcheck --track-origins=yes" git $ARGS debug () { -GIT_TEST_GDB=1 "$@" <&6 >&5 2>&7 + case "$1" in + -d) + GIT_DEBUGGER="$2" && + shift 2 + ;; + --debugger=*) + GIT_DEBUGGER="${1#*=}" && + shift 1 + ;; + *) + GIT_DEBUGGER=1 + ;; + esac && + GIT_DEBUGGER="${GIT_DEBUGGER}" "$@" <&6 >&5 2>&7 } # Call test_commit with the arguments diff --git a/wrap-for-bin.sh b/wrap-for-bin.sh index 5842408817..95851b85b6 100644 --- a/wrap-for-bin.sh +++ b/wrap-for-bin.sh @@ -20,10 +20,17 @@ PATH='@@BUILD_DIR@@/bin-wrappers:'"$PATH" export GIT_EXEC_PATH GITPERLLIB PATH GIT_TEXTDOMAINDIR -if test -n "$GIT_TEST_GDB" -then - unset GIT_TEST_GDB - exec gdb --args "${GIT_EXEC_PATH}/@@PROG@@" "$@" -else +case "$GIT_DEBUGGER" in +'') exec "${GIT_EXEC_PATH}/@@PROG@@" "$@" -fi + ;; +1) + unset GIT_DEBUGGER + exec gdb --args "${GIT_EXEC_PATH}/@@PROG@@" "$@" + ;; +*) + GIT_DEBUGGER_ARGS="$GIT_DEBUGGER" + unset GIT_DEBUGGER + exec ${GIT_DEBUGGER_ARGS} "${GIT_EXEC_PATH}/@@PROG@@" "$@" + ;; +esac -- 2.17.0.2.g31fed8301b
Re: [PATCH 21/41] http: eliminate hard-coded constants
Martin Ågren writes: >> switch (data[i]) { >> case 'P': >> i++; >> - if (i + 52 <= buf.len && >> + if (i + hexsz + 12 <= buf.len && >> starts_with(data + i, " pack-") && >> - starts_with(data + i + 46, ".pack\n")) { >> - get_sha1_hex(data + i + 6, sha1); >> - fetch_and_setup_pack_index(packs_head, sha1, >> + starts_with(data + i + hexsz + 6, ".pack\n")) { >> + get_sha1_hex(data + i + 6, hash); >> + fetch_and_setup_pack_index(packs_head, hash, >> base_url); >> i += 51; > > s/51/hexsz + 11/ ? Quite right.
Re: [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option
Replied to Jonathan only instead of all. My reply is below: On Tue, Apr 24, 2018 at 3:55 PM, Stefan Beller wrote: > On Tue, Apr 24, 2018 at 3:35 PM, Jonathan Tan > wrote: >> On Tue, 24 Apr 2018 14:03:30 -0700 >> Stefan Beller wrote: >> >>> +--color-moved-[no-]ignore-space-prefix-delta:: >>> + Ignores whitespace when comparing lines when performing the move >>> + detection for --color-moved. This ignores uniform differences >>> + of white space at the beginning lines in moved blocks. >> >> Setting this option means that moved lines may be indented or dedented, >> and if they have been indented or dedented by the same amount, they are >> still considered to be the same block. Maybe call this >> --color-moved-allow-indentation-change. > > ok, sounds good as well. I tried coming up with a name that refers to > the block check as that is the important part. > >>> +struct ws_delta { >>> + char *string; /* The prefix delta, which is the same in the block */ >> >> Probably better described as "the difference between the '-' line and >> the '+' line". This may be the empty string if there is no difference. > > Makes sense. > >> >>> + int direction; /* adding or removing the line? */ >> >> What is the value when "added" and what when "removed"? Also, it is not >> truly "added" or "removed", so a better way might be: 1 if the '-' line >> is longer than the '+' line, and 0 otherwise. (And make sure that the >> documentation is correct with respect to equal lines.) >> >>> + int missmatch; /* in the remainder */ >> >> s/missmatch/mismatch/ >> Also, what is this used for? > > The mismatch should be used for (thanks for catching!) > checking if the remainder of a line is the same, although a boolean > may be not the correct choice. We know that the two strings > passed into compute_ws_delta come from a complete white space > agnostic comparison, so consider: > > + SP SP more TAB more > + SP SP text TAB text > > - SP more TAB more > - SP text TAB text > > which would mark this as a moved block. This is the feature > working as intended, but what about > > + SP SP more TAB more > + SP SP text TAB text > > - SP more SP more > - SP text TAB text > > Note how the length of the strings is the same, hence the current > code of > > compute_ws_delta(...) { > int d = longer->len - shorter->len; > out->string = xmemdupz(longer->line, d); > } > > would work fine and not notice the "change in the remainder". > That ought to be caught by the mismatch variable, that > is assigned, but not used. > > The compare_ws_delta(a, b) needs to be extended to > > !a->mismatch && !b->mismatch && existing_condition > > Ideally the example from above would be different depending > on whether the other white space flags are given or not. > >>> + if (diffopt->color_moved_ws_handling & COLOR_MOVED_DELTA_WHITESPACES) >>> + /* >>> + * As there is not specific white space config given, >>> + * we'd need to check for a new block, so ignore all >>> + * white space. The setup of the white space >>> + * configuration for the next block is done else where >>> + */ >>> + flags |= XDF_IGNORE_WHITESPACE; >>> + >>> return !xdiff_compare_lines(a->es->line, a->es->len, >>> b->es->line, b->es->len, >>> flags); >> >> I wrote in [1]: >> >> I think we should just prohibit combining this with any of the >> whitespace ignoring flags except for the space-at-eol one. They seem >> to contradict anyway. > > As outlined above, I think there are corner cases in which they do not > contradict. So I think the COLOR_MOVED_DELTA_WHITESPACES > will go into its own variable, and then we can solve the corner cases > correctly. > >> To elaborate, adding a specific flag that may interfere with other >> user-provided flags sounds like we're unnecessarily adding combinations >> that we must keep track of, and that it's both logical (from a user's >> point of view) and clearer (as for the code) to just forbid such >> combinations. > > Yes, I think you mentioned this before. Thanks for reminding! > >>> + struct ws_delta *wsd = NULL; /* white space deltas between pmb */ >>> + int wsd_alloc = 0; >>> + >>> + int n, flipped_block = 1, block_length = 0; >> >> It seems like pmb and wsd are parallel arrays - could each wsd be >> embedded into the corresponding entry of pmb instead? > > I'll explore that. It sounds like a good idea for code hygiene. > Although if you do not intend to use this feature, then keeping it separate > would allow for a smaller footprint in memory. > >> >>> --- a/diff.h >>> +++ b/diff.h >>> @@ -214,6 +214,8 @@ struct diff_options { >>> } color_moved; >>> #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA >>> #define COLOR_MOVED_MIN_ALNUM_COUNT 20 >>> + /* XDF_WHITESPACE_FLAGS regarding block detection are set at 2, 3, 4
Re: [PATCH 5/7] diff.c: add a blocks mode for moved code detection
On Tue, 24 Apr 2018 15:37:51 -0700 Stefan Beller wrote: > These can be combined independently, so would > you expect the user to expect two options for them? > For example "--color-moved=zebra" could be split > into "--skip-small --alternate-blocks" Yes, this is a good explanation. Reusing your terms below, --skip-small controls the algorithm, and --alternate-blocks controls the presentation layer. > So instead of building blocks we rather want to split into algorithms > and presentation layer? > > The presentation layer would be things like: > * use a different color for moved things > * alternate colors for adjacent blocks > * paint border of a block (dimmed zebra) > > The algorithm side would be > * detect moves > * detect moves as blocks > * skip small heuristic Yes. This was just brainstorming, though - this might not be the direction we want to take in this patch. (The right solution might just be to always use blocks - thereby simplifying the algorithm aspect.)
Re: [PATCHv2 0/7] Moved code detection: ignore space on uniform indentation
On Tue, Apr 24, 2018 at 3:37 PM, Jonathan Tan wrote: > On Tue, 24 Apr 2018 14:03:23 -0700 > Stefan Beller wrote: > >> v2: >> I think I have addressed Jonathans feedback >> * by using a string instead of counting the first character only. >> * refined tests slightly (easier to read) >> * moved white space handling for moved blocks into its own flag field, >> keeping the enum for the actual mode of move detection. > > For reference, v1 is here: > https://public-inbox.org/git/20180402224854.86922-1-sbel...@google.com/ > >> Stefan Beller (7): >> xdiff/xdiff.h: remove unused flags >> xdiff/xdiffi.c: remove unneeded function declarations >> diff.c: do not pass diff options as keydata to hashmap >> diff.c: adjust hash function signature to match hashmap expectation >> diff.c: add a blocks mode for moved code detection >> diff.c: decouple white space treatment from move detection algorithm >> diff.c: add --color-moved-ignore-space-delta option > > I'm not sure if we should add a new "blocks" mode, or if we should > modify the existing plain mode to have the minimum block length instead. > I reviewed the code as if we want the new "blocks" mode. Thanks for the review! I think keeping plain is useful, see 176841f0c9 (diff.c: color moved lines differently, plain mode, 2017-06-30) diff.c: color moved lines differently, plain mode Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Although if we do not care about that use case we can just add heuristics to plain. As eluded to in Ævars email, we might want to break it up into multiple options as well?
Re: [PATCHv2 9/9] cache.h: allow oid_object_info to handle arbitrary repositories
On Tue, 24 Apr 2018 14:59:09 -0700 Stefan Beller wrote: > This involves also adapting oid_object_info_extended and a some > internal functions that are used to implement these. It all has to > happen in one patch, because of a single recursive chain of calls visits > all these functions. I wrote about delta_base_cache in a reply [1] to an earlier version, which is indeed safe (as discussed), but I think that other reviewers might have questions about that too so I think it's worth noting that in the commit message. Maybe write something like: Among the functions modified to handle arbitrary repositories, unpack_entry() is one of them. Note that it still references the globals "delta_base_cache" and "delta_base_cached", but those are safe to be referenced (the former is indexed partly by "struct packed_git *", which is repo-specific, and the latter is only used to limit the size of the former as an optimization). [1] https://public-inbox.org/git/20180424112332.38c0d04d96689f030e968...@google.com/ > sha1_object_info_extended is also used in partial clones, which allow > fetching missing objects. As this series will not add the repository > struct to the transport code and fetch_object(), add a TODO note and > bug out if a user tries to use a partial clone in a repository other than > the_repository. s/sha1_object/oid_object/ (in the 2nd paragraph) Also, you sent 2 versions of PATCHv2 9/9. > @@ -1290,9 +1291,12 @@ int oid_object_info_extended_the_repository(const > struct object_id *oid, struct > if (fetch_if_missing && repository_format_partial_clone && > !already_retried) { > /* > - * TODO Investigate haveing fetch_object() return > + * TODO Investigate having fetch_object() return >* TODO error/success and stopping the music here. > + * TODO Pass a repository struct through fetch_object. >*/ > + if (r != the_repository) > + die(_("partial clones only supported in > the_repository")); > fetch_object(repository_format_partial_clone, > real->hash); > already_retried = 1; > continue; This most likely means that a partial clone with a submodule would wrongly error out here. Instead, the "r == the_repository" check should be done in the same "if" statement as repository_format_partial_clone (and no "die"-ing occurs if it fails - just that there will be no fetching of objects).
Re: [PATCH 5/7] diff.c: add a blocks mode for moved code detection
On Tue, Apr 24, 2018 at 2:50 PM, Jonathan Tan wrote: > On Tue, 24 Apr 2018 14:03:28 -0700 > Stefan Beller wrote: > >> Suggested-by: Ævar Arnfjörð Bjarmason >> (https://public-inbox.org/git/87o9j0uljo@evledraar.gmail.com/) >> Signed-off-by: Stefan Beller > > Firstly, I don't know if this is the right solution- as written > in the linked e-mail [1], the issue might be more that the config > conflates 2 unrelated things, not that a certain intersection is > missing. The "plain zebra" or as I call them "blocks", has the "heuristic for a minimum of 20 characters" and "few colors" as its defining features, which solves that use case. Stepping back a bit, we have different "building blocks" at our disposal: * move detection by line or block * alternating blocks * a heuristic to skip over small chunks (20 alnum chars) These can be combined independently, so would you expect the user to expect two options for them? For example "--color-moved=zebra" could be split into "--skip-small --alternate-blocks" Eventually we'll use various colors to inform the user what these building blocks made of the diff. Ævar wrote: > Which is what I mean by the current config conflating two (to me) > unrelated things. One is how we, via any method, detect what's moved or > not, and the other is what color/format we use to present this to the > user. So instead of building blocks we rather want to split into algorithms and presentation layer? The presentation layer would be things like: * use a different color for moved things * alternate colors for adjacent blocks * paint border of a block (dimmed zebra) The algorithm side would be * detect moves * detect moves as blocks * skip small heuristic Am I still missing the big picture? > [1] https://public-inbox.org/git/87muykuij7@evledraar.gmail.com/ > > Optional: Probably better to put the link inline, instead of in the > trailer. ok. > >> -test_expect_success 'detect permutations inside moved code -- dimmed_zebra' >> ' >> +test_expect_success 'detect blocks of moved code' ' >> git reset --hard && >> cat <<-\EOF >lines.txt && >> long line 1 >> @@ -1271,6 +1271,52 @@ test_expect_success 'detect permutations inside moved >> code -- dimmed_zebra' ' >> test_config color.diff.newMovedDimmed "normal cyan" && >> test_config color.diff.oldMovedAlternativeDimmed "normal blue" && >> test_config color.diff.newMovedAlternativeDimmed "normal yellow" && > > Add a comment here explaining that these colors do not appear in the > output, but merely set to recognizable values to ensure that they do not > appear in the output. ok.
Re: [PATCHv2 0/7] Moved code detection: ignore space on uniform indentation
On Tue, 24 Apr 2018 14:03:23 -0700 Stefan Beller wrote: > v2: > I think I have addressed Jonathans feedback > * by using a string instead of counting the first character only. > * refined tests slightly (easier to read) > * moved white space handling for moved blocks into its own flag field, > keeping the enum for the actual mode of move detection. For reference, v1 is here: https://public-inbox.org/git/20180402224854.86922-1-sbel...@google.com/ > Stefan Beller (7): > xdiff/xdiff.h: remove unused flags > xdiff/xdiffi.c: remove unneeded function declarations > diff.c: do not pass diff options as keydata to hashmap > diff.c: adjust hash function signature to match hashmap expectation > diff.c: add a blocks mode for moved code detection > diff.c: decouple white space treatment from move detection algorithm > diff.c: add --color-moved-ignore-space-delta option I'm not sure if we should add a new "blocks" mode, or if we should modify the existing plain mode to have the minimum block length instead. I reviewed the code as if we want the new "blocks" mode.
Re: [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option
On Tue, 24 Apr 2018 14:03:30 -0700 Stefan Beller wrote: > +--color-moved-[no-]ignore-space-prefix-delta:: > + Ignores whitespace when comparing lines when performing the move > + detection for --color-moved. This ignores uniform differences > + of white space at the beginning lines in moved blocks. Setting this option means that moved lines may be indented or dedented, and if they have been indented or dedented by the same amount, they are still considered to be the same block. Maybe call this --color-moved-allow-indentation-change. > +struct ws_delta { > + char *string; /* The prefix delta, which is the same in the block */ Probably better described as "the difference between the '-' line and the '+' line". This may be the empty string if there is no difference. > + int direction; /* adding or removing the line? */ What is the value when "added" and what when "removed"? Also, it is not truly "added" or "removed", so a better way might be: 1 if the '-' line is longer than the '+' line, and 0 otherwise. (And make sure that the documentation is correct with respect to equal lines.) > + int missmatch; /* in the remainder */ s/missmatch/mismatch/ Also, what is this used for? > + if (diffopt->color_moved_ws_handling & COLOR_MOVED_DELTA_WHITESPACES) > + /* > + * As there is not specific white space config given, > + * we'd need to check for a new block, so ignore all > + * white space. The setup of the white space > + * configuration for the next block is done else where > + */ > + flags |= XDF_IGNORE_WHITESPACE; > + > return !xdiff_compare_lines(a->es->line, a->es->len, > b->es->line, b->es->len, > flags); I wrote in [1]: I think we should just prohibit combining this with any of the whitespace ignoring flags except for the space-at-eol one. They seem to contradict anyway. To elaborate, adding a specific flag that may interfere with other user-provided flags sounds like we're unnecessarily adding combinations that we must keep track of, and that it's both logical (from a user's point of view) and clearer (as for the code) to just forbid such combinations. [1] https://public-inbox.org/git/20180402174118.d204ec0d4b9d2fa7ebd77...@google.com/ > struct moved_entry **pmb = NULL; /* potentially moved blocks */ > int pmb_nr = 0, pmb_alloc = 0; > - int n, flipped_block = 1, block_length = 0; > > + struct ws_delta *wsd = NULL; /* white space deltas between pmb */ > + int wsd_alloc = 0; > + > + int n, flipped_block = 1, block_length = 0; It seems like pmb and wsd are parallel arrays - could each wsd be embedded into the corresponding entry of pmb instead? > --- a/diff.h > +++ b/diff.h > @@ -214,6 +214,8 @@ struct diff_options { > } color_moved; > #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA > #define COLOR_MOVED_MIN_ALNUM_COUNT 20 > + /* XDF_WHITESPACE_FLAGS regarding block detection are set at 2, 3, 4 */ > + #define COLOR_MOVED_DELTA_WHITESPACES (1 << 22) > int color_moved_ws_handling; > }; Setting of DELTA_WHITESPACES should be a separate field, not as part of ws_handling. It's fine for the ws_handling to be a bitset, since that's how it's passed to xdiff_compare_lines(), but we don't need to do the same for DELTA_WHITESPACES. > +test_expect_success 'compare whitespace delta across moved blocks' ' > + > + git reset --hard && > + q_to_tab <<-\EOF >text.txt && > + QIndented > + QText across > + Qthree lines > + QBut! <- this stands out > + Qthis one > + QQline did > + Qnot adjust > + EOF > + > + git add text.txt && > + git commit -m "add text.txt" && > + > + q_to_tab <<-\EOF >text.txt && > + QQIndented > + QQText across > + QQthree lines > + QQQBut! <- this stands out > + this one > + Qline did > + not adjust > + EOF > + > + git diff --color --color-moved-ignore-space-prefix-delta | > + grep -v "index" | > + test_decode_color >actual && > + > + q_to_tab <<-\EOF >expected && > + diff --git a/text.txt b/text.txt > + --- a/text.txt > + +++ b/text.txt > + @@ -1,7 +1,7 @@ > + -QIndented > + -QText across > + -Qthree lines > + -QBut! <- this stands out > + -Qthis one > + -QQline did > + -Qnot adjust > + +QQIndented > + +QQText across > + +QQthree lines > + +QQQBut! <- this stands out > + +this one > + +Qline did > + +not adjust > + EOF > + > + test_cmp expected actual > +' I would have expected every line except the "this stands out" line to be colored differently than the usual RED and GREEN. Is this test output ex
Re: [PATCH 6/7] diff.c: decouple white space treatment from move detection algorithm
Hi Jonathan, On Tue, Apr 24, 2018 at 3:00 PM, Jonathan Tan wrote: > On Tue, 24 Apr 2018 14:03:29 -0700 > Stefan Beller wrote: > >> As we change the default, we'll adjust the tests. > > This statement is probably better written as: > > In some existing tests, options like --ignore-space-at-eol were used > to control the color of the output. They have been updated to use > options like --color-moved-ignore-space-at-eol instead. I'll adjust that statement; thanks for helping me out with good commit messages (even the "As we change the defaults, .." was proposed by you in a previous round) > >> + unsigned flags = diffopt->color_moved_ws_handling >> + & XDF_WHITESPACE_FLAGS; > > No need for "& XDF_WHITESPACE_FLAGS". This is in anticipation of the next commit, when color_moved_ws_handling takes more flags. I can move that over to the last commit. > >> + unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; > > Same here. Maybe I'll just state in the commit message why we keep the masking here. > >> @@ -214,6 +214,7 @@ struct diff_options { >> } color_moved; >> #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA >> #define COLOR_MOVED_MIN_ALNUM_COUNT 20 >> + int color_moved_ws_handling; >> }; > > Should the "int" be "unsigned"? yes. > I noticed that the flag-like xdl_opts is > signed, but I think it's better for flags to be unsigned. I can change those, too. > Also, document > what this stores. ok, will document. > (And also, I would limit the bits.) Not sure I follow. you want to make it e.g. unsigned color_moved_ws_handling : 6; ? Oh, that would actually work, as XDF_WHITESPACE_FLAGS are in second to fifth bits. But then we need to document why the XDF_WHITESPACE need to be at these low positions. >> + q_to_tab <<-\EOF >text.txt && >> + Qa long line to exceed per-line minimum >> + Qanother long line to exceed per-line minimum >> + new file > > I know I suggested "per-line minimum", but I don't think there is one - > I think we only have a per-block minimum. Maybe delete "per-line" in > each of the lines. yeah, I guess this heuristic could also make for another setting, though as of now I did not desire any other heuristic than you originally came up with. Will reword the text. Thanks! Thanks, Stefan
Re: [PATCH 6/7] diff.c: decouple white space treatment from move detection algorithm
On Tue, 24 Apr 2018 14:03:29 -0700 Stefan Beller wrote: > As we change the default, we'll adjust the tests. This statement is probably better written as: In some existing tests, options like --ignore-space-at-eol were used to control the color of the output. They have been updated to use options like --color-moved-ignore-space-at-eol instead. > + unsigned flags = diffopt->color_moved_ws_handling > + & XDF_WHITESPACE_FLAGS; No need for "& XDF_WHITESPACE_FLAGS". > + unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; Same here. > @@ -214,6 +214,7 @@ struct diff_options { > } color_moved; > #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA > #define COLOR_MOVED_MIN_ALNUM_COUNT 20 > + int color_moved_ws_handling; > }; Should the "int" be "unsigned"? I noticed that the flag-like xdl_opts is signed, but I think it's better for flags to be unsigned. Also, document what this stores. (And also, I would limit the bits.) > +test_expect_success 'only move detection ignores white spaces' ' > + git reset --hard && > + q_to_tab <<-\EOF >text.txt && > + a long line to exceed per-line minimum > + another long line to exceed per-line minimum > + original file > + EOF > + git add text.txt && > + git commit -m "add text" && > + q_to_tab <<-\EOF >text.txt && > + Qa long line to exceed per-line minimum > + Qanother long line to exceed per-line minimum > + new file > + EOF > + > + # Make sure we get a different diff using -w > + git diff --color --color-moved -w \ > + --color-moved-no-ignore-all-space \ > + --color-moved-no-ignore-space-change \ > + --color-moved-no-ignore-space-at-eol | > + grep -v "index" | > + test_decode_color >actual && > + q_to_tab <<-\EOF >expected && > + diff --git a/text.txt b/text.txt > + --- a/text.txt > + +++ b/text.txt > + @@ -1,3 +1,3 @@ > + Qa long line to exceed per-line minimum > + Qanother long line to exceed per-line minimum > + -original file > + +new file > + EOF > + test_cmp expected actual && > + > + # And now ignoring white space only in the move detection > + git diff --color --color-moved \ > + --color-moved-ignore-all-space \ > + --color-moved-ignore-space-change \ > + --color-moved-ignore-space-at-eol | > + grep -v "index" | > + test_decode_color >actual && > + q_to_tab <<-\EOF >expected && > + diff --git a/text.txt b/text.txt > + --- a/text.txt > + +++ b/text.txt > + @@ -1,3 +1,3 @@ > + -a long line to exceed per-line minimum > + -another long line to exceed per-line minimum > + -original file > + +Qa long line to exceed per-line > minimum > + +Qanother long line to exceed per-line > minimum > + +new file > + EOF > + test_cmp expected actual > ' I know I suggested "per-line minimum", but I don't think there is one - I think we only have a per-block minimum. Maybe delete "per-line" in each of the lines.
[PATCHv2 9/9] cache.h: allow sha1_object_info to handle arbitrary repositories
This involves also adapting sha1_object_info_extended and a some internal functions that are used to implement these. It all has to happen in one patch, because of a single recursive chain of calls visits all these functions. sha1_object_info_extended is also used in partial clones, which allow fetching missing objects. As this series will not add the repository struct to the transport code and fetch_object(), add a TODO note and bug out if a user tries to use a partial clone in a repository other than the_repository. Helped-by: Brandon Williams Helped-by: Jonathan Tan Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- cache.h | 9 - packfile.c | 58 ++--- packfile.h | 8 sha1_file.c | 30 --- 4 files changed, 54 insertions(+), 51 deletions(-) diff --git a/cache.h b/cache.h index 6340b2c572..3a4d80e92b 100644 --- a/cache.h +++ b/cache.h @@ -1192,8 +1192,7 @@ static inline void *read_object_file(const struct object_id *oid, enum object_ty } /* Read and unpack an object file into memory, write memory to an object file */ -#define oid_object_info(r, o, f) oid_object_info_##r(o, f) -int oid_object_info_the_repository(const struct object_id *, unsigned long *); +int oid_object_info(struct repository *r, const struct object_id *, unsigned long *); extern int hash_object_file(const void *buf, unsigned long len, const char *type, struct object_id *oid); @@ -1675,9 +1674,9 @@ struct object_info { /* Do not check loose object */ #define OBJECT_INFO_IGNORE_LOOSE 16 -#define oid_object_info_extended(r, oid, oi, flags) \ - oid_object_info_extended_##r(oid, oi, flags) -int oid_object_info_extended_the_repository(const struct object_id *, struct object_info *, unsigned flags); +int oid_object_info_extended(struct repository *r, +const struct object_id *, +struct object_info *, unsigned flags); /* * Set this to 0 to prevent sha1_object_info_extended() from fetching missing diff --git a/packfile.c b/packfile.c index 8de87f904b..55d383ed0a 100644 --- a/packfile.c +++ b/packfile.c @@ -1104,9 +1104,9 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p, return NULL; } -#define retry_bad_packed_offset(r, p, o) \ - retry_bad_packed_offset_##r(p, o) -static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t obj_offset) +static int retry_bad_packed_offset(struct repository *r, + struct packed_git *p, + off_t obj_offset) { int type; struct revindex_entry *revidx; @@ -1116,7 +1116,7 @@ static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t ob return OBJ_BAD; nth_packed_object_oid(&oid, p, revidx->nr); mark_bad_packed_object(p, oid.hash); - type = oid_object_info(the_repository, &oid, NULL); + type = oid_object_info(r, &oid, NULL); if (type <= OBJ_NONE) return OBJ_BAD; return type; @@ -1124,13 +1124,12 @@ static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t ob #define POI_STACK_PREALLOC 64 -#define packed_to_object_type(r, p, o, t, w, c) \ - packed_to_object_type_##r(p, o, t, w, c) -static enum object_type packed_to_object_type_the_repository(struct packed_git *p, -off_t obj_offset, -enum object_type type, -struct pack_window **w_curs, -off_t curpos) +static enum object_type packed_to_object_type(struct repository *r, + struct packed_git *p, + off_t obj_offset, + enum object_type type, + struct pack_window **w_curs, + off_t curpos) { off_t small_poi_stack[POI_STACK_PREALLOC]; off_t *poi_stack = small_poi_stack; @@ -1157,7 +1156,7 @@ static enum object_type packed_to_object_type_the_repository(struct packed_git * if (type <= OBJ_NONE) { /* If getting the base itself fails, we first * retry the base, otherwise unwind */ - type = retry_bad_packed_offset(the_repository, p, base_offset); + type = retry_bad_packed_offset(r, p, base_offset); if (type > OBJ_NONE) goto out; goto unwind; @@ -1185,7 +1184,7 @@ static enum object_type packed_to_object_type_the_repository(struct pac
[PATCHv2 7/9] packfile: add repository argument to unpack_entry
Add a repository argument to allow the callers of unpack_entry to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- fast-import.c | 2 +- pack-check.c | 3 ++- packfile.c| 7 --- packfile.h| 3 ++- 4 files changed, 9 insertions(+), 6 deletions(-) diff --git a/fast-import.c b/fast-import.c index afe06bd7c1..b009353e93 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1376,7 +1376,7 @@ static void *gfi_unpack_entry( */ p->pack_size = pack_size + the_hash_algo->rawsz; } - return unpack_entry(p, oe->idx.offset, &type, sizep); + return unpack_entry(the_repository, p, oe->idx.offset, &type, sizep); } static const char *get_mode(const char *str, uint16_t *modep) diff --git a/pack-check.c b/pack-check.c index 385d964bdd..d3a57df34f 100644 --- a/pack-check.c +++ b/pack-check.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "repository.h" #include "pack.h" #include "pack-revindex.h" #include "progress.h" @@ -134,7 +135,7 @@ static int verify_packfile(struct packed_git *p, data = NULL; data_valid = 0; } else { - data = unpack_entry(p, entries[i].offset, &type, &size); + data = unpack_entry(the_repository, p, entries[i].offset, &type, &size); data_valid = 1; } diff --git a/packfile.c b/packfile.c index 2876e04bb1..d5ac48ef18 100644 --- a/packfile.c +++ b/packfile.c @@ -1279,7 +1279,7 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset, ent = get_delta_base_cache_entry(p, base_offset); if (!ent) - return unpack_entry(p, base_offset, type, base_size); + return unpack_entry(the_repository, p, base_offset, type, base_size); if (type) *type = ent->type; @@ -1485,8 +1485,9 @@ static void *read_object_the_repository(const struct object_id *oid, return content; } -void *unpack_entry(struct packed_git *p, off_t obj_offset, - enum object_type *final_type, unsigned long *final_size) +void *unpack_entry_the_repository(struct packed_git *p, off_t obj_offset, + enum object_type *final_type, + unsigned long *final_size) { struct pack_window *w_curs = NULL; off_t curpos = obj_offset; diff --git a/packfile.h b/packfile.h index bc8d840b1b..1efa57a90e 100644 --- a/packfile.h +++ b/packfile.h @@ -115,7 +115,8 @@ extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n); extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *); extern int is_pack_valid(struct packed_git *); -extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); +#define unpack_entry(r, p, of, ot, s) unpack_entry_##r(p, of, ot, s) +extern void *unpack_entry_the_repository(struct packed_git *, off_t, enum object_type *, unsigned long *); extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 6/9] packfile: add repository argument to read_object
Add a repository argument to allow the callers of read_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- packfile.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/packfile.c b/packfile.c index 5fa7d27d3b..2876e04bb1 100644 --- a/packfile.c +++ b/packfile.c @@ -1469,8 +1469,10 @@ struct unpack_entry_stack_ent { unsigned long size; }; -static void *read_object(const struct object_id *oid, enum object_type *type, -unsigned long *size) +#define read_object(r, o, t, s) read_object_##r(o, t, s) +static void *read_object_the_repository(const struct object_id *oid, + enum object_type *type, + unsigned long *size) { struct object_info oi = OBJECT_INFO_INIT; void *content; @@ -1614,7 +1616,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, oid_to_hex(&base_oid), (uintmax_t)obj_offset, p->pack_name); mark_bad_packed_object(p, base_oid.hash); - base = read_object(&base_oid, &type, &base_size); + base = read_object(the_repository, &base_oid, &type, &base_size); external_base = base; } } -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 9/9] cache.h: allow oid_object_info to handle arbitrary repositories
This involves also adapting oid_object_info_extended and a some internal functions that are used to implement these. It all has to happen in one patch, because of a single recursive chain of calls visits all these functions. sha1_object_info_extended is also used in partial clones, which allow fetching missing objects. As this series will not add the repository struct to the transport code and fetch_object(), add a TODO note and bug out if a user tries to use a partial clone in a repository other than the_repository. Helped-by: Brandon Williams Helped-by: Jonathan Tan Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- cache.h | 9 - packfile.c | 58 ++--- packfile.h | 8 sha1_file.c | 30 --- 4 files changed, 54 insertions(+), 51 deletions(-) diff --git a/cache.h b/cache.h index 6340b2c572..3a4d80e92b 100644 --- a/cache.h +++ b/cache.h @@ -1192,8 +1192,7 @@ static inline void *read_object_file(const struct object_id *oid, enum object_ty } /* Read and unpack an object file into memory, write memory to an object file */ -#define oid_object_info(r, o, f) oid_object_info_##r(o, f) -int oid_object_info_the_repository(const struct object_id *, unsigned long *); +int oid_object_info(struct repository *r, const struct object_id *, unsigned long *); extern int hash_object_file(const void *buf, unsigned long len, const char *type, struct object_id *oid); @@ -1675,9 +1674,9 @@ struct object_info { /* Do not check loose object */ #define OBJECT_INFO_IGNORE_LOOSE 16 -#define oid_object_info_extended(r, oid, oi, flags) \ - oid_object_info_extended_##r(oid, oi, flags) -int oid_object_info_extended_the_repository(const struct object_id *, struct object_info *, unsigned flags); +int oid_object_info_extended(struct repository *r, +const struct object_id *, +struct object_info *, unsigned flags); /* * Set this to 0 to prevent sha1_object_info_extended() from fetching missing diff --git a/packfile.c b/packfile.c index 8de87f904b..55d383ed0a 100644 --- a/packfile.c +++ b/packfile.c @@ -1104,9 +1104,9 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p, return NULL; } -#define retry_bad_packed_offset(r, p, o) \ - retry_bad_packed_offset_##r(p, o) -static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t obj_offset) +static int retry_bad_packed_offset(struct repository *r, + struct packed_git *p, + off_t obj_offset) { int type; struct revindex_entry *revidx; @@ -1116,7 +1116,7 @@ static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t ob return OBJ_BAD; nth_packed_object_oid(&oid, p, revidx->nr); mark_bad_packed_object(p, oid.hash); - type = oid_object_info(the_repository, &oid, NULL); + type = oid_object_info(r, &oid, NULL); if (type <= OBJ_NONE) return OBJ_BAD; return type; @@ -1124,13 +1124,12 @@ static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t ob #define POI_STACK_PREALLOC 64 -#define packed_to_object_type(r, p, o, t, w, c) \ - packed_to_object_type_##r(p, o, t, w, c) -static enum object_type packed_to_object_type_the_repository(struct packed_git *p, -off_t obj_offset, -enum object_type type, -struct pack_window **w_curs, -off_t curpos) +static enum object_type packed_to_object_type(struct repository *r, + struct packed_git *p, + off_t obj_offset, + enum object_type type, + struct pack_window **w_curs, + off_t curpos) { off_t small_poi_stack[POI_STACK_PREALLOC]; off_t *poi_stack = small_poi_stack; @@ -1157,7 +1156,7 @@ static enum object_type packed_to_object_type_the_repository(struct packed_git * if (type <= OBJ_NONE) { /* If getting the base itself fails, we first * retry the base, otherwise unwind */ - type = retry_bad_packed_offset(the_repository, p, base_offset); + type = retry_bad_packed_offset(r, p, base_offset); if (type > OBJ_NONE) goto out; goto unwind; @@ -1185,7 +1184,7 @@ static enum object_type packed_to_object_type_the_repository(struct pack
[PATCHv2 8/9] packfile: add repository argument to cache_or_unpack_entry
Add a repository argument to allow the callers of cache_or_unpack_entry to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- packfile.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/packfile.c b/packfile.c index d5ac48ef18..8de87f904b 100644 --- a/packfile.c +++ b/packfile.c @@ -1272,7 +1272,8 @@ static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent) free(ent); } -static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset, +#define cache_or_unpack_entry(r, p, bo, bs, t) cache_or_unpack_entry_##r(p, bo, bs, t) +static void *cache_or_unpack_entry_the_repository(struct packed_git *p, off_t base_offset, unsigned long *base_size, enum object_type *type) { struct delta_base_cache_entry *ent; @@ -1346,7 +1347,7 @@ int packed_object_info_the_repository(struct packed_git *p, off_t obj_offset, * a "real" type later if the caller is interested. */ if (oi->contentp) { - *oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep, + *oi->contentp = cache_or_unpack_entry(the_repository, p, obj_offset, oi->sizep, &type); if (!*oi->contentp) type = OBJ_BAD; -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 5/9] packfile: add repository argument to packed_object_info
From: Jonathan Nieder Add a repository argument to allow callers of packed_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- builtin/pack-objects.c | 3 ++- packfile.c | 4 ++-- packfile.h | 3 ++- sha1_file.c| 2 +- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 8d4111f748..d65eb4a947 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1572,7 +1572,8 @@ static void drop_reused_delta(struct object_entry *entry) oi.sizep = &entry->size; oi.typep = &entry->type; - if (packed_object_info(entry->in_pack, entry->in_pack_offset, &oi) < 0) { + if (packed_object_info(the_repository, entry->in_pack, + entry->in_pack_offset, &oi) < 0) { /* * We failed to get the info from this pack for some reason; * fall back to sha1_object_info, which may find another copy. diff --git a/packfile.c b/packfile.c index 3ecfba66af..5fa7d27d3b 100644 --- a/packfile.c +++ b/packfile.c @@ -1333,8 +1333,8 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset, hashmap_add(&delta_base_cache, ent); } -int packed_object_info(struct packed_git *p, off_t obj_offset, - struct object_info *oi) +int packed_object_info_the_repository(struct packed_git *p, off_t obj_offset, + struct object_info *oi) { struct pack_window *w_curs = NULL; unsigned long size; diff --git a/packfile.h b/packfile.h index a92c0b241c..bc8d840b1b 100644 --- a/packfile.h +++ b/packfile.h @@ -125,7 +125,8 @@ extern void release_pack_memory(size_t); /* global flag to enable extra checks when accessing packed objects */ extern int do_check_packed_object_crc; -extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *); +#define packed_object_info(r, p, o, oi) packed_object_info_##r(p, o, oi) +extern int packed_object_info_the_repository(struct packed_git *pack, off_t offset, struct object_info *); extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1); diff --git a/sha1_file.c b/sha1_file.c index 93f25c6c6a..746ff8297a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1307,7 +1307,7 @@ int oid_object_info_extended_the_repository(const struct object_id *oid, struct * information below, so return early. */ return 0; - rtype = packed_object_info(e.p, e.offset, oi); + rtype = packed_object_info(the_repository, e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real->hash); return oid_object_info_extended(the_repository, real, oi, 0); -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 2/9] cache.h: add repository argument to oid_object_info
Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller --- archive-tar.c| 2 +- archive-zip.c| 3 ++- blame.c | 4 ++-- builtin/blame.c | 2 +- builtin/cat-file.c | 6 +++--- builtin/describe.c | 2 +- builtin/fast-export.c| 2 +- builtin/fetch.c | 2 +- builtin/fsck.c | 3 ++- builtin/index-pack.c | 4 ++-- builtin/ls-tree.c| 2 +- builtin/mktree.c | 2 +- builtin/pack-objects.c | 8 +--- builtin/prune.c | 3 ++- builtin/replace.c| 11 ++- builtin/tag.c| 4 ++-- builtin/unpack-objects.c | 2 +- cache.h | 3 ++- diff.c | 3 ++- fast-import.c| 14 +- list-objects-filter.c| 2 +- object.c | 2 +- pack-bitmap-write.c | 3 ++- packfile.c | 2 +- reachable.c | 2 +- refs.c | 2 +- remote.c | 2 +- sequencer.c | 3 ++- sha1_file.c | 4 ++-- sha1_name.c | 12 ++-- submodule.c | 2 +- tag.c| 2 +- 32 files changed, 67 insertions(+), 53 deletions(-) diff --git a/archive-tar.c b/archive-tar.c index 3563bcb9f2..f93409324f 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -276,7 +276,7 @@ static int write_tar_entry(struct archiver_args *args, memcpy(header.name, path, pathlen); if (S_ISREG(mode) && !args->convert && - oid_object_info(oid, &size) == OBJ_BLOB && + oid_object_info(the_repository, oid, &size) == OBJ_BLOB && size > big_file_threshold) buffer = NULL; else if (S_ISLNK(mode) || S_ISREG(mode)) { diff --git a/archive-zip.c b/archive-zip.c index 6b20bce4d1..74f3fe9103 100644 --- a/archive-zip.c +++ b/archive-zip.c @@ -325,7 +325,8 @@ static int write_zip_entry(struct archiver_args *args, compressed_size = 0; buffer = NULL; } else if (S_ISREG(mode) || S_ISLNK(mode)) { - enum object_type type = oid_object_info(oid, &size); + enum object_type type = oid_object_info(the_repository, oid, + &size); method = 0; attr2 = S_ISLNK(mode) ? ((mode | 0777) << 16) : diff --git a/blame.c b/blame.c index 78c9808bd1..dfa24473dc 100644 --- a/blame.c +++ b/blame.c @@ -81,7 +81,7 @@ static void verify_working_tree_path(struct commit *work_tree, const char *path) unsigned mode; if (!get_tree_entry(commit_oid, path, &blob_oid, &mode) && - oid_object_info(&blob_oid, NULL) == OBJ_BLOB) + oid_object_info(the_repository, &blob_oid, NULL) == OBJ_BLOB) return; } @@ -504,7 +504,7 @@ static int fill_blob_sha1_and_mode(struct blame_origin *origin) return 0; if (get_tree_entry(&origin->commit->object.oid, origin->path, &origin->blob_oid, &origin->mode)) goto error_out; - if (oid_object_info(&origin->blob_oid, NULL) != OBJ_BLOB) + if (oid_object_info(the_repository, &origin->blob_oid, NULL) != OBJ_BLOB) goto error_out; return 0; error_out: diff --git a/builtin/blame.c b/builtin/blame.c index db38c0b307..bfdf7cc132 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -655,7 +655,7 @@ static int is_a_rev(const char *name) if (get_oid(name, &oid)) return 0; - return OBJ_NONE < oid_object_info(&oid, NULL); + return OBJ_NONE < oid_object_info(the_repository, &oid, NULL); } int cmd_blame(int argc, const char **argv, const char *prefix) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 4ecdb9ff54..b8ecbea98e 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -116,7 +116,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, /* else fallthrough */ case 'p': - type = oid_object_info(&oid, NULL); + type = oid_object_info(the_repository, &oid, NULL); if (type < 0) die("Not a valid object name %s", obj_name); @@ -140,7 +140,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, case 0: if (type_from_string(exp_type) == OBJ_BLOB) { struct object_id blob_oid; - if (oid_object_info(&oid, NULL) == OBJ_TAG) { +
[PATCHv2 4/9] packfile: add repository argument to packed_to_object_type
Add a repository argument to allow the callers of packed_to_object_type to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- packfile.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/packfile.c b/packfile.c index d2b3f3503b..3ecfba66af 100644 --- a/packfile.c +++ b/packfile.c @@ -1124,11 +1124,13 @@ static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t ob #define POI_STACK_PREALLOC 64 -static enum object_type packed_to_object_type(struct packed_git *p, - off_t obj_offset, - enum object_type type, - struct pack_window **w_curs, - off_t curpos) +#define packed_to_object_type(r, p, o, t, w, c) \ + packed_to_object_type_##r(p, o, t, w, c) +static enum object_type packed_to_object_type_the_repository(struct packed_git *p, +off_t obj_offset, +enum object_type type, +struct pack_window **w_curs, +off_t curpos) { off_t small_poi_stack[POI_STACK_PREALLOC]; off_t *poi_stack = small_poi_stack; @@ -1378,8 +1380,8 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, if (oi->typep || oi->type_name) { enum object_type ptot; - ptot = packed_to_object_type(p, obj_offset, type, &w_curs, -curpos); + ptot = packed_to_object_type(the_repository, p, obj_offset, +type, &w_curs, curpos); if (oi->typep) *oi->typep = ptot; if (oi->type_name) { -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 0/9] object store: oid_object_info is the next contender
v2: * fixed the sha1/oid typo * removed spurious new line * Brandon and Jonthan discovered another dependency that I missed due to cherrypicking that commit from a tree before partial clone was a thing. We error out when attempting to use fetch_object for repos that are not the_repository. Thanks, Stefan v1: This applies on top of origin/sb/object-store-replace and is available as https://github.com/stefanbeller/git/tree/oid_object_info This continues the work of sb/packfiles-in-repository, extending the layer at which we have to pass in an explicit repository object to oid_object_info. A test merge to next shows only a minor merge conflicit (adding different #include lines in one c file), so this might be a good next step for the object store series. Notes on further object store series: I plan on converting the "parsed object store" next, which would be {alloc, object, tree, commit, tag}.c as that is a prerequisite for migrating shallow (which is intermingled with grafts) information to the object store. There is currently work going on in allocation (mempool - Jameson Miller) and grafts (deprecate grafts - DScho), which is why I am sending this series first. I think it can go in parallel to the "parsed object store" that is coming next. Thanks, Stefan Jonathan Nieder (1): packfile: add repository argument to packed_object_info Stefan Beller (8): cache.h: add repository argument to oid_object_info_extended cache.h: add repository argument to oid_object_info packfile: add repository argument to retry_bad_packed_offset packfile: add repository argument to packed_to_object_type packfile: add repository argument to read_object packfile: add repository argument to unpack_entry packfile: add repository argument to cache_or_unpack_entry cache.h: allow oid_object_info to handle arbitrary repositories archive-tar.c| 2 +- archive-zip.c| 3 ++- blame.c | 4 ++-- builtin/blame.c | 2 +- builtin/cat-file.c | 12 ++-- builtin/describe.c | 2 +- builtin/fast-export.c| 2 +- builtin/fetch.c | 2 +- builtin/fsck.c | 3 ++- builtin/index-pack.c | 4 ++-- builtin/ls-tree.c| 2 +- builtin/mktree.c | 2 +- builtin/pack-objects.c | 11 +++ builtin/prune.c | 3 ++- builtin/replace.c| 11 ++- builtin/tag.c| 4 ++-- builtin/unpack-objects.c | 2 +- cache.h | 7 +-- diff.c | 3 ++- fast-import.c| 16 ++-- list-objects-filter.c| 2 +- object.c | 2 +- pack-bitmap-write.c | 3 ++- pack-check.c | 3 ++- packfile.c | 40 +++- packfile.h | 6 -- reachable.c | 2 +- refs.c | 2 +- remote.c | 2 +- sequencer.c | 3 ++- sha1_file.c | 36 +--- sha1_name.c | 12 ++-- streaming.c | 2 +- submodule.c | 2 +- tag.c| 2 +- 35 files changed, 124 insertions(+), 92 deletions(-) -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 1/9] cache.h: add repository argument to oid_object_info_extended
Add a repository argument to allow oid_object_info_extended callers to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Stefan Beller --- builtin/cat-file.c | 6 +++--- cache.h| 5 - packfile.c | 2 +- sha1_file.c| 10 +- streaming.c| 2 +- 5 files changed, 14 insertions(+), 11 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 2c46d257cd..4ecdb9ff54 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -77,7 +77,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, switch (opt) { case 't': oi.type_name = &sb; - if (oid_object_info_extended(&oid, &oi, flags) < 0) + if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0) die("git cat-file: could not get object info"); if (sb.len) { printf("%s\n", sb.buf); @@ -88,7 +88,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, case 's': oi.sizep = &size; - if (oid_object_info_extended(&oid, &oi, flags) < 0) + if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0) die("git cat-file: could not get object info"); printf("%lu\n", size); return 0; @@ -342,7 +342,7 @@ static void batch_object_write(const char *obj_name, struct batch_options *opt, struct strbuf buf = STRBUF_INIT; if (!data->skip_object_info && - oid_object_info_extended(&data->oid, &data->info, + oid_object_info_extended(the_repository, &data->oid, &data->info, OBJECT_INFO_LOOKUP_REPLACE) < 0) { printf("%s missing\n", obj_name ? obj_name : oid_to_hex(&data->oid)); diff --git a/cache.h b/cache.h index 027bd7ffc8..588c4fff9a 100644 --- a/cache.h +++ b/cache.h @@ -1673,7 +1673,10 @@ struct object_info { #define OBJECT_INFO_QUICK 8 /* Do not check loose object */ #define OBJECT_INFO_IGNORE_LOOSE 16 -extern int oid_object_info_extended(const struct object_id *, struct object_info *, unsigned flags); + +#define oid_object_info_extended(r, oid, oi, flags) \ + oid_object_info_extended_##r(oid, oi, flags) +int oid_object_info_extended_the_repository(const struct object_id *, struct object_info *, unsigned flags); /* * Set this to 0 to prevent sha1_object_info_extended() from fetching missing diff --git a/packfile.c b/packfile.c index 0bc67d0e00..d9914ba723 100644 --- a/packfile.c +++ b/packfile.c @@ -1474,7 +1474,7 @@ static void *read_object(const struct object_id *oid, enum object_type *type, oi.sizep = size; oi.contentp = &content; - if (oid_object_info_extended(oid, &oi, 0) < 0) + if (oid_object_info_extended(the_repository, oid, &oi, 0) < 0) return NULL; return content; } diff --git a/sha1_file.c b/sha1_file.c index 64a5bd7d87..50a2dc5f0a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1231,7 +1231,7 @@ static int sha1_loose_object_info(struct repository *r, int fetch_if_missing = 1; -int oid_object_info_extended(const struct object_id *oid, struct object_info *oi, unsigned flags) +int oid_object_info_extended_the_repository(const struct object_id *oid, struct object_info *oi, unsigned flags) { static struct object_info blank_oi = OBJECT_INFO_INIT; struct pack_entry e; @@ -1310,7 +1310,7 @@ int oid_object_info_extended(const struct object_id *oid, struct object_info *oi rtype = packed_object_info(e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real->hash); - return oid_object_info_extended(real, oi, 0); + return oid_object_info_extended(the_repository, real, oi, 0); } else if (oi->whence == OI_PACKED) { oi->u.packed.offset = e.offset; oi->u.packed.pack = e.p; @@ -1329,7 +1329,7 @@ int oid_object_info(const struct object_id *oid, unsigned long *sizep) oi.typep = &type; oi.sizep = sizep; - if (oid_object_info_extended(oid, &oi, + if (oid_object_info_extended(the_repository, oid, &oi, OBJECT_INFO_LOOKUP_REPLACE) < 0) return -1; return type; @@ -1347,7 +1347,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type, hashcpy(oid.hash, sha1); - if (oid_object_info_extended(&oid, &oi, 0) < 0) + if (oid_object_info_extended(the_repository, &oid, &oi, 0) < 0) return NULL; return content; } @@ -1745,7 +1745,7 @@ int has_sha1_file_with_flags(const unsigned char *sha1, int flags) if (!startup_
[PATCHv2 3/9] packfile: add repository argument to retry_bad_packed_offset
Add a repository argument to allow the callers of retry_bad_packed_offset to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller --- packfile.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/packfile.c b/packfile.c index 80c7fa734f..d2b3f3503b 100644 --- a/packfile.c +++ b/packfile.c @@ -1104,7 +1104,9 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p, return NULL; } -static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset) +#define retry_bad_packed_offset(r, p, o) \ + retry_bad_packed_offset_##r(p, o) +static int retry_bad_packed_offset_the_repository(struct packed_git *p, off_t obj_offset) { int type; struct revindex_entry *revidx; @@ -1153,7 +1155,7 @@ static enum object_type packed_to_object_type(struct packed_git *p, if (type <= OBJ_NONE) { /* If getting the base itself fails, we first * retry the base, otherwise unwind */ - type = retry_bad_packed_offset(p, base_offset); + type = retry_bad_packed_offset(the_repository, p, base_offset); if (type > OBJ_NONE) goto out; goto unwind; @@ -1181,7 +1183,7 @@ static enum object_type packed_to_object_type(struct packed_git *p, unwind: while (poi_stack_nr) { obj_offset = poi_stack[--poi_stack_nr]; - type = retry_bad_packed_offset(p, obj_offset); + type = retry_bad_packed_offset(the_repository, p, obj_offset); if (type > OBJ_NONE) goto out; } -- 2.17.0.441.gb46fe60e1d-goog
[PATCH v9 3/4] worktree: factor out dwim_branch function
Factor out a dwim_branch function, which takes care of the dwim'ery in 'git worktree add '. It's not too much code currently, but we're adding a new kind of dwim in a subsequent patch, at which point it makes more sense to have it as a separate function. Factor it out now to reduce the patch noise in the next patch. Signed-off-by: Thomas Gummerer --- builtin/worktree.c | 29 ++--- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/builtin/worktree.c b/builtin/worktree.c index 39bf1ea865..6bd32b6090 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -387,6 +387,21 @@ static void print_preparing_worktree_line(int detach, } } +static const char *dwim_branch(const char *path, const char **new_branch) +{ + int n; + const char *s = worktree_basename(path, &n); + *new_branch = xstrndup(s, n); + UNLEAK(*new_branch); + if (guess_remote) { + struct object_id oid; + const char *remote = + unique_tracking_name(*new_branch, &oid); + return remote; + } + return NULL; +} + static int add(int ac, const char **av, const char *prefix) { struct add_opts opts; @@ -439,17 +454,9 @@ static int add(int ac, const char **av, const char *prefix) } if (ac < 2 && !new_branch && !opts.detach) { - int n; - const char *s = worktree_basename(path, &n); - new_branch = xstrndup(s, n); - UNLEAK(new_branch); - if (guess_remote) { - struct object_id oid; - const char *remote = - unique_tracking_name(new_branch, &oid); - if (remote) - branch = remote; - } + const char *s = dwim_branch(path, &new_branch); + if (s) + branch = s; } if (ac == 2 && !new_branch && !opts.detach) { -- 2.16.1.74.g7afd1c25cc.dirty
[PATCH v9 4/4] worktree: teach "add" to check out existing branches
Currently 'git worktree add ' creates a new branch named after the basename of the path by default. If a branch with that name already exists, the command refuses to do anything, unless the '--force' option is given. However we can do a little better than that, and check the branch out if it is not checked out anywhere else. This will help users who just want to check an existing branch out into a new worktree, and save a few keystrokes. As the current behaviour is to simply 'die()' when a branch with the name of the basename of the path already exists, there are no backwards compatibility worries here. We will still 'die()' if the branch is checked out in another worktree, unless the --force flag is passed. Helped-by: Eric Sunshine Signed-off-by: Thomas Gummerer --- Documentation/git-worktree.txt | 9 +++-- builtin/worktree.c | 13 +++-- t/t2025-worktree-add.sh| 26 +++--- 3 files changed, 37 insertions(+), 11 deletions(-) diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt index 41585f535d..5d54f36a71 100644 --- a/Documentation/git-worktree.txt +++ b/Documentation/git-worktree.txt @@ -61,8 +61,13 @@ $ git worktree add --track -b / + If `` is omitted and neither `-b` nor `-B` nor `--detach` used, -then, as a convenience, a new branch based at HEAD is created automatically, -as if `-b $(basename )` was specified. +then, as a convenience, the new worktree is associated with a branch +(call it ``) named after `$(basename )`. If `` +doesn't exist, a new branch based on HEAD is automatically created as +if `-b ` was given. If `` does exist, it will be +checked out in the new worktree, if it's not checked out anywhere +else, otherwise the command will refuse to create the worktree (unless +`--force` is used). list:: diff --git a/builtin/worktree.c b/builtin/worktree.c index 6bd32b6090..d3aeb4877d 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -391,8 +391,17 @@ static const char *dwim_branch(const char *path, const char **new_branch) { int n; const char *s = worktree_basename(path, &n); - *new_branch = xstrndup(s, n); - UNLEAK(*new_branch); + const char *branchname = xstrndup(s, n); + struct strbuf ref = STRBUF_INIT; + + UNLEAK(branchname); + if (!strbuf_check_branch_ref(&ref, branchname) && + ref_exists(ref.buf)) { + strbuf_release(&ref); + return branchname; + } + + *new_branch = branchname; if (guess_remote) { struct object_id oid; const char *remote = diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh index 2b95944973..ad38507d00 100755 --- a/t/t2025-worktree-add.sh +++ b/t/t2025-worktree-add.sh @@ -198,13 +198,25 @@ test_expect_success '"add" with omitted' ' test_cmp_rev HEAD bat ' -test_expect_success '"add" auto-vivify does not clobber existing branch' ' - test_commit c1 && - test_commit c2 && - git branch precious HEAD~1 && - test_must_fail git worktree add precious && - test_cmp_rev HEAD~1 precious && - test_path_is_missing precious +test_expect_success '"add" checks out existing branch of dwimd name' ' + git branch dwim HEAD~1 && + git worktree add dwim && + test_cmp_rev HEAD~1 dwim && + ( + cd dwim && + test_cmp_rev HEAD dwim + ) +' + +test_expect_success '"add " dwim fails with checked out branch' ' + git checkout -b test-branch && + test_must_fail git worktree add test-branch && + test_path_is_missing test-branch +' + +test_expect_success '"add --force" with existing dwimd name doesnt die' ' + git checkout test-branch && + git worktree add --force test-branch ' test_expect_success '"add" no auto-vivify with --detach and omitted' ' -- 2.16.1.74.g7afd1c25cc.dirty
[PATCH v9 1/4] worktree: remove extra members from struct add_opts
There are two members of 'struct add_opts', which are only used inside the 'add()' function, but being part of 'struct add_opts' they are needlessly also passed to the 'add_worktree' function. Make them local to the 'add()' function to make it clearer where they are used. Signed-off-by: Thomas Gummerer --- builtin/worktree.c | 33 - 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/builtin/worktree.c b/builtin/worktree.c index 7cef5b120b..bf305c8b7b 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -27,8 +27,6 @@ struct add_opts { int detach; int checkout; int keep_locked; - const char *new_branch; - int force_new_branch; }; static int show_only; @@ -363,10 +361,11 @@ static int add(int ac, const char **av, const char *prefix) const char *new_branch_force = NULL; char *path; const char *branch; + const char *new_branch = NULL; const char *opt_track = NULL; struct option options[] = { OPT__FORCE(&opts.force, N_("checkout even if already checked out in other worktree")), - OPT_STRING('b', NULL, &opts.new_branch, N_("branch"), + OPT_STRING('b', NULL, &new_branch, N_("branch"), N_("create a new branch")), OPT_STRING('B', NULL, &new_branch_force, N_("branch"), N_("create or reset a branch")), @@ -384,7 +383,7 @@ static int add(int ac, const char **av, const char *prefix) memset(&opts, 0, sizeof(opts)); opts.checkout = 1; ac = parse_options(ac, av, prefix, options, worktree_usage, 0); - if (!!opts.detach + !!opts.new_branch + !!new_branch_force > 1) + if (!!opts.detach + !!new_branch + !!new_branch_force > 1) die(_("-b, -B, and --detach are mutually exclusive")); if (ac < 1 || ac > 2) usage_with_options(worktree_usage, options); @@ -395,33 +394,33 @@ static int add(int ac, const char **av, const char *prefix) if (!strcmp(branch, "-")) branch = "@{-1}"; - opts.force_new_branch = !!new_branch_force; - if (opts.force_new_branch) { + if (new_branch_force) { struct strbuf symref = STRBUF_INIT; - opts.new_branch = new_branch_force; + new_branch = new_branch_force; if (!opts.force && - !strbuf_check_branch_ref(&symref, opts.new_branch) && + !strbuf_check_branch_ref(&symref, new_branch) && ref_exists(symref.buf)) die_if_checked_out(symref.buf, 0); strbuf_release(&symref); } - if (ac < 2 && !opts.new_branch && !opts.detach) { + if (ac < 2 && !new_branch && !opts.detach) { int n; const char *s = worktree_basename(path, &n); - opts.new_branch = xstrndup(s, n); + new_branch = xstrndup(s, n); + UNLEAK(new_branch); if (guess_remote) { struct object_id oid; const char *remote = - unique_tracking_name(opts.new_branch, &oid); + unique_tracking_name(new_branch, &oid); if (remote) branch = remote; } } - if (ac == 2 && !opts.new_branch && !opts.detach) { + if (ac == 2 && !new_branch && !opts.detach) { struct object_id oid; struct commit *commit; const char *remote; @@ -430,25 +429,25 @@ static int add(int ac, const char **av, const char *prefix) if (!commit) { remote = unique_tracking_name(branch, &oid); if (remote) { - opts.new_branch = branch; + new_branch = branch; branch = remote; } } } - if (opts.new_branch) { + if (new_branch) { struct child_process cp = CHILD_PROCESS_INIT; cp.git_cmd = 1; argv_array_push(&cp.args, "branch"); - if (opts.force_new_branch) + if (new_branch_force) argv_array_push(&cp.args, "--force"); - argv_array_push(&cp.args, opts.new_branch); + argv_array_push(&cp.args, new_branch); argv_array_push(&cp.args, branch); if (opt_track) argv_array_push(&cp.args, opt_track); if (run_command(&cp)) return -1; - branch = opts.new_branch; + branch = new_branch; } else if (opt_track) { die(_("--[no-]track can only be used if a new branch is cre
[PATCH v9 2/4] worktree: improve message when creating a new worktree
Currently 'git worktree add' produces output like the following: Preparing ../foo (identifier foo) HEAD is now at 26da330922 The '../foo' is the path where the worktree is created, which the user has just given on the command line. The identifier is an internal implementation detail, which is not particularly relevant for the user and indeed isn't mentioned explicitly anywhere in the man page. Instead of this message, print a message that gives the user a bit more detail of what exactly 'git worktree' is doing. There are various dwim modes which perform some magic under the hood, which should be helpful to users. Just from the output of the command it is not always visible to users what exactly has happened. Help the users a bit more by modifying the "Preparing ..." message and adding some additional information of what 'git worktree add' did under the hood, while not displaying the identifier anymore. Currently there are several different cases: - 'git worktree add -b ...' or 'git worktree add ', both of which create a new branch, either through the user explicitly requesting it, or through 'git worktree add' implicitly creating it. This will end up with the following output: Preparing worktree (new branch '') HEAD is now at 26da330922 - 'git worktree add -B ...', which may either create a new branch if the branch with the given name does not exist yet, or resets an existing branch to the current HEAD, or the commit-ish given. Depending on which action is taken, we'll end up with the following output: Preparing worktree (resetting branch ''; was at caa68db14) HEAD is now at 26da330922 or: Preparing worktree (new branch '') HEAD is now at 26da330922 - 'git worktree add --detach' or 'git worktree add ', both of which create a new worktree with a detached HEAD, for which we will print the following output: Preparing worktree (detached HEAD 26da330922) HEAD is now at 26da330922 - 'git worktree add ', which checks out the branch and prints the following output: Preparing worktree (checking out '') HEAD is now at 47007d5 Additionally currently the "Preparing ..." line is printed to stderr, while the "HEAD is now at ..." line is printed to stdout by 'git reset --hard', which is used internally by 'git worktree add'. Fix this inconsistency by printing the "Preparing ..." message to stdout as well. As "Preparing ..." is not an error, stdout also seems like the more appropriate output stream. Helped-by: Eric Sunshine Signed-off-by: Thomas Gummerer --- builtin/worktree.c | 38 -- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/builtin/worktree.c b/builtin/worktree.c index bf305c8b7b..39bf1ea865 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -301,8 +301,6 @@ static int add_worktree(const char *path, const char *refname, strbuf_addf(&sb, "%s/commondir", sb_repo.buf); write_file(sb.buf, "../.."); - fprintf_ln(stderr, _("Preparing %s (identifier %s)"), path, name); - argv_array_pushf(&child_env, "%s=%s", GIT_DIR_ENVIRONMENT, sb_git.buf); argv_array_pushf(&child_env, "%s=%s", GIT_WORK_TREE_ENVIRONMENT, path); cp.git_cmd = 1; @@ -355,6 +353,40 @@ static int add_worktree(const char *path, const char *refname, return ret; } +static void print_preparing_worktree_line(int detach, + const char *branch, + const char *new_branch, + int force_new_branch) +{ + if (force_new_branch) { + struct commit *commit = lookup_commit_reference_by_name(new_branch); + if (!commit) + printf_ln(_("Preparing worktree (new branch '%s')"), new_branch); + else + printf_ln(_("Preparing worktree (resetting branch '%s'; was at %s)"), + new_branch, + find_unique_abbrev(commit->object.oid.hash, +DEFAULT_ABBREV)); + } else if (new_branch) { + printf_ln(_("Preparing worktree (new branch '%s')"), new_branch); + } else { + struct strbuf s = STRBUF_INIT; + if (!detach && !strbuf_check_branch_ref(&s, branch) && + ref_exists(s.buf)) + printf_ln(_("Preparing worktree (checking out '%s')"), + branch); + else { + struct commit *commit = lookup_commit_reference_by_name(branch); + if (!commit) + die(_("invalid reference: %s"), branch); + printf_ln(_("Preparing worktree (detached HEAD %s)"), + find_uniqu
[PATCH v9 0/4] worktree: teach "add" to check out existing branches
Previous rounds are at <20180121120208.12760-1-t.gumme...@gmail.com>, <20180204221305.28300-1-t.gumme...@gmail.com>, <20180317220830.30963-1-t.gumme...@gmail.com>, <2018031719.4940-1-t.gumme...@gmail.com>, <20180325134947.25828-1-t.gumme...@gmail.com>, <20180331151804.30380-1-t.gumme...@gmail.com>, <20180415202917.4360-1-t.gumme...@gmail.com> and <20180423193848.5159-1-t.gumme...@gmail.com>. Thanks Eric for the review and the suggestions on the previous round. Changes since the previous round: - UNLEAK new_branch after it was xstrndup'd - update the commit message of 2/4 according to Eric's suggestions - make force_new_branch a boolean flag in print_preparing_worktree_line, instead of using it as the branch name. Instead use new_branch as the new branch name everywhere in that function. - get rid of the ckeckout_existing_branch flag Interdiff below: diff --git a/builtin/worktree.c b/builtin/worktree.c index d52495f312..d3aeb4877d 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -356,18 +356,15 @@ static int add_worktree(const char *path, const char *refname, static void print_preparing_worktree_line(int detach, const char *branch, const char *new_branch, - const char *new_branch_force, - int checkout_existing_branch) + int force_new_branch) { - if (checkout_existing_branch) { - printf_ln(_("Preparing worktree (checking out '%s')"), branch); - } else if (new_branch_force) { - struct commit *commit = lookup_commit_reference_by_name(new_branch_force); + if (force_new_branch) { + struct commit *commit = lookup_commit_reference_by_name(new_branch); if (!commit) - printf_ln(_("Preparing worktree (new branch '%s')"), new_branch_force); + printf_ln(_("Preparing worktree (new branch '%s')"), new_branch); else printf_ln(_("Preparing worktree (resetting branch '%s'; was at %s)"), - new_branch_force, + new_branch, find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV)); } else if (new_branch) { @@ -390,19 +387,17 @@ static void print_preparing_worktree_line(int detach, } } -static const char *dwim_branch(const char *path, const char **new_branch, - int *checkout_existing_branch) +static const char *dwim_branch(const char *path, const char **new_branch) { int n; const char *s = worktree_basename(path, &n); const char *branchname = xstrndup(s, n); struct strbuf ref = STRBUF_INIT; + UNLEAK(branchname); if (!strbuf_check_branch_ref(&ref, branchname) && ref_exists(ref.buf)) { - *checkout_existing_branch = 1; strbuf_release(&ref); - UNLEAK(branchname); return branchname; } @@ -421,7 +416,6 @@ static int add(int ac, const char **av, const char *prefix) struct add_opts opts; const char *new_branch_force = NULL; char *path; - int checkout_existing_branch = 0; const char *branch; const char *new_branch = NULL; const char *opt_track = NULL; @@ -469,8 +463,7 @@ static int add(int ac, const char **av, const char *prefix) } if (ac < 2 && !new_branch && !opts.detach) { - const char *s = dwim_branch(path, &new_branch, - &checkout_existing_branch); + const char *s = dwim_branch(path, &new_branch); if (s) branch = s; } @@ -490,8 +483,7 @@ static int add(int ac, const char **av, const char *prefix) } } - print_preparing_worktree_line(opts.detach, branch, new_branch, new_branch_force, - checkout_existing_branch); + print_preparing_worktree_line(opts.detach, branch, new_branch, !!new_branch_force); if (new_branch) { struct child_process cp = CHILD_PROCESS_INIT; Thomas Gummerer (4): worktree: remove extra members from struct add_opts worktree: improve message when creating a new worktree worktree: factor out dwim_branch function worktree: teach "add" to check out existing branches Documentation/git-worktree.txt | 9 +++- builtin/worktree.c | 103 ++--- t/t2025-worktree-add.sh| 26 --- 3 files changed, 102 insertions(+), 36 deletions(-) -- 2.16.1.74.g7afd1c25cc.dirty
Re: [PATCH 5/7] diff.c: add a blocks mode for moved code detection
On Tue, 24 Apr 2018 14:03:28 -0700 Stefan Beller wrote: > Suggested-by: Ævar Arnfjörð Bjarmason > (https://public-inbox.org/git/87o9j0uljo@evledraar.gmail.com/) > Signed-off-by: Stefan Beller Firstly, I don't know if this is the right solution- as written in the linked e-mail [1], the issue might be more that the config conflates 2 unrelated things, not that a certain intersection is missing. [1] https://public-inbox.org/git/87muykuij7@evledraar.gmail.com/ Optional: Probably better to put the link inline, instead of in the trailer. > -test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' > +test_expect_success 'detect blocks of moved code' ' > git reset --hard && > cat <<-\EOF >lines.txt && > long line 1 > @@ -1271,6 +1271,52 @@ test_expect_success 'detect permutations inside moved > code -- dimmed_zebra' ' > test_config color.diff.newMovedDimmed "normal cyan" && > test_config color.diff.oldMovedAlternativeDimmed "normal blue" && > test_config color.diff.newMovedAlternativeDimmed "normal yellow" && Add a comment here explaining that these colors do not appear in the output, but merely set to recognizable values to ensure that they do not appear in the output. > + > + git diff HEAD --no-renames --color-moved=blocks --color | > + grep -v "index" | > + test_decode_color >actual && > + cat <<-\EOF >expected && > + diff --git a/lines.txt b/lines.txt > + --- a/lines.txt > + +++ b/lines.txt > + @@ -1,16 +1,16 @@ > + -long line 1 > + -long line 2 > + -long line 3 > + line 4 > + line 5 > + line 6 > + line 7 > + line 8 > + line 9 > + +long line 1 > + +long line 2 > + +long line 3 > + +long line 14 > + +long line 15 > + +long line 16 > + line 10 > + line 11 > + line 12 > + line 13 > + -long line 14 > + -long line 15 > + -long line 16 > + EOF > + test_cmp expected actual > + > +' [snip]
Re: [PATCH v3 09/11] technical/shallow: describe the relationship with replace refs
Hi dscho From: "Johannes Schindelin" : Tuesday, April 24, 2018 8:10 PM On Sun, 22 Apr 2018, Philip Oakley wrote: From: "Johannes Schindelin" > Now that grafts are deprecated, we should start to assume that readers > have no idea what grafts are. So it makes more sense to describe the > "shallow" feature in terms of replace refs. Here we say we should drop the term "grafts" > > Suggested-by: Eric Sunshine > Signed-off-by: Johannes Schindelin > --- > Documentation/technical/shallow.txt | 19 +++ > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/Documentation/technical/shallow.txt > b/Documentation/technical/shallow.txt > index 5183b154229..b3ff23c25f6 100644 > --- a/Documentation/technical/shallow.txt > +++ b/Documentation/technical/shallow.txt > @@ -9,14 +9,17 @@ these commits have no parents. > * > > The basic idea is to write the SHA-1s of shallow commits into > -$GIT_DIR/shallow, and handle its contents like the contents > -of $GIT_DIR/info/grafts (with the difference that shallow > -cannot contain parent information). > - > -This information is stored in a new file instead of grafts, or > -even the config, since the user should not touch that file > -at all (even throughout development of the shallow clone, it > -was never manually edited!). > +$GIT_DIR/shallow, and handle its contents similar to replace > +refs (with the difference that shallow does not actually > +create those replace refs) and If grafts are deprecated, why not alse get rid of this mention and simply leave the 'what it does' part. Internally, shallow commits are implemented using the graft code path, and however the change here is just to the documentation, independent of th code path's name. they always will be: we will always need a list of the shallow commits, and we will always need to be able to lift the "shallow" attribute quickly, when deepening a shallow clone. So it makes sense to mention that here, because we are deep in technical details in Documentation/technical/. > very much like the > deprecated > +graft file (with I was looking to snip this 'graft' reference, as per the commit message.. > the difference that shallow commits will > +always have their parents grafted away, not replaced by s/their parents grafted away/no parents/ (rather than being replaced..) Then I botched this substitution But the commits will typically have parents. So they really will have their parents grafted away as long as they are marked "shallow"... OK, maybe I mis-used the figurative 'no parents', when it means the literal 'parents not present'. Perhaps something like: +$GIT_DIR/shallow, and handle its contents similar to replace +refs (with the difference that shallow does not actually +create those replace refs) with the difference that shallow commits will +always have their parents not present. -- Philip
Confidential Request!!
-- Hello Dear Friend, I am Williams Abbas, I need your services in a confidential matter regarding money transfer. This requires a private arrangement though the details of the transaction will be furnished to you if you indicate your interest in this proposal.We have all the legal documents to back up the transaction, besides we have worked out the modalities to ensure smooth and risky free transfer.We are willing to offer you 35% of the money, the fund in question is quite large. All correspondences will be via email for now. I am expecting to hear from you, if you are willing to do the business with us to send your response to my private email address with your private phone number is needed and please note that this is not scam, but legitimate business offer. Thanks, Yours Faithfully, Williams Abbas
Re: GSoC students and mentors in 2018
On 24.04.2018 00:01, Stefan Beller wrote: Hi Git community, This year we'll participate once again in Google Summer or Code! We'll have 3 students and 3 mentors, which is more than in recent years. Paul-Sebastian Ungureanu mentored by DScho, wants to convert git-stash into a builtin. Alban Gruin and Pratik Karki want to convert parts of git-rebase into a builtin. Both are mentored by Christian and myself. The slots were just announced today, please join me in welcoming them to the Git mailing list! (Although you may remember them from the micro projects[1,2,3]) [1] https://public-inbox.org/git/20180319155929.7000-1-ungureanupaulsebast...@gmail.com/ [2] https://public-inbox.org/git/2018030907.17607-1-alban.gr...@gmail.com/ [3] https://public-inbox.org/git/20180327173137.5970-1-predatoram...@gmail.com/ Thanks, Stefan Hello, It is a pleasure and an honor for me to take part in Google Summer of Code. I am sure it will be a exciting summer and I will definitely give 100% to successfully fulfill 'git stash' project! Best regards, Paul Ungureanu
[PATCH 5/7] diff.c: add a blocks mode for moved code detection
The new "blocks" mode provides a middle ground between plain and zebra. It is as intuitive (few colors) as plain, but still has the requirement for a minimum of lines/characters to count a block as moved. Suggested-by: Ævar Arnfjörð Bjarmason (https://public-inbox.org/git/87o9j0uljo@evledraar.gmail.com/) Signed-off-by: Stefan Beller --- Documentation/diff-options.txt | 8 -- diff.c | 6 +++-- diff.h | 5 ++-- t/t4015-diff-whitespace.sh | 48 +- 4 files changed, 60 insertions(+), 7 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index e3a44f03cd..bb9f1b7cd8 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -276,10 +276,14 @@ plain:: that are added somewhere else in the diff. This mode picks up any moved line, but it is not very useful in a review to determine if a block of code was moved without permutation. -zebra:: +blocks: Blocks of moved text of at least 20 alphanumeric characters are detected greedily. The detected blocks are - painted using either the 'color.diff.{old,new}Moved' color or + painted using either the 'color.diff.{old,new}Moved' color. + Adjacent blocks cannot be told apart. +zebra:: + Blocks of moved text are detected as in 'blocks' mode. The blocks + are painted using either the 'color.diff.{old,new}Moved' color or 'color.diff.{old,new}MovedAlternative'. The change between the two colors indicates that a new block was detected. dimmed_zebra:: diff --git a/diff.c b/diff.c index d1bae900cd..95c51c0b7d 100644 --- a/diff.c +++ b/diff.c @@ -271,6 +271,8 @@ static int parse_color_moved(const char *arg) return COLOR_MOVED_NO; else if (!strcmp(arg, "plain")) return COLOR_MOVED_PLAIN; + else if (!strcmp(arg, "blocks")) + return COLOR_MOVED_BLOCKS; else if (!strcmp(arg, "zebra")) return COLOR_MOVED_ZEBRA; else if (!strcmp(arg, "default")) @@ -278,7 +280,7 @@ static int parse_color_moved(const char *arg) else if (!strcmp(arg, "dimmed_zebra")) return COLOR_MOVED_ZEBRA_DIM; else - return error(_("color moved setting must be one of 'no', 'default', 'zebra', 'dimmed_zebra', 'plain'")); + return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'")); } int git_diff_ui_config(const char *var, const char *value, void *cb) @@ -903,7 +905,7 @@ static void mark_color_as_moved(struct diff_options *o, block_length++; - if (flipped_block) + if (flipped_block && o->color_moved != COLOR_MOVED_BLOCKS) l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT; } adjust_last_block(o, n, block_length); diff --git a/diff.h b/diff.h index d29560f822..7bd4f182c3 100644 --- a/diff.h +++ b/diff.h @@ -208,8 +208,9 @@ struct diff_options { enum { COLOR_MOVED_NO = 0, COLOR_MOVED_PLAIN = 1, - COLOR_MOVED_ZEBRA = 2, - COLOR_MOVED_ZEBRA_DIM = 3, + COLOR_MOVED_BLOCKS = 2, + COLOR_MOVED_ZEBRA = 3, + COLOR_MOVED_ZEBRA_DIM = 4, } color_moved; #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA #define COLOR_MOVED_MIN_ALNUM_COUNT 20 diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 17df491a3a..45091abb19 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1223,7 +1223,7 @@ test_expect_success 'plain moved code, inside file' ' test_cmp expected actual ' -test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' +test_expect_success 'detect blocks of moved code' ' git reset --hard && cat <<-\EOF >lines.txt && long line 1 @@ -1271,6 +1271,52 @@ test_expect_success 'detect permutations inside moved code -- dimmed_zebra' ' test_config color.diff.newMovedDimmed "normal cyan" && test_config color.diff.oldMovedAlternativeDimmed "normal blue" && test_config color.diff.newMovedAlternativeDimmed "normal yellow" && + + git diff HEAD --no-renames --color-moved=blocks --color | + grep -v "index" | + test_decode_color >actual && + cat <<-\EOF >expected && + diff --git a/lines.txt b/lines.txt + --- a/lines.txt + +++ b/lines.txt + @@ -1,16 +1,16 @@ + -long line 1 + -long line 2 + -long line 3 +line 4 +line 5 +line 6 +line 7 +line 8 +line 9 + +long line 1 + +long line 2 + +long line 3 + +long line 14 + +long line 15 + +long line 16 +line 10 +line 11 +line 12 +li
[PATCH 6/7] diff.c: decouple white space treatment from move detection algorithm
In the original implementation of the move detection logic the choice for ignoring white space changes is the same for the move detection as it is for the regular diff. Some cases came up where different treatment would have been nice. Allow the user to specify that whitespace should be ignored differently during detection of moved lines than during generation of added and removed lines. This is done by providing analogs to the --ignore-space-at-eol, -b, and -w options (namely, --color-moved-[no-]ignore-space-at-eol --color-moved-[no-]ignore-space-change --color-moved-[no-]ignore-all-space) that affect only the color of the output, and making the existing --ignore-space-at-eol, -b, and -w options no longer affect the color of the output. As we change the default, we'll adjust the tests. For now we do not infer any options to treat whitespaces in the move detection from the generic white space options given to diff. This can be tuned later to reasonable default. Signed-off-by: Stefan Beller --- Documentation/diff-options.txt | 13 + diff.c | 19 ++- diff.h | 1 + t/t4015-diff-whitespace.sh | 90 +++--- 4 files changed, 114 insertions(+), 9 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index bb9f1b7cd8..7b2527b9a1 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -292,6 +292,19 @@ dimmed_zebra:: blocks are considered interesting, the rest is uninteresting. -- +--color-moved-[no-]ignore-space-at-eol:: + Ignore changes in whitespace at EOL when performing the move + detection for --color-moved. +--color-moved-[no-]ignore-space-change:: + Ignore changes in amount of whitespace when performing the move + detection for --color-moved. This ignores whitespace + at line end, and considers all other sequences of one or + more whitespace characters to be equivalent. +--color-moved-[no-]ignore-all-space:: + Ignore whitespace when comparing lines when performing the move + detection for --color-moved. This ignores differences even if + one line has whitespace where the other line has none. + --word-diff[=]:: Show a word diff, using the to delimit changed words. By default, words are delimited by whitespace; see diff --git a/diff.c b/diff.c index 95c51c0b7d..b5819dd538 100644 --- a/diff.c +++ b/diff.c @@ -717,10 +717,12 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data, const struct diff_options *diffopt = hashmap_cmp_fn_data; const struct moved_entry *a = entry; const struct moved_entry *b = entry_or_key; + unsigned flags = diffopt->color_moved_ws_handling +& XDF_WHITESPACE_FLAGS; return !xdiff_compare_lines(a->es->line, a->es->len, b->es->line, b->es->len, - diffopt->xdl_opts); + flags); } static struct moved_entry *prepare_entry(struct diff_options *o, @@ -728,8 +730,9 @@ static struct moved_entry *prepare_entry(struct diff_options *o, { struct moved_entry *ret = xmalloc(sizeof(*ret)); struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no]; + unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; - ret->ent.hash = xdiff_hash_string(l->line, l->len, o->xdl_opts); + ret->ent.hash = xdiff_hash_string(l->line, l->len, flags); ret->es = l; ret->next_line = NULL; @@ -4638,6 +4641,18 @@ int diff_opt_parse(struct diff_options *options, DIFF_XDL_SET(options, IGNORE_CR_AT_EOL); else if (!strcmp(arg, "--ignore-blank-lines")) DIFF_XDL_SET(options, IGNORE_BLANK_LINES); + else if (!strcmp(arg, "--color-moved-no-ignore-all-space")) + options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE; + else if (!strcmp(arg, "--color-moved-no-ignore-space-change")) + options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE_CHANGE; + else if (!strcmp(arg, "--color-moved-no-ignore-space-at-eol")) + options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE_AT_EOL; + else if (!strcmp(arg, "--color-moved-ignore-all-space")) + options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE; + else if (!strcmp(arg, "--color-moved-ignore-space-change")) + options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE_CHANGE; + else if (!strcmp(arg, "--color-moved-ignore-space-at-eol")) + options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE_AT_EOL; else if (!strcmp(arg, "--indent-heuristic")) DIFF_XDL_SET(options, INDENT_HEURISTIC); else if (!strcmp(arg, "--no-indent-heuristic")) diff --git a/diff.h b/diff.h index 7bd4f182c3..de5dc68005 100
[PATCH 4/7] diff.c: adjust hash function signature to match hashmap expectation
This makes the follow up patch easier. Signed-off-by: Stefan Beller --- diff.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index ce7bedc1b9..d1bae900cd 100644 --- a/diff.c +++ b/diff.c @@ -707,11 +707,15 @@ struct moved_entry { struct moved_entry *next_line; }; -static int moved_entry_cmp(const struct diff_options *diffopt, - const struct moved_entry *a, - const struct moved_entry *b, +static int moved_entry_cmp(const void *hashmap_cmp_fn_data, + const void *entry, + const void *entry_or_key, const void *keydata) { + const struct diff_options *diffopt = hashmap_cmp_fn_data; + const struct moved_entry *a = entry; + const struct moved_entry *b = entry_or_key; + return !xdiff_compare_lines(a->es->line, a->es->len, b->es->line, b->es->len, diffopt->xdl_opts); @@ -5534,10 +5538,8 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->color_moved) { struct hashmap add_lines, del_lines; - hashmap_init(&del_lines, -(hashmap_cmp_fn)moved_entry_cmp, o, 0); - hashmap_init(&add_lines, -(hashmap_cmp_fn)moved_entry_cmp, o, 0); + hashmap_init(&del_lines, moved_entry_cmp, o, 0); + hashmap_init(&add_lines, moved_entry_cmp, o, 0); add_lines_to_move_detection(o, &add_lines, &del_lines); mark_color_as_moved(o, &add_lines, &del_lines); -- 2.17.0.441.gb46fe60e1d-goog
[PATCH 1/7] xdiff/xdiff.h: remove unused flags
These flags were there since the beginning (3443546f6e (Use a *real* built-in diff generator, 2006-03-24), but were never used. Remove them. Signed-off-by: Stefan Beller --- xdiff/xdiff.h | 8 1 file changed, 8 deletions(-) diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h index c1937a2911..2356da5f78 100644 --- a/xdiff/xdiff.h +++ b/xdiff/xdiff.h @@ -52,14 +52,6 @@ extern "C" { #define XDL_EMIT_FUNCNAMES (1 << 0) #define XDL_EMIT_FUNCCONTEXT (1 << 2) -#define XDL_MMB_READONLY (1 << 0) - -#define XDL_MMF_ATOMIC (1 << 0) - -#define XDL_BDOP_INS 1 -#define XDL_BDOP_CPY 2 -#define XDL_BDOP_INSB 3 - /* merge simplification levels */ #define XDL_MERGE_MINIMAL 0 #define XDL_MERGE_EAGER 1 -- 2.17.0.441.gb46fe60e1d-goog
[PATCHv2 0/7] Moved code detection: ignore space on uniform indentation
v2: I think I have addressed Jonathans feedback * by using a string instead of counting the first character only. * refined tests slightly (easier to read) * moved white space handling for moved blocks into its own flag field, keeping the enum for the actual mode of move detection. v1: This is a re-attempt of [1], which allows the moved code detection to ignore blanks in various modes. patches 1-5 are refactoring, patch 6 adds all existing white space options of regular diff to the move detection. (I am unsure about this patch, as I presume we want to keep the option space at a minimum if possible). The fun is in the last patch, which allows white space sensitive languages to trust the move detection, too. Each block that is marked as moved will have the same delta in {in-, de-}dentation. I would think this mode might be a reasonable default eventually. Thanks, Stefan [1] https://public-inbox.org/git/20171025224620.27657-1-sbel...@google.com/ Stefan Beller (7): xdiff/xdiff.h: remove unused flags xdiff/xdiffi.c: remove unneeded function declarations diff.c: do not pass diff options as keydata to hashmap diff.c: adjust hash function signature to match hashmap expectation diff.c: add a blocks mode for moved code detection diff.c: decouple white space treatment from move detection algorithm diff.c: add --color-moved-ignore-space-delta option Documentation/diff-options.txt | 25 - diff.c | 128 ++ diff.h | 8 +- t/t4015-diff-whitespace.sh | 192 +++-- xdiff/xdiff.h | 8 -- xdiff/xdiffi.c | 17 --- 6 files changed, 322 insertions(+), 56 deletions(-) -- 2.17.0.441.gb46fe60e1d-goog
[PATCH 2/7] xdiff/xdiffi.c: remove unneeded function declarations
There is no need to forward-declare these functions, as they are used after their implementation only. Signed-off-by: Stefan Beller --- xdiff/xdiffi.c | 17 - 1 file changed, 17 deletions(-) diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c index 0de1ef463b..3e8aff92bc 100644 --- a/xdiff/xdiffi.c +++ b/xdiff/xdiffi.c @@ -22,34 +22,17 @@ #include "xinclude.h" - - #define XDL_MAX_COST_MIN 256 #define XDL_HEUR_MIN_COST 256 #define XDL_LINE_MAX (long)((1UL << (CHAR_BIT * sizeof(long) - 1)) - 1) #define XDL_SNAKE_CNT 20 #define XDL_K_HEUR 4 - - typedef struct s_xdpsplit { long i1, i2; int min_lo, min_hi; } xdpsplit_t; - - - -static long xdl_split(unsigned long const *ha1, long off1, long lim1, - unsigned long const *ha2, long off2, long lim2, - long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl, - xdalgoenv_t *xenv); -static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1, long chg2); - - - - - /* * See "An O(ND) Difference Algorithm and its Variations", by Eugene Myers. * Basically considers a "box" (off1, off2, lim1, lim2) and scan from both -- 2.17.0.441.gb46fe60e1d-goog
[PATCH 3/7] diff.c: do not pass diff options as keydata to hashmap
When we initialize the hashmap, we give it a pointer to the diff_options, which it then passes along to each call of the hashmap_cmp_fn function. There's no need to pass it a second time as the "keydata" parameter, and our comparison functions never look at keydata. This was a mistake left over from an earlier round of 2e2d5ac184 (diff.c: color moved lines differently, 2017-06-30), before hashmap learned to pass the data pointer for us. Explanation-by: Jeff King Signed-off-by: Stefan Beller --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 1289df4b1f..ce7bedc1b9 100644 --- a/diff.c +++ b/diff.c @@ -842,13 +842,13 @@ static void mark_color_as_moved(struct diff_options *o, case DIFF_SYMBOL_PLUS: hm = del_lines; key = prepare_entry(o, n); - match = hashmap_get(hm, key, o); + match = hashmap_get(hm, key, NULL); free(key); break; case DIFF_SYMBOL_MINUS: hm = add_lines; key = prepare_entry(o, n); - match = hashmap_get(hm, key, o); + match = hashmap_get(hm, key, NULL); free(key); break; default: -- 2.17.0.441.gb46fe60e1d-goog
[PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option
This marks moved code still as blocks when their indentation level changes uniformly. Signed-off-by: Stefan Beller --- Documentation/diff-options.txt | 4 ++ diff.c | 83 +++--- diff.h | 2 + t/t4015-diff-whitespace.sh | 54 ++ 4 files changed, 137 insertions(+), 6 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 7b2527b9a1..facdbc8f95 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -304,6 +304,10 @@ dimmed_zebra:: Ignore whitespace when comparing lines when performing the move detection for --color-moved. This ignores differences even if one line has whitespace where the other line has none. +--color-moved-[no-]ignore-space-prefix-delta:: + Ignores whitespace when comparing lines when performing the move + detection for --color-moved. This ignores uniform differences + of white space at the beginning lines in moved blocks. --word-diff[=]:: Show a word diff, using the to delimit changed words. diff --git a/diff.c b/diff.c index b5819dd538..1227a4d2a8 100644 --- a/diff.c +++ b/diff.c @@ -709,6 +709,31 @@ struct moved_entry { struct moved_entry *next_line; }; +struct ws_delta { + char *string; /* The prefix delta, which is the same in the block */ + int direction; /* adding or removing the line? */ + int missmatch; /* in the remainder */ +}; +#define WS_DELTA_INIT { NULL, 0, 0 } + +static void compute_ws_delta(const struct emitted_diff_symbol *a, +const struct emitted_diff_symbol *b, +struct ws_delta *out) +{ + const struct emitted_diff_symbol *longer = a->len > b->len ? a : b; + const struct emitted_diff_symbol *shorter = a->len > b->len ? b : a; + int d = longer->len - shorter->len; + + out->missmatch = !memcmp(longer->line + d, shorter->line, shorter->len); + out->string = xmemdupz(longer->line, d); + out->direction = (a == longer); +} + +static int compare_ws_delta(const struct ws_delta *a, const struct ws_delta *b) +{ + return a->direction == b->direction && !strcmp(a->string, b->string); +} + static int moved_entry_cmp(const void *hashmap_cmp_fn_data, const void *entry, const void *entry_or_key, @@ -720,6 +745,15 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data, unsigned flags = diffopt->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; + if (diffopt->color_moved_ws_handling & COLOR_MOVED_DELTA_WHITESPACES) + /* +* As there is not specific white space config given, +* we'd need to check for a new block, so ignore all +* white space. The setup of the white space +* configuration for the next block is done else where +*/ + flags |= XDF_IGNORE_WHITESPACE; + return !xdiff_compare_lines(a->es->line, a->es->len, b->es->line, b->es->len, flags); @@ -772,7 +806,8 @@ static void add_lines_to_move_detection(struct diff_options *o, } static int shrink_potential_moved_blocks(struct moved_entry **pmb, -int pmb_nr) +int pmb_nr, +struct ws_delta **wsd) { int lp, rp; @@ -788,6 +823,10 @@ static int shrink_potential_moved_blocks(struct moved_entry **pmb, if (lp < pmb_nr && rp > -1 && lp < rp) { pmb[lp] = pmb[rp]; + if (*wsd) { + free((*wsd)[lp].string); + (*wsd)[lp] = (*wsd)[rp]; + } pmb[rp] = NULL; rp--; lp++; @@ -837,8 +876,11 @@ static void mark_color_as_moved(struct diff_options *o, { struct moved_entry **pmb = NULL; /* potentially moved blocks */ int pmb_nr = 0, pmb_alloc = 0; - int n, flipped_block = 1, block_length = 0; + struct ws_delta *wsd = NULL; /* white space deltas between pmb */ + int wsd_alloc = 0; + + int n, flipped_block = 1, block_length = 0; for (n = 0; n < o->emitted_symbols->nr; n++) { struct hashmap *hm = NULL; @@ -881,14 +923,31 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_entry *p = pmb[i]; struct moved_entry *pnext = (p && p->next_line) ? p->next_line : NULL; - if (pnext && !hm->cmpfn(o, pnext, match, NULL)) { - pmb[i] = p->next_line; + +
Re: [PATCH v2 1/2] merge: Add merge.renames config setting
On 4/24/2018 2:59 PM, Elijah Newren wrote: Sorry, I noticed something else I missed on my last reading... On Tue, Apr 24, 2018 at 10:11 AM, Ben Peart wrote: diff --git a/builtin/merge.c b/builtin/merge.c index 8746c5e3e8..3be52cd316 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -424,6 +424,7 @@ static void finish(struct commit *head_commit, opts.output_format |= DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; opts.detect_rename = DIFF_DETECT_RENAME; + git_config_get_bool("merge.renames", &opts.detect_rename); diff_setup_done(&opts); diff_tree_oid(head, new_head, "", &opts); diffcore_std(&opts); Shouldn't this also be turned off if either (a) merge.renames is unset and diff.renames is false, or (b) the user specifies -Xno-renames? This makes me think that I should probably remove the line that overrides the detect_rename setting with the merge config setting. As I look at the code, none of the other merge options are reflected in the diffstat; instead, all the settings are pretty much hard coded. Perhaps I shouldn't rock that boat.
[PATCH v10 2/2] fixup! t6046: testcases checking whether updates can be skipped in a merge
--- t/t6046-merge-skip-unneeded-updates.sh | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh index 880cd782d7..fcefffcaec 100755 --- a/t/t6046-merge-skip-unneeded-updates.sh +++ b/t/t6046-merge-skip-unneeded-updates.sh @@ -41,7 +41,7 @@ test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' ' ( cd 1a && - test_write_lines 1 2 3 4 5 6 7 8 9 10 >b + test_write_lines 1 2 3 4 5 6 7 8 9 10 >b && git add b && test_tick && git commit -m "O" && @@ -138,7 +138,7 @@ test_expect_success '2a-setup: Modify(A)/rename(B)' ' ( cd 2a && - test_seq 1 10 >b + test_seq 1 10 >b && git add b && test_tick && git commit -m "O" && @@ -148,7 +148,7 @@ test_expect_success '2a-setup: Modify(A)/rename(B)' ' git branch B && git checkout A && - test_seq 1 11 > b && + test_seq 1 11 >b && git add b && test_tick && git commit -m "A" && @@ -229,7 +229,7 @@ test_expect_success '2b-setup: Rename+Mod(A)/Mod(B), B mods subset of A' ' ( cd 2b && - test_write_lines 1 2 3 4 5 6 7 8 9 10 >b + test_write_lines 1 2 3 4 5 6 7 8 9 10 >b && git add b && test_tick && git commit -m "O" && @@ -337,7 +337,7 @@ test_expect_success '2c-setup: Modify b & add c VS rename b->c' ' ( cd 2c && - test_seq 1 10 >b + test_seq 1 10 >b && git add b && test_tick && git commit -m "O" && @@ -443,7 +443,7 @@ test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' ' git branch B && git checkout A && - test_seq 1 11 > bq && + test_seq 1 11 >bq && git add bq && git mv bq foo/ && test_tick && @@ -542,7 +542,7 @@ test_expect_success '3b-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' ' git commit -m "A" && git checkout B && - test_seq 1 11 > bq && + test_seq 1 11 >bq && git add bq && git mv foo/ bar/ && test_tick && @@ -624,7 +624,7 @@ test_expect_success '4a-setup: Change on A, change on B subset of A, dirty mods ( cd 4a && - test_write_lines 1 2 3 4 5 6 7 8 9 10 >b + test_write_lines 1 2 3 4 5 6 7 8 9 10 >b && git add b && test_tick && git commit -m "O" && @@ -698,7 +698,7 @@ test_expect_success '4b-setup: Rename+Mod(A)/Mod(B), change on B subset of A, di ( cd 4b && - test_write_lines 1 2 3 4 5 6 7 8 9 10 >b + test_write_lines 1 2 3 4 5 6 7 8 9 10 >b && git add b && test_tick && git commit -m "O" && -- 2.17.0.295.g791b7256b2.dirty
[PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths
--- merge-recursive.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/merge-recursive.c b/merge-recursive.c index 1de8dc1c53..f2cbad4f10 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -3146,10 +3146,10 @@ int merge_trees(struct merge_options *o, /* Free the extra index left from git_merge_trees() */ /* -* FIXME: Need to also data allocated by setup_unpack_trees_porcelain() -* tucked away in o->unpack_opts.msgs, but the problem is that only -* half of it refers to dynamically allocated data, while the other -* half points at static strings. +* FIXME: Need to also free data allocated by +* setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs, +* but the problem is that only half of it refers to dynamically +* allocated data, while the other half points at static strings. */ discard_index(&o->orig_index); -- 2.17.0.295.g791b7256b2.dirty
Fetching tags overwrites existing tags
If have a repository with a tag "v1.0.0" and I add a remote repository which also has a tag "v1.0.0" tag is overwritten. Google found [1] from 2011 and option 3 is what I'd like to see. Has it been implemented and I just don't see it? [1]: https://groups.google.com/forum/#!topic/git-version-control/0l_rJFyTE60 Here is an example demonstrating what I see: $ echo abc > abc.txt $ git init . Initialized empty Git repository in /home/wink/prgs/git/investigate-fetch-tags/.git/ $ git add * $ git commit -m "Initial commit" [master (root-commit) 1116fdc] Initial commit 1 file changed, 1 insertion(+) create mode 100644 abc.txt $ git tag v1.0.0 $ git remote add gbenchmark g...@github.com:google/benchmark $ git log --graph --format="%h %s %d" * 1116fdc Initial commit (HEAD -> master, tag: v1.0.0) $ git fetch --tags gbenchmark warning: no common commits remote: Counting objects: 4400, done. remote: Compressing objects: 100% (15/15), done. remote: Total 4400 (delta 5), reused 5 (delta 3), pack-reused 4382 Receiving objects: 100% (4400/4400), 1.33 MiB | 2.81 MiB/s, done. Resolving deltas: 100% (2863/2863), done. >From github.com:google/benchmark * [new branch] clangtidy -> gbenchmark/clangtidy * [new branch] iter_report -> gbenchmark/iter_report * [new branch] master -> gbenchmark/master * [new branch] releasing -> gbenchmark/releasing * [new branch] reportercleanup -> gbenchmark/reportercleanup * [new branch] rmheaders -> gbenchmark/rmheaders * [new branch] v2 -> gbenchmark/v2 * [new tag] v0.0.9 -> v0.0.9 * [new tag] v0.1.0 -> v0.1.0 t [tag update] v1.0.0 -> v1.0.0 * [new tag] v1.1.0 -> v1.1.0 * [new tag] v1.2.0 -> v1.2.0 * [new tag] v1.3.0 -> v1.3.0 * [new tag] v1.4.0 -> v1.4.0 $ git log --graph --format="%h %s %d" * 1116fdc Initial commit (HEAD -> master) As you can see the tag on 1116fdc is gone, v1.0.0 tag has been updated and now its pointing to the tag in gbenchmark: $ git log -5 --graph --format="%h %s %d" v1.0.0 * cd525ae Merge pull request #171 from eliben/update-doc-userealtime (tag: v1.0.0) |\ | * c7ab1b9 Update README to mention UseRealTime for wallclock time measurements. |/ * f662e8b Rename OS_MACOSX macro to new name BENCHMARK_OS_MACOSX. Fix #169 * 0a1f484 Merge pull request #166 from disconnect3d/master |\ | * d2917bc Fixes #165: CustomArguments ret type in README |/ Ideally I would have liked the tags fetched from gbenchmark to have a prefix of gbenchmark/, like the branches have, maybe something like: $ git fetch --tags gbenchmark ... * [new branch] v2 -> gbenchmark/v2 * [new tag] v0.0.9 -> gbenchmark/v0.0.9 * [new tag] v0.1.0 -> gbenchmark/v0.1.0 * [new tag] v1.0.0 -> gbenchmark/v1.0.0 * [new tag] v1.1.0 -> gbenchmark/v1.1.0 * [new tag] v1.2.0 -> gbenchmark/v1.2.0 * [new tag] v1.3.0 -> gbenchmark/v1.3.0 * [new tag] v1.4.0 -> gbenchmark/v1.4.0 -- Wink
Assalamu`Alaikum.
Dear Sir/Madam. Assalamu`Alaikum. I am Dr mohammad ouattara, I have ($14.6 Million us dollars) to transfer into your account, I will send you more details about this deal and the procedures to follow when I receive a positive response from you, Have a great day, Dr mohammad ouattara.
Re: [PATCH v8 06/16] sequencer: introduce the `merge` command
From: "Johannes Schindelin" On Mon, 23 Apr 2018, Philip Oakley wrote: From: "Johannes Schindelin" : Monday, April 23, 2018 1:03 PM Subject: Re: [PATCH v8 06/16] sequencer: introduce the `merge` command [...] > > > > label onto > > > > > > # Branch abc > > > reset onto > > > > Is this reset strictly necessary. We are already there @head. > > No, this is not strictly necessary, but I've realised my misunderstanding. I was thinking this (and others) was equivalent to $ git reset # maybe even --hard, i.e. affecting the worktree Oh, but it *is* affecting the worktree. In this case, since we label HEAD and then immediately reset to the label, there is just nothing to change. Consider this example, though: label onto # Branch: from-philip reset onto pick abcdef something label from-philip # Branch: with-love reset onto pick 012345 else label with-love reset onto merge -C 98765 from-philip merge -C 43210 with-love Only in the first instance is the `reset onto` a no-op, an incidental one. After picking `something` and labeling the result as `from-philip`, though, the next `reset onto` really resets the worktree. rather that just being a movement of the Head rev (though I may be having brain fade here regarding untracked files etc..) The current way of doing things does not allow the `reset` to overwrite untracked, nor ignored files (I think, I only verified the former, not the latter). But yeah, it is not just a movement of HEAD. It does reset the worktree, although quite a bit more gently (and safely) than `git reset --hard`. In that respect, this patch series is a drastic improvement over the Git garden shears (which is the shell script I use in Git for Windows which inspired this here patch series). thanks for clarifying. Yes my reasoning was a total brain fade ... Along with the fact that it's a soft/safe/gentle reset. -- Philip
Re: [PATCH v3 09/11] technical/shallow: describe the relationship with replace refs
Hi Philip, On Sun, 22 Apr 2018, Philip Oakley wrote: > From: "Johannes Schindelin" > > Now that grafts are deprecated, we should start to assume that readers > > have no idea what grafts are. So it makes more sense to describe the > > "shallow" feature in terms of replace refs. > > > > Suggested-by: Eric Sunshine > > Signed-off-by: Johannes Schindelin > > --- > > Documentation/technical/shallow.txt | 19 +++ > > 1 file changed, 11 insertions(+), 8 deletions(-) > > > > diff --git a/Documentation/technical/shallow.txt > > b/Documentation/technical/shallow.txt > > index 5183b154229..b3ff23c25f6 100644 > > --- a/Documentation/technical/shallow.txt > > +++ b/Documentation/technical/shallow.txt > > @@ -9,14 +9,17 @@ these commits have no parents. > > * > > > > The basic idea is to write the SHA-1s of shallow commits into > > -$GIT_DIR/shallow, and handle its contents like the contents > > -of $GIT_DIR/info/grafts (with the difference that shallow > > -cannot contain parent information). > > - > > -This information is stored in a new file instead of grafts, or > > -even the config, since the user should not touch that file > > -at all (even throughout development of the shallow clone, it > > -was never manually edited!). > > +$GIT_DIR/shallow, and handle its contents similar to replace > > +refs (with the difference that shallow does not actually > > +create those replace refs) and > > If grafts are deprecated, why not alse get rid of this mention and simply > leave the 'what it does' part. Internally, shallow commits are implemented using the graft code path, and they always will be: we will always need a list of the shallow commits, and we will always need to be able to lift the "shallow" attribute quickly, when deepening a shallow clone. So it makes sense to mention that here, because we are deep in technical details in Documentation/technical/. > > very much like the deprecated > > +graft file (with > > > the difference that shallow commits will > > +always have their parents grafted away, not replaced by > s/their parents grafted away/no parents/ (rather than being replaced..) But the commits will typically have parents. So they really will have their parents grafted away as long as they are marked "shallow"... Thank you for reviewing! Dscho
Re: [PATCH v4 05/11] replace: introduce --convert-graft-file
Hi Eric, On Sat, 21 Apr 2018, Eric Sunshine wrote: > On Sat, Apr 21, 2018 at 5:48 AM, Johannes Schindelin > wrote: > > This option is intended to help with the transition away from the > > now-deprecated graft file. > > > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/builtin/replace.c b/builtin/replace.c > > @@ -454,6 +455,38 @@ static int create_graft(int argc, const char **argv, > > int force) > > +static int convert_graft_file(int force) > > +{ > > + const char *graft_file = get_graft_file(); > > + FILE *fp = fopen_or_warn(graft_file, "r"); > > + struct strbuf buf = STRBUF_INIT, err = STRBUF_INIT; > > + struct argv_array args = ARGV_ARRAY_INIT; > > + > > + if (!fp) > > + return -1; > > + > > + while (strbuf_getline(&buf, fp) != EOF) { > > + if (*buf.buf == '#') > > + continue; > > + > > + argv_array_split(&args, buf.buf); > > + if (args.argc && create_graft(args.argc, args.argv, force)) > > + strbuf_addf(&err, "\n\t%s", buf.buf); > > + argv_array_clear(&args); > > + } > > + > > + strbuf_release(&buf); > > + argv_array_clear(&args); > > This argv_array_clear() is redundant, isn't it? It sure is! Thank you for the review, Dscho
Re: [PATCH v4 04/11] replace: "libify" create_graft() and callees
Hi Johannes, On Tue, Apr 24, 2018 at 11:51 AM, Johannes Schindelin wrote: > > Oy vey. How many more mistakes can I introduce in one commit... > I ask this myself all the time, but Software is hard when not having computer assisted checks. The test suite doesn't quite count here, as it doesn't yell loudly enough for leaks in corner cases. Thanks for taking these seriously, I was unsure if the first issues (close() clobbering the errno) were sever enough to bother. It complicates the code, but the effect is theoretical) (for EBADF) or a real niche corner case (EINTR). Speaking of that, I wonder if we eventually want to have a wrapper int xclose(int fd) { int err = errno; int ret = close(fd) if (errno == EINTR) /* on linux we don't care about this, other OSes? */ ; errno = err; return ret; } Though not in this series. Thanks, Stefan
Re: [PATCH v2 1/2] merge: Add merge.renames config setting
Sorry, I noticed something else I missed on my last reading... On Tue, Apr 24, 2018 at 10:11 AM, Ben Peart wrote: > diff --git a/builtin/merge.c b/builtin/merge.c > index 8746c5e3e8..3be52cd316 100644 > --- a/builtin/merge.c > +++ b/builtin/merge.c > @@ -424,6 +424,7 @@ static void finish(struct commit *head_commit, > opts.output_format |= > DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; > opts.detect_rename = DIFF_DETECT_RENAME; > + git_config_get_bool("merge.renames", &opts.detect_rename); > diff_setup_done(&opts); > diff_tree_oid(head, new_head, "", &opts); > diffcore_std(&opts); Shouldn't this also be turned off if either (a) merge.renames is unset and diff.renames is false, or (b) the user specifies -Xno-renames?
Re: [PATCH v3 5/9] ref-filter: use generation number for --contains
Derrick Stolee writes: > On 4/18/2018 5:02 PM, Jakub Narebski wrote: >> Derrick Stolee writes: >> >>> A commit A can reach a commit B only if the generation number of A >>> is larger than the generation number of B. This condition allows >>> significantly short-circuiting commit-graph walks. >>> >>> Use generation number for 'git tag --contains' queries. >>> >>> On a copy of the Linux repository where HEAD is containd in v4.13 >>> but no earlier tag, the command 'git tag --contains HEAD' had the >>> following peformance improvement: >>> >>> Before: 0.81s >>> After: 0.04s >>> Rel %: -95% >> >> A question: what is the performance after if the "commit-graph" feature >> is disabled, or there is no commit-graph file? Is there performance >> regression in this case, or is the difference negligible? > > Negligible, since we are adding a small number of integer comparisons > and the main cost is in commit parsing. More on commit parsing in > response to your comments below. If it is proven to be always negligible, then its all right. If it is unlikely to be non-negligible, well, still O.K. But I wonder if maybe there is some situation where the cost of extra parsing is non-negligble. [...] >>> @@ -1618,8 +1623,18 @@ static enum contains_result >>> contains_tag_algo(struct commit *candidate, >>> struct contains_cache *cache) >>> { >>> struct contains_stack contains_stack = { 0, 0, NULL }; >>> - enum contains_result result = contains_test(candidate, want, cache); >>> + enum contains_result result; >>> + uint32_t cutoff = GENERATION_NUMBER_INFINITY; >>> + const struct commit_list *p; >>> + >>> + for (p = want; p; p = p->next) { >>> + struct commit *c = p->item; >>> + parse_commit_or_die(c); >>> + if (c->generation < cutoff) >>> + cutoff = c->generation; >>> + } >> Sholdn't the above be made conditional on the ability to get generation >> numbers from the commit-graph file (feature is turned on and file >> exists)? Otherwise here after the change contains_tag_algo() now parses >> each commit in 'want', which I think was not done previously. >> >> With commit-graph file parsing is [probably] cheap. Without it, not >> necessary. >> >> But I might be worrying about nothing. > > Not nothing. This parses the "wants" when we previously did not parse > the wants. Further: this parsing happens before we do the simple check > of comparing the OID of the candidate against the wants. > > The question is: are these parsed commits significant compared to the > walk that will parse many more commits? It is certainly possible. > > One way to fix this is to call 'prepare_commit_graph()' directly and > then test that 'commit_graph' is non-null before performing any > parses. I'm not thrilled with how that couples the commit-graph > implementation to this feature, but that may be necessary to avoid > regressions in the non-commit-graph case. Another possible solution (not sure if better or worse) would be to change the signature of contains_tag_algo() function to take parameter or flag that would decide whether to parse "wants". Best, -- Jakub Narębski
Re: [PATCH 9/9] cache.h: allow sha1_object_info to handle arbitrary repositories
On Tue, 24 Apr 2018 11:42:33 -0700 Brandon Williams wrote: > On 04/24, Jonathan Tan wrote: > > On Mon, 23 Apr 2018 16:43:27 -0700 > > Stefan Beller wrote: > > > > > This involves also adapting sha1_object_info_extended and a some > > > internal functions that are used to implement these. It all has to > > > happen in one patch, because of a single recursive chain of calls visits > > > all these functions. > > > > In packfile.c, unpack_entry() invokes get_delta_base_cache_entry(), > > which references a global (delta_base_cache). Does delta_base_cache need > > to be moved to the repo object (or object store object) first, or is > > this safe? > > After looking at this, I think it should be safe for now since its a > cache that requires a packed_git pointer to access (and those would be > per repository). We may want to move it in to the repository at some > point though. Thanks for the pointer. I see that a packed_git pointer is part of the key to that hashmap, so this is indeed safe. (Probably worth discussing this in the commit message.) There is another global, delta_base_cached, but it just limits the total size of the hashmap, so it's safe to use that too.
Re: [PATCH v4 04/11] replace: "libify" create_graft() and callees
Hi Stefan, On Mon, 23 Apr 2018, Stefan Beller wrote: > On Fri, Apr 20, 2018 at 3:21 PM, Johannes Schindelin > wrote: > > @@ -250,27 +257,38 @@ static void import_object(struct object_id *oid, enum > > object_type type, > > - if (strbuf_read(&result, cmd.out, 41) < 0) > > - die_errno("unable to read from mktree"); > > + if (strbuf_read(&result, cmd.out, 41) < 0) { > > + close(fd); > > + close(cmd.out); > > + return error_errno("unable to read from mktree"); > > So before the errno is coming directly from strbuf_read, > which will set errno on error to the desired errno. > (It will come from an underlying read()) Yes, you are right! > However close() may fail and clobber errno, > so I would think we'd need to > > if (strbuf_read(&result, cmd.out, 41) < 0) { > int err = errno; /* close shall not clobber errno */ > close(fd); > close(cmd.out); > errno = err; > return error_errno(...); > } I went for the easier route: call error_errno() before close(fd), and then return -1 after close(cmd.out). Since error_errno() always returns -1, the result is pretty much the same (I do not think that we want the caller of import_object() to rely on the errno). > > - if (fstat(fd, &st) < 0) > > - die_errno("unable to fstat %s", filename); > > + if (fstat(fd, &st) < 0) { > > + close(fd); > > + return error_errno("unable to fstat %s", filename); > > + } > > Same here? Yep. > An alternative would be to do > ret = error_errno(...) > close (..) > return ret; I even saved one variable ;-) > > @@ -288,19 +307,23 @@ static int edit_and_replace(const char *object_ref, > > int force, int raw) > > struct strbuf ref = STRBUF_INIT; > > > > if (get_oid(object_ref, &old_oid) < 0) > > - die("Not a valid object name: '%s'", object_ref); > > + return error("Not a valid object name: '%s'", object_ref); > > > > type = oid_object_info(&old_oid, NULL); > > if (type < 0) > > - die("unable to get object type for %s", > > oid_to_hex(&old_oid)); > > + return error("unable to get object type for %s", > > +oid_to_hex(&old_oid)); > > > > - check_ref_valid(&old_oid, &prev, &ref, force); > > + if (check_ref_valid(&old_oid, &prev, &ref, force)) > > + return -1; > > strbuf_release(&ref); > > > > - export_object(&old_oid, type, raw, tmpfile); > > + if (export_object(&old_oid, type, raw, tmpfile)) > > + return -1; > > if (launch_editor(tmpfile, NULL, NULL) < 0) > > - die("editing object file failed"); > > - import_object(&new_oid, type, raw, tmpfile); > > + return error("editing object file failed"); > > + if (import_object(&new_oid, type, raw, tmpfile)) > > + return -1; > > > > free(tmpfile); > > Do we need to free tmpfile in previous returns? Oy vey. How many more mistakes can I introduce in one commit... > > @@ -394,24 +422,29 @@ static int create_graft(int argc, const char **argv, > > int force) > > unsigned long size; > > > > if (get_oid(old_ref, &old_oid) < 0) > > - die(_("Not a valid object name: '%s'"), old_ref); > > - commit = lookup_commit_or_die(&old_oid, old_ref); > > + return error(_("Not a valid object name: '%s'"), old_ref); > > + commit = lookup_commit_reference(&old_oid); > > + if (!commit) > > + return error(_("could not parse %s"), old_ref); > > > > buffer = get_commit_buffer(commit, &size); > > strbuf_add(&buf, buffer, size); > > unuse_commit_buffer(commit, buffer); > > > > - replace_parents(&buf, argc - 1, &argv[1]); > > + if (replace_parents(&buf, argc - 1, &argv[1]) < 0) > > + return -1; > > > > if (remove_signature(&buf)) { > > warning(_("the original commit '%s' has a gpg signature."), > > old_ref); > > warning(_("the signature will be removed in the replacement > > commit!")); > > } > > > > - check_mergetags(commit, argc, argv); > > + if (check_mergetags(commit, argc, argv)) > > + return -1; > > > > if (write_object_file(buf.buf, buf.len, commit_type, &new_oid)) > > - die(_("could not write replacement commit for: '%s'"), > > old_ref); > > + return error(_("could not write replacement commit for: > > '%s'"), > > +old_ref); > > > > strbuf_release(&buf); > > Release &buf in the other return cases, too? Absolutely. Thank you for helping me improve these patches, Dscho
Re: [PATCH 9/9] cache.h: allow sha1_object_info to handle arbitrary repositories
On 04/24, Jonathan Tan wrote: > On Mon, 23 Apr 2018 16:43:27 -0700 > Stefan Beller wrote: > > > This involves also adapting sha1_object_info_extended and a some > > internal functions that are used to implement these. It all has to > > happen in one patch, because of a single recursive chain of calls visits > > all these functions. > > In packfile.c, unpack_entry() invokes get_delta_base_cache_entry(), > which references a global (delta_base_cache). Does delta_base_cache need > to be moved to the repo object (or object store object) first, or is > this safe? After looking at this, I think it should be safe for now since its a cache that requires a packed_git pointer to access (and those would be per repository). We may want to move it in to the repository at some point though. > > Also, in sha1_file.c, oid_object_info_extended() invokes fetch_object(), > which attempts to fetch a missing object. For this, I think that it's > best to guard with a "r == the_repository" check, or if there's a better > way to distinguish between the "default" repository and any repository > that we newly create (I vaguely remember some distinction when parsing > environment variables when determining repo paths - the envvars were > only used for the "default" repository, but not for the others). This is a little more difficult and I'm not sure I know what the best course of action would be for this. Mostly because then this puts a big recursive dependency on the whole fetch mechanism to handle arbitrary repositories at the same time these functions are converted. So maybe throwing in the runtime check would be the best way to break the dependencies for now. -- Brandon Williams
Re: [PATCH v3 7/7] contrib/git-jump/git-jump: jump to match column in addition to line
On Tue, Apr 24, 2018 at 01:37:36AM -0400, Eric Sunshine wrote: > On Tue, Apr 24, 2018 at 1:07 AM, Taylor Blau wrote: > > Take advantage of 'git-grep(1)''s new option, '--column-number' in order > > to teach Peff's 'git-jump' script how to jump to the correct column for > > any given match. > > > > 'git-grep(1)''s output is in the correct format for Vim's jump list, so > > no additional cleanup is necessary. > > > > Signed-off-by: Taylor Blau > > --- > > contrib/git-jump/git-jump | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Based upon Ævar review[1], I was expecting to see git-jump/README > modified by this patch, as well. Perhaps you overlooked or forgot > about that review comment, or perhaps you disagreed with it? Yes, and thank you for pointing that out. I recall reading his mail and thought that when I prepared v3 that I had already included his changes, but I had in fact not done so. I amended the git-jump's README to prepare for v4, but was somewhat confused by Ævar's comment when I reread [1]. I believe he was suggesting updating the example to remove a reference to ag(1)'s '--column' when configuring jump.grepCmd to 'ag --column'. Since git-{grep,jump} support this now by default, I changed that line to simply 'ag', instead of 'ag --column', as such: diff --git a/contrib/git-jump/README b/contrib/git-jump/README index 4484bda410..7630e16854 100644 --- a/contrib/git-jump/README +++ b/contrib/git-jump/README @@ -37,3 +37,3 @@ Git-jump can generate four types of interesting lists: - 3. Any grep matches. + 3. Any grep matches, including the column of the first match on a line. @@ -67,3 +67,3 @@ git jump grep -i foo_bar # use the silver searcher for git jump grep -git config jump.grepCmd "ag --column" +git config jump.grepCmd "ag" -- @@ -84,3 +84,3 @@ leaving you to locate subsequent hits in that file or other files using the editor or pager. By contrast, git-jump provides the editor with a -complete list of files and line numbers for each match. +complete list of files, lines, and a column number for each match. --- Does this look OK? Thanks, Taylor
Re: [PATCH v2 2/6] grep.c: take column number as argument to show_line()
On Tue, Apr 24, 2018 at 03:13:55PM +0900, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > > > I think when we add features to git-grep we should be as close to GNU > > grep as possible (e.g. not add this -m alias meaning something different > > as in your v1), but if GNU grep doesn't have something go with the trend > > of other grep tools, as noted at > > https://beyondgrep.com/feature-comparison/ (and I found another one that > > has this: https://github.com/beyondgrep/website/pull/83), so there's > > already 3 prominent grep tools that call this just --column. > > > > I think we should just go with that. > > OK. If they called it --column-number, that might have been more in > line with GNU grep's --line-number, but that is not something we can > dictate retroactively anyway, so --column to match them would be > better than trying to be consistent and ending up with being > different from everybody else. That sounds sensible. Let's call the new option '--column', and the configuration options grep.column and color.grep.column to match (instead of s/column/columnNumber/g), yes? Thanks, Taylor
Re: [PATCH 9/9] cache.h: allow sha1_object_info to handle arbitrary repositories
On Mon, 23 Apr 2018 16:43:27 -0700 Stefan Beller wrote: > This involves also adapting sha1_object_info_extended and a some > internal functions that are used to implement these. It all has to > happen in one patch, because of a single recursive chain of calls visits > all these functions. In packfile.c, unpack_entry() invokes get_delta_base_cache_entry(), which references a global (delta_base_cache). Does delta_base_cache need to be moved to the repo object (or object store object) first, or is this safe? Also, in sha1_file.c, oid_object_info_extended() invokes fetch_object(), which attempts to fetch a missing object. For this, I think that it's best to guard with a "r == the_repository" check, or if there's a better way to distinguish between the "default" repository and any repository that we newly create (I vaguely remember some distinction when parsing environment variables when determining repo paths - the envvars were only used for the "default" repository, but not for the others).
Re: [PATCH 5/9] packfile: add repository argument to packed_object_info
On Mon, 23 Apr 2018 16:43:23 -0700 Stefan Beller wrote: > diff --git a/sha1_file.c b/sha1_file.c > index 93f25c6c6a..b292e04fd3 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1307,7 +1307,8 @@ int oid_object_info_extended_the_repository(const > struct object_id *oid, struct >* information below, so return early. >*/ > return 0; > - rtype = packed_object_info(e.p, e.offset, oi); > + > + rtype = packed_object_info(the_repository, e.p, e.offset, oi); Extra blank line introduced.
Re: [PATCH 2/9] cache.h: add repository argument to oid_object_info
On Mon, 23 Apr 2018 16:43:20 -0700 Stefan Beller wrote: > Add a repository argument to allow the callers of oid_object_info > to be more specific about which repository to handle. This is a small > mechanical change; it doesn't change the implementation to handle > repositories other than the_repository yet. >From here... > In the expanded macro the identifier `the_repository` is not actually used, > so the compiler does not catch if the repository.h header is not included > at the call site. call sites needing that #include were identified by > changing the macro to definition to > > #define sha1_object_info(r, sha1, size) \ > (r, sha1_object_info_##r(sha1, size)). > > This produces a compiler warning about the left hand side of the comma > operator being unused, which can be suppressed using -Wno-unused-value. > > To avoid breaking bisection, do not include this trick in the patch. ...until here: I don't think this explanation is necessary - this macro trick is temporary anyway in all our patch sets. Also, wouldn't repository.h be needed anyway if we reference "struct repository"? I would replace this section with the "As with the previous commits" explanation you have in PATCH 3/9 and others.
Re: [PATCH v2 1/2] merge: Add merge.renames config setting
On Tue, Apr 24, 2018 at 10:11 AM, Ben Peart wrote: > Add the ability to control rename detection for merge via a config setting. Sweet, thanks for including the documentation updates. I lean towards the side of the argument that says that since merge.renameLimit inherits from diff.renameLimit, merge.renames should inherit default value from diff.renames (allow people to not have to repeat themselves as much if they want to use the same rename settings for all cases). Sounds like you and Johannes disagree. I don't feel super strongly about this item, but it'd probably be good to get some other git folks' opinions on this particular point. Other than that unresolved question, and the separate one about whether to go with a different option instead (e.g. merge.defaultStrategy), as being discussed elsewhere in this thread, the patch looks good to me.
Re: [PATCH 0/3] Colorful blame output
On Mon, Apr 23, 2018 at 7:09 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> This is a revamp of >> https://public-inbox.org/git/20180416220955.46163-1-sbel...@google.com/ >> >> Junio had some issue with that version, as it would collide config and >> command >> line options. This is fixed now by introducing another option. > > Heh, that sounds as if the series was rerolled unnecessarily, only > because I had unreasonable complaints. Sorry for that wording, the complaints were of educational nature, so I rerolled it. > I was hoping that issues the > series had were fixed, and my involvement was merely that I happened > to be the one who pointed out first. There were multiple issues (many fixed already), but there was one outstanding design bug, so I had to reroll. > > In any case, will replace and take a look at it. Thanks. Thanks for looking at it, Stefan
Re: [PATCH v1 1/2] merge: Add merge.renames config setting
Hi Dscho, On Tue, Apr 24, 2018 at 4:58 AM, Johannes Schindelin wrote: > Hi Junio, > > On Tue, 24 Apr 2018, Junio C Hamano wrote: > >> Yeah, but as opposed to passing "oh, let's see if we can get a >> reasonable result without rename detection just this time" from the >> command line, configuring merge.renames=false in would mean exactly >> that: "we don't need rename detection, just want to skip the cycles >> spent for it". That is why I wondered how well the resolve strategy >> would have fit your needs. > > Please do not forget that the context is GVFS, where you would cause a lot > of pain and suffering by letting users forget to specify that command-line > option all the time, resulting in several gigabytes of objects having to > be downloaded just for the sake of rename detection. > > So there is a pretty good point in doing this as a config option. I agree you need a config option, but I think Junio has a good point that it's worth at least checking out the possibility of a different one. In particular, you could add a merge.defaultStrategy (or maybe merge.twohead to be similar to pull.twohead??) that is set to 'resolve', and use that to avoid rename detection. Perhaps performance considerations rule out the resolve strategy and favor recursive, or maybe you need the 'recursive' part of the recursive strategy (rather than the rename part), or perhaps there's some other special reason you need to go this route, but since you are avoiding renames right now it's at least worth considering the resolve strategy. Elijah
Re: [PATCH v1 2/2] merge: Add merge.aggressive config setting
Hi Ben, On Tue, Apr 24, 2018 at 9:45 AM, Ben Peart wrote: > On 4/20/2018 1:22 PM, Elijah Newren wrote: >> On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart >> wrote: >>> >>> Add the ability to control the aggressive flag passed to read-tree via a >>> config setting. >> >> This feels like a workaround to the performance problems with index >> updates in merge-recursive.c. > > This change wasn't done to solve performance problems. We turned it on > because it reduced the number of unmerged entries (from 40K to 1) in the > particular merge we were looking at. The additional 3 scenarios that > --aggressive resolves made that much difference. > > That said, it makes sense to me to do Um...color me perplexed here. aggressive exists just to do some resolutions that higher-level strategies can and totally ought to be able to handle easily (the rules are almost trivially straight-forward), but deferring allows the higher level strategies (either merge-recursive or resolve's git-merge-one-file) to handle slightly differently (e.g. by detecting renames). merge-recursive should be able to resolve anything that the unpack_trees aggressive setting handles. If it can't, it sounds like there's a horrible bug somewhere. Perhaps fixing that bug is the real problem? Is there any chance you can dig out more details about any of these conflicts or come up with a simple testcase where running 'git merge -X no-renames' gives a merge conflict but running with this option would run to completion? >> this when rename detection is turned off. In fact, I think you'd >> automatically want to set aggressive to true whenever rename detection >> is turned off (whether by your merge.renames option or the >> -Xno-renames flag). >> > I can't think of any reason this setting would be useful separate from >> turning rename detection off, and it'd actively harm rename detection >> performance improvements I have in the pipeline. I'd really prefer to >> not add this option, and instead combine the setting of aggressive >> with the other flag. Do you have an independent reason for wanting >> this? >> > > While combining them would work for our specific use scenario (since we turn > both on already along with turning off merge.stat), I really hesitate to tie > these two different flags and code paths together with a single config > setting. > > While I don't want to needlessly complicate your optimizations in this area > (they are already complex enough!) I believe we need to keep the option to > turn on --aggressive without turning off rename detection as a viable > option. Perhaps if that is the case, your optimizations have less impact or > don't apply but the user should be able to make that choice for their > specific situation. I totally buy that you need at least one option to avoid waiting for (current) rename detection in some fashion, and that you don't want lots of spurious conflicts. But I don't understand why you believe that we need to keep the option to turn on the aggressive flag independently. What's the usecase? It wasn't possible before in the code, no one else has asked for it, and even you say you don't need it as a separate option. Is it a concern that turning on aggressive whenever rename-detection is turned off will break something? The only reason I can see to keep the aggressive codepath in unpack_trees behind a branch instead of it always running unconditionally for every single caller throughout the codebase is because of renames. So the fact that you're turning renames off, to me, suggests that aggressive flag should automatically be turned on. I'd even call pre-existing code (e.g. the -X no-renames option in merge-recursive) that doesn't turn on the aggressive flag buggy (even if the only result is suboptimal-performance). I don't see how an option to turn on the aggressive flag independently is possibly useful to anyone. Further, we have strong reason to believe it will soon be actively harmful. So...why? It's totally possible I'm just missing something. If there's a good reason for it, providing some kind of benefit that the user could weigh in a tradeoff, then I can get on board with providing it as an option, but right now I just don't see it.
[PATCH v2 1/2] merge: Add merge.renames config setting
Add the ability to control rename detection for merge via a config setting. Reviewed-by: Johannes Schindelin Signed-off-by: Ben Peart --- Documentation/diff-config.txt | 3 ++- Documentation/merge-config.txt | 8 +++- Documentation/merge-strategies.txt | 6 -- builtin/merge.c| 1 + merge-recursive.c | 1 + 5 files changed, 15 insertions(+), 4 deletions(-) diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt index 5ca942ab5e..77caa66c2f 100644 --- a/Documentation/diff-config.txt +++ b/Documentation/diff-config.txt @@ -112,7 +112,8 @@ diff.orderFile:: diff.renameLimit:: The number of files to consider when performing the copy/rename - detection; equivalent to the 'git diff' option `-l`. + detection; equivalent to the 'git diff' option `-l`. This setting + has no effect if rename detection is turned off. diff.renames:: Whether and how Git detects renames. If set to "false", diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt index 12b6bbf591..0540c44e23 100644 --- a/Documentation/merge-config.txt +++ b/Documentation/merge-config.txt @@ -35,7 +35,13 @@ include::fmt-merge-msg-config.txt[] merge.renameLimit:: The number of files to consider when performing rename detection during a merge; if not specified, defaults to the value of - diff.renameLimit. + diff.renameLimit. This setting has no effect if rename detection + is turned off. + +merge.renames:: + Whether and how Git detects renames. If set to "false", + rename detection is disabled. If set to "true", basic rename + detection is enabled. This is the default. merge.renormalize:: Tell Git that canonical representation of files in the diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt index 4a58aad4b8..1e0728aa12 100644 --- a/Documentation/merge-strategies.txt +++ b/Documentation/merge-strategies.txt @@ -84,12 +84,14 @@ no-renormalize;; `merge.renormalize` configuration variable. no-renames;; - Turn off rename detection. + Turn off rename detection. This overrides the `merge.renames` + configuration variable. See also linkgit:git-diff[1] `--no-renames`. find-renames[=];; Turn on rename detection, optionally setting the similarity - threshold. This is the default. + threshold. This is the default. This overrides the + 'merge.renames' configuration variable. See also linkgit:git-diff[1] `--find-renames`. rename-threshold=;; diff --git a/builtin/merge.c b/builtin/merge.c index 8746c5e3e8..3be52cd316 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -424,6 +424,7 @@ static void finish(struct commit *head_commit, opts.output_format |= DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; opts.detect_rename = DIFF_DETECT_RENAME; + git_config_get_bool("merge.renames", &opts.detect_rename); diff_setup_done(&opts); diff_tree_oid(head, new_head, "", &opts); diffcore_std(&opts); diff --git a/merge-recursive.c b/merge-recursive.c index 9c05eb7f70..cd5367e890 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -3256,6 +3256,7 @@ static void merge_recursive_config(struct merge_options *o) git_config_get_int("merge.verbosity", &o->verbosity); git_config_get_int("diff.renamelimit", &o->diff_rename_limit); git_config_get_int("merge.renamelimit", &o->merge_rename_limit); + git_config_get_bool("merge.renames", &o->detect_rename); git_config(git_xmerge_config, NULL); } -- 2.17.0.windows.1
[PATCH v2 2/2] merge: Add merge.aggressive config setting
Add the ability to control the aggressive flag passed to read-tree via a config setting. Reviewed-by: Johannes Schindelin Signed-off-by: Ben Peart --- Documentation/merge-config.txt | 4 merge-recursive.c | 1 + 2 files changed, 5 insertions(+) diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt index 0540c44e23..38492bcb98 100644 --- a/Documentation/merge-config.txt +++ b/Documentation/merge-config.txt @@ -1,3 +1,7 @@ +merge.aggressive:: + Passes "aggressive" to read-tree which makes the command resolve + a few more cases internally. See "--aggressive" in linkgit:git-read-tree[1]. + merge.conflictStyle:: Specify the style in which conflicted hunks are written out to working tree files upon merge. The default is "merge", which diff --git a/merge-recursive.c b/merge-recursive.c index cd5367e890..0ca84e4b82 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -355,6 +355,7 @@ static int git_merge_trees(struct merge_options *o, o->unpack_opts.fn = threeway_merge; o->unpack_opts.src_index = &the_index; o->unpack_opts.dst_index = &the_index; + git_config_get_bool("merge.aggressive", (int *)&o->unpack_opts.aggressive); setup_unpack_trees_porcelain(&o->unpack_opts, "merge"); init_tree_desc_from_tree(t+0, common); -- 2.17.0.windows.1
[PATCH v2 0/2] add additional config settings for merge
Updated in response to feedback. Mostly documentation changes but the diffstat at the end of the merge (if on) now honors the new merge.rename setting as well. Base Ref: master Web-Diff: https://github.com/benpeart/git/commit/653bfe6e01 Checkout: git fetch https://github.com/benpeart/git merge-options-v2 && git checkout 653bfe6e01 ### Interdiff (v1..v2): diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt index 5ca942ab5e..77caa66c2f 100644 --- a/Documentation/diff-config.txt +++ b/Documentation/diff-config.txt @@ -112,7 +112,8 @@ diff.orderFile:: diff.renameLimit:: The number of files to consider when performing the copy/rename - detection; equivalent to the 'git diff' option `-l`. + detection; equivalent to the 'git diff' option `-l`. This setting + has no effect if rename detection is turned off. diff.renames:: Whether and how Git detects renames. If set to "false", diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt index 5a9ab969db..38492bcb98 100644 --- a/Documentation/merge-config.txt +++ b/Documentation/merge-config.txt @@ -39,7 +39,8 @@ include::fmt-merge-msg-config.txt[] merge.renameLimit:: The number of files to consider when performing rename detection during a merge; if not specified, defaults to the value of - diff.renameLimit. + diff.renameLimit. This setting has no effect if rename detection + is turned off. merge.renames:: Whether and how Git detects renames. If set to "false", diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt index 4a58aad4b8..1e0728aa12 100644 --- a/Documentation/merge-strategies.txt +++ b/Documentation/merge-strategies.txt @@ -84,12 +84,14 @@ no-renormalize;; `merge.renormalize` configuration variable. no-renames;; - Turn off rename detection. + Turn off rename detection. This overrides the `merge.renames` + configuration variable. See also linkgit:git-diff[1] `--no-renames`. find-renames[=];; Turn on rename detection, optionally setting the similarity - threshold. This is the default. + threshold. This is the default. This overrides the + 'merge.renames' configuration variable. See also linkgit:git-diff[1] `--find-renames`. rename-threshold=;; diff --git a/builtin/merge.c b/builtin/merge.c index 8746c5e3e8..3be52cd316 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -424,6 +424,7 @@ static void finish(struct commit *head_commit, opts.output_format |= DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; opts.detect_rename = DIFF_DETECT_RENAME; + git_config_get_bool("merge.renames", &opts.detect_rename); diff_setup_done(&opts); diff_tree_oid(head, new_head, "", &opts); diffcore_std(&opts); ### Patches Ben Peart (2): merge: Add merge.renames config setting merge: Add merge.aggressive config setting Documentation/diff-config.txt | 3 ++- Documentation/merge-config.txt | 12 +++- Documentation/merge-strategies.txt | 6 -- builtin/merge.c| 1 + merge-recursive.c | 2 ++ 5 files changed, 20 insertions(+), 4 deletions(-) base-commit: 0b0cc9f86731f894cff8dd25299a9b38c254569e -- 2.17.0.windows.1
[PATCH] git: add -N as a short option for --no-pager
In modern setups, less, the pager, uses alternate screen to show the content. When it is closed, it switches back to the original screen, and all content is gone. It is not uncommon to request that the output remains visible in the terminal. For this, the option --no-pager can be used. But it is a bit cumbersome to type, even when command completion is available. Provide a short option, -N, to make the option easier accessible. Signed-off-by: Johannes Sixt --- Documentation/git.txt | 3 ++- git.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 4767860e72..17b50b0dc6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -11,7 +11,7 @@ SYNOPSIS [verse] 'git' [--version] [--help] [-C ] [-c =] [--exec-path[=]] [--html-path] [--man-path] [--info-path] -[-p|--paginate|--no-pager] [--no-replace-objects] [--bare] +[-p|--paginate|-N|--no-pager] [--no-replace-objects] [--bare] [--git-dir=] [--work-tree=] [--namespace=] [--super-prefix=] [] @@ -103,6 +103,7 @@ foo.bar= ...`) sets `foo.bar` to the empty string which `git config configuration options (see the "Configuration Mechanism" section below). +-N:: --no-pager:: Do not pipe Git output into a pager. diff --git a/git.c b/git.c index ceaa58ef40..9e2d78f442 100644 --- a/git.c +++ b/git.c @@ -7,7 +7,7 @@ const char git_usage_string[] = N_("git [--version] [--help] [-C ] [-c =]\n" " [--exec-path[=]] [--html-path] [--man-path] [--info-path]\n" - " [-p | --paginate | --no-pager] [--no-replace-objects] [--bare]\n" + " [-p | --paginate | -N | --no-pager] [--no-replace-objects] [--bare]\n" " [--git-dir=] [--work-tree=] [--namespace=]\n" "[]"); @@ -81,7 +81,7 @@ static int handle_options(const char ***argv, int *argc, int *envchanged) exit(0); } else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) { use_pager = 1; - } else if (!strcmp(cmd, "--no-pager")) { + } else if (!strcmp(cmd, "-N") || !strcmp(cmd, "--no-pager")) { use_pager = 0; if (envchanged) *envchanged = 1; -- 2.17.0.69.g0c1d01d9b6
Re: [PATCH v1 1/2] merge: Add merge.renames config setting
On 4/23/2018 5:32 PM, Eckhard Maaß wrote: On Mon, Apr 23, 2018 at 09:15:09AM -0400, Ben Peart wrote: In commit 2a2ac926547 when merge.renamelimit was added, it was decided to have separate settings for merge and diff to give users the ability to control that behavior. In this particular case, it will default to the value of diff.renamelimit when it isn't set. That isn't consistent with the other merge settings. However, it seems like a desirable way to do it. I'm just one opinion among many but I personally believe the cascading settings are complicated enough just with the various config files and command line options and which overwrite the others. I'd rather not complicate them further by having settings inherited from one feature (diff) to another (merge). There are currently ~15 merge specific config settings and only merge.renamelimit currently does this inheritance. That said, at least one other person thought it was a good idea. :) Maybe let me throw in some code for discussion (test and documentation is missing, mainly to form an idea what the change in options should be). I admit the patch below is concerned only with diff.renames, but whatever we come up with for merge should be reflected there, too, doesn't it > Greetings, Eckhard -- >8 -- From e8a88111f2aaf338a4c19e83251c7178f7152129 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Eckhard=20S=2E=20Maa=C3=9F?= Date: Sun, 22 Apr 2018 23:29:08 +0200 Subject: [PATCH] diff: enhance diff.renames to be able to set rename score MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Eckhard S. Maaß --- diff.c | 35 --- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index 1289df4b1f..a3cedad5cf 100644 --- a/diff.c +++ b/diff.c @@ -30,6 +30,7 @@ #endif static int diff_detect_rename_default; +static int diff_rename_score_default; static int diff_indent_heuristic = 1; static int diff_rename_limit_default = 400; static int diff_suppress_blank_empty; @@ -177,13 +178,33 @@ static int parse_submodule_params(struct diff_options *options, const char *valu return 0; } +int parse_rename_score(const char **cp_p); + +static int git_config_rename_score(const char *value) +{ + int parsed_rename_score = parse_rename_score(&value); + if (parsed_rename_score == -1) + return error("invalid argument to diff.renamescore: %s", value); + diff_rename_score_default = parsed_rename_score; + return 0; +} + static int git_config_rename(const char *var, const char *value) { - if (!value) - return DIFF_DETECT_RENAME; - if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy")) - return DIFF_DETECT_COPY; - return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0; + if (!value) { + diff_detect_rename_default = DIFF_DETECT_RENAME; + return 0; + } + if (skip_to_optional_arg(value, "copies", &value) || skip_to_optional_arg(value, "copy", &value)) { + diff_detect_rename_default = DIFF_DETECT_COPY; + return git_config_rename_score(value); + } + if (skip_to_optional_arg(value, "renames", &value) || skip_to_optional_arg(value, "rename", &value)) { + diff_detect_rename_default = DIFF_DETECT_RENAME; + return git_config_rename_score(value); + } + diff_detect_rename_default = git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0; + return 0; } long parse_algorithm_value(const char *value) @@ -307,8 +328,7 @@ int git_diff_ui_config(const char *var, const char *value, void *cb) return 0; } if (!strcmp(var, "diff.renames")) { - diff_detect_rename_default = git_config_rename(var, value); - return 0; + return git_config_rename(var, value); } if (!strcmp(var, "diff.autorefreshindex")) { diff_auto_refresh_index = git_config_bool(var, value); @@ -4116,6 +4136,7 @@ void diff_setup(struct diff_options *options) options->add_remove = diff_addremove; options->use_color = diff_use_color_default; options->detect_rename = diff_detect_rename_default; + options->rename_score = diff_rename_score_default; options->xdl_opts |= diff_algorithm; if (diff_indent_heuristic) DIFF_XDL_SET(options, INDENT_HEURISTIC);
Re: [PATCH v1 2/2] merge: Add merge.aggressive config setting
On 4/20/2018 1:22 PM, Elijah Newren wrote: On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart wrote: Add the ability to control the aggressive flag passed to read-tree via a config setting. This feels like a workaround to the performance problems with index updates in merge-recursive.c. This change wasn't done to solve performance problems. We turned it on because it reduced the number of unmerged entries (from 40K to 1) in the particular merge we were looking at. The additional 3 scenarios that --aggressive resolves made that much difference. That said, it makes sense to me to do this when rename detection is turned off. In fact, I think you'd automatically want to set aggressive to true whenever rename detection is turned off (whether by your merge.renames option or the -Xno-renames flag). > I can't think of any reason this setting would be useful separate from turning rename detection off, and it'd actively harm rename detection performance improvements I have in the pipeline. I'd really prefer to not add this option, and instead combine the setting of aggressive with the other flag. Do you have an independent reason for wanting this? While combining them would work for our specific use scenario (since we turn both on already along with turning off merge.stat), I really hesitate to tie these two different flags and code paths together with a single config setting. While I don't want to needlessly complicate your optimizations in this area (they are already complex enough!) I believe we need to keep the option to turn on --aggressive without turning off rename detection as a viable option. Perhaps if that is the case, your optimizations have less impact or don't apply but the user should be able to make that choice for their specific situation. Thanks, Elijah
Re: [PATCH 2/2] unpack_trees_options: free messages when done
On Mon, Apr 23, 2018 at 10:13 PM, Martin Ågren wrote: > The strings allocated in `setup_unpack_trees_porcelain()` are never > freed. Provide a function `clear_unpack_trees_porcelain()` to do so and > call it in the functions which use `setup_unpack_trees_porcelain()`. This is awesome; thanks. > diff --git a/merge-recursive.c b/merge-recursive.c > index 0c0d48624d..8229b91e2f 100644 > --- a/merge-recursive.c > +++ b/merge-recursive.c > @@ -301,6 +301,7 @@ static int git_merge_trees(int index_only, > init_tree_desc_from_tree(t+2, merge); > > rc = unpack_trees(3, t, &opts); > + clear_unpack_trees_porcelain(&opts); > cache_tree_free(&active_cache_tree); > return rc; Yeah, this could result in an evil merge. In my series, I want to continue to be able to call verify_uptodate() from unpack_trees.c in order to check if files affected by renames are dirty and we need to avoid overwriting them. That can trigger error messages, so they need to not be freed until later. So, instead, I'd like to see something like the below (built on top of my series): -- >8 -- --- merge-recursive.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/merge-recursive.c b/merge-recursive.c index f2cbad4f10..3491a27bf2 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -337,10 +337,10 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree) init_tree_desc(desc, tree->buffer, tree->size); } -static int git_merge_trees(struct merge_options *o, - struct tree *common, - struct tree *head, - struct tree *merge) +static int unpack_trees_start(struct merge_options *o, + struct tree *common, + struct tree *head, + struct tree *merge) { int rc; struct tree_desc t[3]; @@ -378,6 +378,12 @@ static int git_merge_trees(struct merge_options *o, return rc; } +static void unpack_trees_finish(struct merge_options *o) +{ + discard_index(&o->orig_index); + clear_unpack_trees_porcelain(&o->unpack_opts); +} + struct tree *write_tree_from_memory(struct merge_options *o) { struct tree *result = NULL; @@ -3079,7 +3085,7 @@ int merge_trees(struct merge_options *o, return 1; } - code = git_merge_trees(o, common, head, merge); + code = unpack_trees_start(o, common, head, merge); if (code != 0) { if (show(o, 4) || o->call_depth) @@ -3144,14 +3150,7 @@ int merge_trees(struct merge_options *o, else clean = 1; - /* Free the extra index left from git_merge_trees() */ - /* -* FIXME: Need to also free data allocated by -* setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs, -* but the problem is that only half of it refers to dynamically -* allocated data, while the other half points at static strings. -*/ - discard_index(&o->orig_index); + unpack_trees_finish(o); if (o->call_depth && !(*result = write_tree_from_memory(o))) return -1; -- 2.17.0.295.g791b7256b2.dirty
Re: [PATCH v3 4/6] git.c: implement --list-cmds=porcelain
On Tue, Apr 24, 2018 at 6:12 PM, Duy Nguyen wrote: > git-completion.bash will be updated to ask git "give me the commands > in the mainporcelain, completable or external category". This also > addresses another thing that bugs me: I wanted an option to let me > complete all commands instead of just porcelain. This approach kinda > generalizes that and it would be easy to let the user choose what > category they want to complete. To complete this: there could also be yet another special category "custom", where --list-cmds=custom simply returns a list of commands specified in config file. With this the user can pick the "custom" category to have total control of what command shows up at "git " if they are not satisfied with the predefined categories. -- Duy
Re: [PATCH v3 4/6] git.c: implement --list-cmds=porcelain
On Mon, Apr 23, 2018 at 3:32 PM, SZEDER Gábor wrote: > But then I noticed that it's not an accurate description of the > current situation, because there is a wide grey area between > porcelains and plumbing, and the completion script doesn't "filter out > plumbing commands", but rather filters out commands that can be > considered too low-level to be useful for "general" usage. > Consequently, after 'git ' we also list: > > - some 'ancillaryinterrogators': blame, annotate, difftool, fsck, > help > - some 'ancillarymanipulators': config, mergetool, remote > - some 'foreignscminterface': p4, request-pull, svn, send-email > - even some plumbing: apply, name-rev (though 'name-rev' could be > omitted; we have 'git describe') > - and also all "unknown" 'git-foo' commands that can be found in > $PATH, which can be the user's own git scripts or other > git-related tools ('git-annex', Git LFS, etc.). > > With this change we wouldn't list any of the above commands, but only > those that are explicitly categorized as 'mainporcelain'. I'd much > prefer the current behaviour. Yeah I noticed this (kinda) with filter-branch but I did not look further to see all this. It's good that you review this series then :) For the first group (known commands), how about we add a new category "completion" in command-list.txt? Each command may belong to multiple categories (and my updated script has to deal with that [1]). For the second group, we could also have a special "external" category that is produced at run time, not specified in command-list.txt. --list-cmds option either has to accept multiple values, or we accept multiple --list-cmds= options. git-completion.bash will be updated to ask git "give me the commands in the mainporcelain, completable or external category". This also addresses another thing that bugs me: I wanted an option to let me complete all commands instead of just porcelain. This approach kinda generalizes that and it would be easy to let the user choose what category they want to complete. [1] which also means I could bring "deprecated" category back. -- Duy
Re: [PATCH v2 0/3] Some add-on patches on top of dj/runtime-prefix
Hi Junio, On Tue, 24 Apr 2018, Junio C Hamano wrote: > Junio C Hamano writes: > > > Junio C Hamano writes: > > > >>> base-commit: b46fe60e1d7235603a29499822493bd3791195da > >> > >> Basing your work on the tip of 'next' is good for discussion, but > >> not readily usable for final application. Let me see if I can > >> untangle the dependents to come up with a reasonable base. > > > > I'll queue this on top of a merge of 'dj/runtime-prefix' into 'master'. > > Merging the resulting topic into 'next' and applying these patches > > directly on top of 'next' result in identical trees, of course ;-) > > Actually, these trivially rebase on top of dj/runtime-prefix, so > I'll queue them like so without taking it hostage to other things in > 'master'. We'd want to keep these mergeable to any integration > branch that dj/runtime-prefix would be merged to, so that is the > most logical organization, I would think, even though I do not > immediately see the reason why we would want to merge > dj/runtime-prefix to 'maint' and lower right now. > > Thanks. Thank you, and sorry for the trouble. I am just too used to a Continuous Integration setting with exactly one integration branch. I will make an effort in the future to figure out the best base branch for patches that do not apply cleanly on `master` but require more stuff from `next`/`pu`. Ciao, Dscho
Re: [RFC PATCH] checkout: Force matching mtime between files
On 2018-04-13 01:01 PM, Michał Górny wrote: Currently git does not control mtimes of files being checked out. This means that the only assumption you could make is that all files created or modified within a single checkout action will have mtime between start time and end time of this checkout. The relations between mtimes of different files depend on the order in which they are checked out, filesystem speed and timestamp precision. Thanks for scratching this old itch [1]! Big +1 from me. We've had incremental smoke-test rebuilds fail because of inconsistent file timestamps. M. [1] https://public-inbox.org/git/50c79d1f.1080...@xiplink.com/ Git repositories sometimes contain both generated and respective source files. For example, the Gentoo 'user syncing' repository combines source ebuild files with generated metadata cache for user convenience. Ideally, the 'git checkout' would be fast enough that (combined with low timestamp precision) all files created or modified within a single checkout would have matching timestamp. However, in reality the cache files may end up being wrongly 'older' than source file, requiring unnecessary recheck. The opposite problem (cache files not being regenerated when they should have been) might also occur. However, it could not be solved without preserving timestamps, therefore it is out of scope of this change. In order to avoid unnecessary cache mismatches, force a matching mtime between all files created by a single checkout action. This seems to be the best course of action. Matching mtimes do not trigger cache updates. They also match the concept of 'checkout' being an atomic action. Finally, this change does not break backwards compatibility as the new result is a subset of the possible previous results. For example, let's say that 'git checkout' creates two files in order, with respective timestamps T1 and T2. Before this patch, T1 <= T2. After this patch, T1 == T2 which also satisfies T1 <= T2. A similar problem was previously being addressed via git-restore-mtime tool. However, that solution is unnecessarily complex for Gentoo's use case and does not seem to be suitable for 'seamless' integration. The patch integrates mtime forcing via adding an additional member of 'struct checkout'. This seemed the simplest way of adding it without having to modify prototypes and adjust multiple call sites. The member is set to the current time in check_updates() function and is afterwards enforced by checkout_entry(). The patch uses utime() rather than futimes() as that seems to be the function used everywhere else in git and provided some MinGW compatibility code. Signed-off-by: Michał Górny --- cache.h| 1 + entry.c| 12 +++- unpack-trees.c | 1 + 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/cache.h b/cache.h index bbaf5c349..9f0a7c867 100644 --- a/cache.h +++ b/cache.h @@ -1526,6 +1526,7 @@ struct checkout { const char *base_dir; int base_dir_len; struct delayed_checkout *delayed_checkout; + time_t checkout_mtime; unsigned force:1, quiet:1, not_new:1, diff --git a/entry.c b/entry.c index 2101201a1..7ee5a6d28 100644 --- a/entry.c +++ b/entry.c @@ -411,6 +411,7 @@ int checkout_entry(struct cache_entry *ce, { static struct strbuf path = STRBUF_INIT; struct stat st; + int ret; if (topath) return write_entry(ce, topath, state, 1); @@ -473,5 +474,14 @@ int checkout_entry(struct cache_entry *ce, return 0; create_directories(path.buf, path.len, state); - return write_entry(ce, path.buf, state, 0); + ret = write_entry(ce, path.buf, state, 0); + + if (ret == 0 && state->checkout_mtime != 0) { + struct utimbuf t; + t.modtime = state->checkout_mtime; + if (utime(path.buf, &t) < 0) + warning_errno("failed utime() on %s", path.buf); + } + + return ret; } diff --git a/unpack-trees.c b/unpack-trees.c index e73745051..e1efefb68 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -346,6 +346,7 @@ static int check_updates(struct unpack_trees_options *o) state.quiet = 1; state.refresh_cache = 1; state.istate = index; + state.checkout_mtime = time(NULL); progress = get_progress(o);
Re: [PATCH v3 7/9] commit: add short-circuit to paint_down_to_common()
On 4/23/2018 5:38 PM, Jakub Narebski wrote: Derrick Stolee writes: On 4/18/2018 7:19 PM, Jakub Narebski wrote: Derrick Stolee writes: [...] [...], and this saves time during 'git branch --contains' queries that would otherwise walk "around" the commit we are inspecting. If I understand the code properly, what happens is that we can now short-circuit if all commits that are left are lower than the target commit. This is because max-order priority queue is used: if the commit with maximum generation number is below generation number of target commit, then target commit is not reachable from any commit in the priority queue (all of which has generation number less or equal than the commit at head of queue, i.e. all are same level or deeper); compare what I have written in [1] [1]: https://public-inbox.org/git/866052dkju@gmail.com/ Do I have that right? If so, it looks all right to me. Yes, the priority queue needs to compare via generation number first or there will be errors. This is why we could not use commit time before. I was more concerned about getting right the order in the priority queue (does it return minimal or maximal generation number). I understand that the cutoff could not be used without generation numbers because of the possibility of clock skew - using cutoff on dates could lead to wrong results. Maximal generation number is important so we do not visit commits multiple times (say, once with PARENT1 set, and a second time when PARENT2 is set). A minimal generation number order would create a DFS order and walk until the cutoff every time. In cases without clock skew, maximal generation number order will walk the same set of commits as maximal commit time; the order may differ, but only between incomparable commits. For a copy of the Linux repository, where HEAD is checked out at v4.13~100, we get the following performance improvement for 'git branch --contains' over the previous commit: Before: 0.21s After: 0.13s Rel %: -38% [...] flags = commit->object.flags & (PARENT1 | PARENT2 | STALE); if (flags == (PARENT1 | PARENT2)) { if (!(commit->object.flags & RESULT)) { @@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co return NULL; } -list = paint_down_to_common(one, n, twos); + list = paint_down_to_common(one, n, twos, 0); while (list) { struct commit *commit = pop_commit(&list); @@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt) filled_index[filled] = j; work[filled++] = array[j]; } - common = paint_down_to_common(array[i], filled, work); + common = paint_down_to_common(array[i], filled, work, 0); if (array[i]->object.flags & PARENT2) redundant[i] = 1; for (j = 0; j < filled; j++) @@ -1067,7 +1077,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit * if (commit->generation > min_generation) return 0; -bases = paint_down_to_common(commit, nr_reference, reference); + bases = paint_down_to_common(commit, nr_reference, reference, commit->generation); Is it the only case where we would call paint_down_to_common() with non-zero last parameter? Would we always use commit->generation where commit is the first parameter of paint_down_to_common()? If both are true and will remain true, then in my humble opinion it is not necessary to change the signature of this function. We need to change the signature some way, but maybe the way I chose is not the best. No, after taking longer I think the new signature is a good choice. To elaborate: paint_down_to_common() is used for multiple purposes. The caller here that supplies 'commit->generation' is used only to compute reachability (by testing if the flag PARENT2 exists on the commit, then clears all flags). The other callers expect the full walk down to the common commits, and keeps those PARENT1, PARENT2, and STALE flags for future use (such as reporting merge bases). Usually the call to paint_down_to_common() is followed by a revision walk that only halts when reaching root commits or commits with both PARENT1 and PARENT2 flags on, so always short-circuiting on generations would break the functionality; this is confirmed by the t5318-commit-graph.sh. Right. I have realized that just after sending the email. I'm sorry about this. An alternative to the signature change is to add a boolean parameter "use_cutoff" or something, that specifies "don't walk beyond the commit". This may give a more of a clear description of what it will do with the generation value, but since we are already performing generation comparisons before calling paint_down_to_common() I find this simple enough. Two things:
Re: [PATCH v1 1/2] merge: Add merge.renames config setting
Hi Junio, On Tue, 24 Apr 2018, Junio C Hamano wrote: > Ben Peart writes: > > >> I also had to wonder how "merge -s resolve" faired, if the project > >> is not interested in renamed paths at all. > >> > > > > To be clear, it isn't that we're not interested in detecting renamed > > files and paths. We're just opposed to it taking an hour to figure > > that out! > > Yeah, but as opposed to passing "oh, let's see if we can get a > reasonable result without rename detection just this time" from the > command line, configuring merge.renames=false in would mean exactly > that: "we don't need rename detection, just want to skip the cycles > spent for it". That is why I wondered how well the resolve strategy > would have fit your needs. Please do not forget that the context is GVFS, where you would cause a lot of pain and suffering by letting users forget to specify that command-line option all the time, resulting in several gigabytes of objects having to be downloaded just for the sake of rename detection. So there is a pretty good point in doing this as a config option. Ciao, Dscho
Re: [PATCH 1/2] merge: setup `opts` later in `checkout_fast_forward()`
Hi Martin, On Tue, 24 Apr 2018, Martin Ågren wrote: > On 24 April 2018 at 08:20, Jacob Keller wrote: > > I'm guessing the diff algorithm simply found that this was a more > > compact representation of the change? It's a bit confusing when your > > description indicates you "moved" some code down, but it looks like > > you moved code up. > > Agreed. I'll play with --anchored and other magic stuff to see if I can > improve this. Or I could instead try to sell this patch as "move some > other stuff out of the way" ;-) That seems a bit less direct though. Or you could add a remark to the commit message along the lines "best viewed with `--anchored=...`". This is what I would do ;-) Ciao, Dscho
Re: [PATCH v8 09/16] rebase: introduce the --rebase-merges option
Hi Philip, On Sun, 22 Apr 2018, Philip Oakley wrote: > From: "Johannes Schindelin" > > Once upon a time, this here developer thought: wouldn't it be nice if, > > say, Git for Windows' patches on top of core Git could be represented as > > a thicket of branches, and be rebased on top of core Git in order to > > maintain a cherry-pick'able set of patch series? > > > > The original attempt to answer this was: git rebase --preserve-merges. > > > > However, that experiment was never intended as an interactive option, > > and it only piggy-backed on git rebase --interactive because that > > command's implementation looked already very, very familiar: it was > > designed by the same person who designed --preserve-merges: yours truly. > > > > Some time later, some other developer (I am looking at you, Andreas! > > ;-)) decided that it would be a good idea to allow --preserve-merges to > > be combined with --interactive (with caveats!) and the Git maintainer > > (well, the interim Git maintainer during Junio's absence, that is) > > agreed, and that is when the glamor of the --preserve-merges design > > started to fall apart rather quickly and unglamorously. > > > > The reason? In --preserve-merges mode, the parents of a merge commit (or > > for that matter, of *any* commit) were not stated explicitly, but were > > *implied* by the commit name passed to the `pick` command. > > > Aside: I think this para should be extracted to the --preserve-merges > documentation to highlight what it does / why it is 'wrong' (not what would be > expected in some case). It may also need to discuss the (figurative) Cousins > vs. Siblings distinction [merge of branches external, or internal, to the > rebase. Quite honestly, I'd much rather spend time improving --rebase-merges than improving --preserve-merges documentation. In my mind, the latter is pretty useless, especially once the former lands in an official Git version. Of course, feel free to disagree with me by sending a patch to improve the documentation of --preserve-merges ;-) > "In --preserve-merges, the commit being selected for merging is implied by the > commit name passed to the `pick` command (i.e. of the original merge commit), > not that of the rebased version of that parent." It is much, much worse: In --preserve-merges, no commit can change its ancestry. Every rebased commit's parents will be the rebased original parents. Or some such. But really, why bother describing something *that* broken? Why not work toward a solution that makes that broken option obsolete? Like, say, --rebase-merges? ;-) > A similar issue occurs with (figuratively) '--ancestry-path --first parent' > searches which lacks the alternate '--lead parent' post-walk selection. [1]. I > don't think there is a dot notation to select the merge cousins, nor merge > siblings either A.,B ? (that's dot-comma ;-) I actually had missed `--ancestry-path`... I should probably use it in the description of the "cousins". > [... lots of quoted text...] Could I ask you to make it easier for me by cutting quoted text that is irrelevant to your reply? The way I read mails forces me to scroll down (sometimes on a phone) all the way to the end, just to find that that time was spent in vain. Thanks, Dscho
Re: Git archeology
Hi Christian, Thank you for the reply. After I got the reply from Szedar I was so excited about git community passion for help. And your reply made me ever more assure in it. Once again thank your for the comprehensive answer. I appreciate Daniel German's research and going to try token based method in my task as well. Have a nice day, Vladimir 2018-04-21 8:43 GMT+02:00 Christian Couder : > Hi, > > On Sat, Apr 21, 2018 at 8:19 AM, Vladimir Gorshenin > wrote: >> Hi, >> >> My team and I as well as millions of other developers are excited to >> have such tool at hand as Git. It helps us a lot. >> >> Now we challenged ourselves to be even more productive with Git >> analyzing our usage history. > > What kind of analysis do you want to do? Is it the same kind of > analysis as described in the "Token-based authorship information from > Git" article (https://lwn.net/Articles/698425/) on LWN.net? > >> And there is a problem, which I believe is fundamental for Git (please >> prove me wrong): how to find all overlapping commits, e.g. touching >> the same lines of code? > > It is not very clear what you would consider overlapping commits or > commits touching the same lines of code. If some lines of code have > been duplicated in different files, for example, are the commits > touching the original lines relevant to what happened to the > duplicated lines? And what about lines that were moved from one file > to another or in the same file? > >> I played with “Git diff” and “Git blame” but without a reliable >> result. “Git diff” gives only relative number of lines and it’s not >> easy to track these number through 1000+ commits. “Git blame” has nice >> output but without any information about deletion. > > Did you try 'git log -L' as Szeder Gábor just suggested? > >> What would you advice me to do? > > If 'git log -L' and other git commands are not doing what you want, > you might want to take a look at cregit > (https://github.com/cregit/cregit) and maybe at other work from the > people who developed it. The above LWN.net article is about their > early work. > > There are links related to this tool in: > https://git.github.io/rev_news/2017/05/17/edition-27/
Re: Git archeology
Hi Szedar, Thank you for the reply! I didn't expect it could be so instant! I checked 'git log -L' option and it seems to the best one so far. But nevertheless is has a pit fall: I run it like 'git log -L ,:somefile' and get the output that needs manual scraping since each commit spans the whole file despite only few lines were actually altered. I would like to have an output form 'git log -L' in patch style. Could it be done somehow? Have a nice day, Vladimir 2018-04-21 8:29 GMT+02:00 SZEDER Gábor : >> And there is a problem, which I believe is fundamental for Git (please >> prove me wrong): how to find all overlapping commits, e.g. touching >> the same lines of code? > > You might be looking for 'git log -L'. >
Re: [PATCH 31/41] wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
On 24 April 2018 at 01:39, brian m. carlson wrote: > Convert two uses of EMPTY_TREE_SHA1_HEX to use oid_to_hex_r and > the_hash_algo to avoid a dependency on a given hash algorithm. Use > oid_to_hex_r in preference to oid_to_hex because the buffer needs to > last through several function calls which might exhaust the limit of > four static buffers. > > Signed-off-by: brian m. carlson > --- > wt-status.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/wt-status.c b/wt-status.c > index 50815e5faf..857724bd60 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -600,10 +600,11 @@ static void wt_status_collect_changes_index(struct > wt_status *s) > { > struct rev_info rev; > struct setup_revision_opt opt; > + char hex[GIT_MAX_HEXSZ + 1]; > > init_revisions(&rev, NULL); > memset(&opt, 0, sizeof(opt)); > - opt.def = s->is_initial ? EMPTY_TREE_SHA1_HEX : s->reference; > + opt.def = s->is_initial ? oid_to_hex_r(hex, > the_hash_algo->empty_tree) : s->reference; > setup_revisions(0, NULL, &rev, &opt); > > rev.diffopt.flags.override_submodule_config = 1; > @@ -975,13 +976,14 @@ static void wt_longstatus_print_verbose(struct > wt_status *s) > struct setup_revision_opt opt; > int dirty_submodules; > const char *c = color(WT_STATUS_HEADER, s); > + char hex[GIT_MAX_HEXSZ + 1]; > > init_revisions(&rev, NULL); > rev.diffopt.flags.allow_textconv = 1; > rev.diffopt.ita_invisible_in_index = 1; > > memset(&opt, 0, sizeof(opt)); > - opt.def = s->is_initial ? EMPTY_TREE_SHA1_HEX : s->reference; > + opt.def = s->is_initial ? oid_to_hex_r(hex, > the_hash_algo->empty_tree) : s->reference; > setup_revisions(0, NULL, &rev, &opt); Just a thought: Maybe it would make sense to have a function `oid_hex_empty_tree()` or similar to replace the oid_to_hex[_r](the_hash_algo->empty_tree) idiom. It would help avoid the buffer here, but also get rid of a few instances of code peeking into the_hash_algo. I dunno. I've been scanning this series semi-sloppily up to here, and left some comments along the way. Thank you for working on this. Martin