Re: Questions about the hash function transition
brian m. carlson wrote: > On Thu, Aug 23, 2018 at 06:54:38PM -0700, Jonathan Nieder wrote: >> For what it's worth, even if it all is in one commit with message >> "wip", I think I'd benefit from being able to see this code. I can >> promise not to critique it, and to only treat it as a rough >> premonition of the future. > > It's in my object-id-partn branch at https://github.com/bk2204/git. > > It doesn't do protocol v2 yet, but it does do protocol v1. It is, of > course, subject to change (especially naming) depending on what the list > thinks is most appropriate. $ git diff --shortstat origin/master...bmc/object-id-partn 185 files changed, 2263 insertions(+), 1535 deletions(-) Beautiful. Thanks much for this. Sincerely, Jonathan
Re: Questions about the hash function transition
On Thu, Aug 23, 2018 at 06:54:38PM -0700, Jonathan Nieder wrote: > brian m. carlson wrote: > > I realize I have a lot of code that has not been sent in yet, but I also > > tend to build on my own series a lot, and I probably need to be a bit > > better about extracting reusable pieces that can go in independently > > without waiting for the previous series to land. > > For what it's worth, even if it all is in one commit with message > "wip", I think I'd benefit from being able to see this code. I can > promise not to critique it, and to only treat it as a rough > premonition of the future. It's in my object-id-partn branch at https://github.com/bk2204/git. It doesn't do protocol v2 yet, but it does do protocol v1. It is, of course, subject to change (especially naming) depending on what the list thinks is most appropriate. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH/RFC] commit: new option to abort -a something is already staged
On Thu, Aug 23, 2018 at 9:28 AM Junio C Hamano wrote: > I think the above example forgets "-a" on the final "git commit" > step. With it added, I can understand the concern (and I am sure > you would, too). > > The user is trying to add everything done in the working tree, and > "commit -a" would catch all changes to paths that were already > tracked, but a separate "add" is necessary for newly created paths. > But adding a new path means the index no longer matches HEAD, and > the "commit -a" at the final step sweeps the changes to already > tracked paths---failing that because there "already is something > staged" will break the workflow. Right. I think this would need to be able to understand the case of "different only by new files". Thanks, Jake
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 07:48:42PM -0700, Jacob Keller wrote: > Odd... > > What about.. > > - if (oidcmp(a,b)) > + if(!oideq(a,b)) > { ... } Nope, it doesn't like that syntactically. > Or maybe you need to use something like > > <... > - if (oidcmp(a,b)) > + if (!oideq(a,b)) > ...> Nor that (I also tried finding documentation on what exactly the angle brackets mean, but couldn't). > Hmm. Yea, semantic patches are a bit confusing overall sometimes. > > But it looks like you got something which works? Actually, what I showed earlier does seem to have some weirdness with else-if. But I finally stumbled on something even better: - oidcmp(a, b) != 0 + !oideq(a, b) Because of the isomorphisms that coccinelle knows about, that catches everything we want. Obvious ones like: diff --git a/bisect.c b/bisect.c index 41c56a665e..7c1d8f1a6d 100644 --- a/bisect.c +++ b/bisect.c @@ -595,7 +595,7 @@ static struct commit_list *skip_away(struct commit_list *list, int count) for (i = 0; cur; cur = cur->next, i++) { if (i == index) { - if (oidcmp(>item->object.oid, current_bad_oid)) + if (!oideq(>item->object.oid, current_bad_oid)) return cur; if (previous) return previous; and compound conditionals like: diff --git a/blame.c b/blame.c index 10d72e36dd..538d0ab1aa 100644 --- a/blame.c +++ b/blame.c @@ -1834,7 +1834,7 @@ void setup_scoreboard(struct blame_scoreboard *sb, sb->revs->children.name = "children"; while (c->parents && - oidcmp(>object.oid, >final->object.oid)) { + !oideq(>object.oid, >final->object.oid)) { struct commit_list *l = xcalloc(1, sizeof(*l)); l->item = c; and even non-if contexts, like: diff --git a/sha1-file.c b/sha1-file.c index 631f6b9dc2..d85f4e93e1 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -825,7 +825,7 @@ int check_object_signature(const struct object_id *oid, void *map, if (map) { hash_object_file(map, size, type, _oid); - return oidcmp(oid, _oid) ? -1 : 0; + return !oideq(oid, _oid) ? -1 : 0; } So I think we have a winner. I'll polish that up into patches and send it out later tonight. -Peff
Re: [PATCH/RFC] commit: new option to abort -a something is already staged
On Mon, Aug 20, 2018 at 8:43 AM Nguyễn Thái Ngọc Duy wrote: > Instead, let's handle just this problem for now. This new option > commit.preciousDirtyIndex, if set to false, will not allow `commit -a` > to continue if the final index is different from the existing one. I > don't think this can be achieved with hooks because the hooks don't > know about "-a" or different commit modes. > Here you say setting it to "false" enables this check but... > +commit.preciousDirtyIndex:: > + If some changes are partially staged in the index (i.e. > + "git commit -a" and "git commit" commit different changes), reject > + "git commit -a". > + Here, sounds like it is enabled by setting it to true. Thanks, Jake
Re: Questions about the hash function transition
Ævar Arnfjörð Bjarmason wrote: >> Objective >> - >> Migrate Git from SHA-1 to a stronger hash function. > > Should way say "Migrate Git from SHA-1 to SHA-256" here instead? > > Maybe it's overly specific, i.e. really we're also describnig how /any/ > hash function transition might happen, but having just read this now > from start to finish it takes us a really long time to mention (and at > first, only offhand) that SHA-256 is the new hash. I answered this question in my other reply, but my answer missed the point. I think it would be fine for this to say "Migrate Git from SHA-1 to a stronger hash function (SHA-256)". More importantly, I think the Background section should say something about SHA-256 --- e.g. how about replacing the sentence SHA-1 still possesses the other properties such as fast object lookup and safe error checking, but other hash functions are equally suitable that are believed to be cryptographically secure. with something about SHA-256? Rereading the background section, I see some other bits that could be clarified, too. It has a run-on sentence: Thus Git has in effect already migrated to a new hash that isn't SHA-1 and doesn't share its vulnerabilities, its new hash function just happens to produce exactly the same output for all known inputs, except two PDFs published by the SHAttered researchers, and the new implementation (written by those researchers) claims to detect future cryptanalytic collision attacks. The "," after vulnerabilities should be a period, ending the sentence. My understanding is that sha1collisiondetection's safe-hash is meant to protect against known attacks and that the code is meant to be adaptable for future attacks of the same kind (by updating the list of disturbance vectors), but it doesn't claim to guard against future novel cryptanalysis methods that haven't been published yet. Thanks, Jonathan
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 5:16 PM Jeff King wrote: > > On Thu, Aug 23, 2018 at 08:06:37PM -0400, Jeff King wrote: > > > This almost works: > > > > @@ > > expression a, b; > > statement s; > > @@ > > - if (oidcmp(a, b)) s > > + if (!oideq(a, b)) s > > > > [...] > > > So I really do want some way of saying "all of the block, no matter what > > it is". Or of leaving it out as context. > > Aha. The magic invocation is: > > @@ > expression a, b; > statement s; > @@ > - if (oidcmp(a, b)) > + if (!oideq(a, b)) > s > > I would have expected that you could replace "s" with "...", but that > does not seem to work. > > -Peff Odd... What about.. - if (oidcmp(a,b)) + if(!oideq(a,b)) { ... } Or maybe you need to use something like <... - if (oidcmp(a,b)) + if (!oideq(a,b)) ...> Hmm. Yea, semantic patches are a bit confusing overall sometimes. But it looks like you got something which works? Thanks, Jake
Re: Questions about the hash function transition
Hi, brian m. carlson wrote: > On Thu, Aug 23, 2018 at 04:02:51PM +0200, Ævar Arnfjörð Bjarmason wrote: >>> 1. Add SHA-256 support to Git protocol. This is valuable and the >>>logical next step but it is out of scope for this initial design. >> >> This is a non-goal according to the docs, but now that we have protocol >> v2 in git, perhaps we could start specifying or describing how this >> protocol extension will work? > > I have code that does this. The reason is that the first stage of the [nice explanation snipped] > I hope to be able to spend some time documenting this in a little bit. > I have documentation for that code in my branch, but I haven't sent it > in yet. Yay! > I realize I have a lot of code that has not been sent in yet, but I also > tend to build on my own series a lot, and I probably need to be a bit > better about extracting reusable pieces that can go in independently > without waiting for the previous series to land. For what it's worth, even if it all is in one commit with message "wip", I think I'd benefit from being able to see this code. I can promise not to critique it, and to only treat it as a rough premonition of the future. [...] > For SHA-1, I have 0x73686131, which is "sha1", big-endian, and for > SHA-256, I have 0x73323536, which is "s256", big-endian. The former is > in the codebase already; the latter, in my hash-impl branch. I mentioned in another reply that "sha2" sounds fine. "s256" of course also sounds fine to me. Thanks to Ævar for asking so that we have the reminder to pin it down in the doc. Thanks, Jonathan
Re: Questions about the hash function transition
Hi, Ævar Arnfjörð Bjarmason wrote: > I wanted to send another series to clarify things in > hash-function-transition.txt, but for some of the issues I don't know > the answer, and I had some questions after giving this another read. Thanks for looking it over! Let's go. :) [...] >> Objective >> - >> Migrate Git from SHA-1 to a stronger hash function. > > Should way say "Migrate Git from SHA-1 to SHA-256" here instead? > > Maybe it's overly specific, i.e. really we're also describnig how /any/ > hash function transition might happen, but having just read this now > from start to finish it takes us a really long time to mention (and at > first, only offhand) that SHA-256 is the new hash. Well, the objective really is to migrate to a stronger hash function, and that we chose SHA-256 is part of the details of how we chose to do that. So I think this would be a misleading change. You can tell that I'm not just trying to justify after the fact because the initial version of the design doc at [*] already uses this wording, and that version assumed that the hash function was going to be SHA-256. [*] https://public-inbox.org/git/20170304011251.ga26...@aiede.mtv.corp.google.com/ [...] >> Non-Goals >> - >> 1. Add SHA-256 support to Git protocol. This is valuable and the >>logical next step but it is out of scope for this initial design. > > This is a non-goal according to the docs, but now that we have protocol > v2 in git, perhaps we could start specifying or describing how this > protocol extension will work? Yes, that would be great! But I suspect it's cleanest to do so in a separate doc. That would allow clarifying this part, by pointing to the protocol doc. [...] >> 3. Intermixing objects using multiple hash functions in a single >>repository. > > But isn't that the goal now per "Translation table" & writing both SHA-1 > and SHA-256 versions of objects? No, we don't write both versions of objects. The translation records both names of an object. [...] >> - For each object format: >> - 4-byte format identifier (e.g., 'sha1' for SHA-1) > > So, given that we have 4-byte limit and have decided on SHA-256 are we > just going to call this 'sha2'? Good question. 'sha2' sounds fine to me. If we want to do SHA-512/256 later, say, we'd just have to come up with a name for that at that point (and it doesn't have to be ASCII). > That might be confusingly ambiguous This is a binary format. Are you really worried that people are going to misinterpret the magic numbers it contains? > since SHA2 is a standard with more than just SHA-256, maybe 's256', or > maybe we should give this 8 bytes with trailing \0s so we can have > "SHA-1\0\0\0" and "SHA-256\0"? For what it's worth, if that's the alternative, I'd rather have four random bytes. [...] >> The loose object index is protected against concurrent writes by a >> lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose >> object: >> >> 1. Write the loose object to a temporary file, like today. >> 2. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock. >> 3. Rename the loose object into place. >> 4. Open loose-object-idx with O_APPEND and write the new object >> 5. Unlink loose-object-idx.lock to release the lock. >> >> To remove entries (e.g. in "git pack-refs" or "git-prune"): >> >> 1. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the >>lock. >> 2. Write the new content to loose-object-idx.lock. >> 3. Unlink any loose objects being removed. >> 4. Rename to replace loose-object-idx, releasing the lock. > > Do we expect multiple concurrent writers to poll the lock if they can't > aquire it right away? I.e. concurrent "git commit" would block? Has this > overall approach been benchmarked somewhere? Git doesn't support concurrent "git commit" today. My feeling is that if loose object writing becomes a performance problem, we should switch to writing packfiles instead (as "git receive-pack" already does). So when there's a choice between better performance of writing loose objects and simplicity, I lean toward simplicity (though that's not absolute, there are definitely tradeoffs to be made). Earlier discussion about this had sharded loose object indices for each xy/ subdir. It was more complicated, for not much gain. [...] > Maybe I've missed some subtlety where that won't work, I'm just > concerned that something that's writing a lot of objects in parallel > will be slowed down (e.g. the likes of BFG repo cleaner). BFG repo cleaner is an application like fast-import that is a good fit for writing packs, not loose objects. [...] >> Since all operations that make new objects (e.g., "git commit") add >> the new objects to the corresponding index, this mapping is possible >> for all objects in the object store. > > Are we going to need a midx version of these mapping files? How does > midx fit into this picture? Perhaps it's too obscure to
Re: Questions about the hash function transition
On Thu, Aug 23, 2018 at 04:02:51PM +0200, Ævar Arnfjörð Bjarmason wrote: > > [...] > > Goals > > - > > 1. The transition to SHA-256 can be done one local repository at a time. > >a. Requiring no action by any other party. > >b. A SHA-256 repository can communicate with SHA-1 Git servers > > (push/fetch). > >c. Users can use SHA-1 and SHA-256 identifiers for objects > > interchangeably (see "Object names on the command line", below). > >d. New signed objects make use of a stronger hash function than > > SHA-1 for their security guarantees. > > 2. Allow a complete transition away from SHA-1. > >a. Local metadata for SHA-1 compatibility can be removed from a > > repository if compatibility with SHA-1 is no longer needed. > > 3. Maintainability throughout the process. > >a. The object format is kept simple and consistent. > >b. Creation of a generalized repository conversion tool. > > > > Non-Goals > > - > > 1. Add SHA-256 support to Git protocol. This is valuable and the > >logical next step but it is out of scope for this initial design. > > This is a non-goal according to the docs, but now that we have protocol > v2 in git, perhaps we could start specifying or describing how this > protocol extension will work? I have code that does this. The reason is that the first stage of the transition code is to implement stage 4 of the transition: that is, a full SHA-256 implementation without any SHA-1 support. Implementing it that way means that we don't have to deal with any of the SHA-1 to SHA-256 mapping in the first stage of the code. In order to clone an SHA-256 repo (which the testsuite is completely broken without), you need to be able to have basic SHA-256 support in the protocol. I know this was a non-goal, but the alternative is a an inability to run the testsuite using SHA-256 until all the code is merged, which is unsuitable for development. The transition plan also anticipates stage 4 (full SHA-256) support before earlier stages, so this will be required. I hope to be able to spend some time documenting this in a little bit. I have documentation for that code in my branch, but I haven't sent it in yet. I realize I have a lot of code that has not been sent in yet, but I also tend to build on my own series a lot, and I probably need to be a bit better about extracting reusable pieces that can go in independently without waiting for the previous series to land. > > [...] > > 3. Intermixing objects using multiple hash functions in a single > >repository. > > But isn't that the goal now per "Translation table" & writing both SHA-1 > and SHA-256 versions of objects? No, I think this statement is basically that you have to have the entire repository use all one algorithm under the hood in the .git directory, translation tables excluded. I don't think that's controversial. > > [...] > > Pack index > > ~~ > > Pack index (.idx) files use a new v3 format that supports multiple > > hash functions. They have the following format (all integers are in > > network byte order): > > > > - A header appears at the beginning and consists of the following: > > - The 4-byte pack index signature: '\377t0c' > > - 4-byte version number: 3 > > - 4-byte length of the header section, including the signature and > > version number > > - 4-byte number of objects contained in the pack > > - 4-byte number of object formats in this pack index: 2 > > - For each object format: > > - 4-byte format identifier (e.g., 'sha1' for SHA-1) > > So, given that we have 4-byte limit and have decided on SHA-256 are we > just going to call this 'sha2'? That might be confusingly ambiguous > since SHA2 is a standard with more than just SHA-256, maybe 's256', or > maybe we should give this 8 bytes with trailing \0s so we can have > "SHA-1\0\0\0" and "SHA-256\0"? This is the format_version field in struct git_hash_algo. For SHA-1, I have 0x73686131, which is "sha1", big-endian, and for SHA-256, I have 0x73323536, which is "s256", big-endian. The former is in the codebase already; the latter, in my hash-impl branch. If people have objections, we can change this up until we merge the pack index v3 code (which is not yet finished). It needs to be unique, and that's it. We could specify 0x0001 and 0x0002 if we wanted, although I feel the values I mentioned above are self-documenting, which is desirable. > > [...] > > - The trailer consists of the following: > > - A copy of the 20-byte SHA-256 checksum at the end of the > > corresponding packfile. > > > > - 20-byte SHA-256 checksum of all of the above. > > We need to update both of these to 32 byte, right? Or are we planning to > truncate the checksums? > > This seems like just a mistake when we did s/NewHash/SHA-256/g, but then > again it was originally "20-byte NewHash checksum" ever since 752414ae43 > ("technical doc: add a design doc for hash function transition", >
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 08:06:37PM -0400, Jeff King wrote: > This almost works: > > @@ > expression a, b; > statement s; > @@ > - if (oidcmp(a, b)) s > + if (!oideq(a, b)) s > > [...] > So I really do want some way of saying "all of the block, no matter what > it is". Or of leaving it out as context. Aha. The magic invocation is: @@ expression a, b; statement s; @@ - if (oidcmp(a, b)) + if (!oideq(a, b)) s I would have expected that you could replace "s" with "...", but that does not seem to work. -Peff
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 07:40:49PM -0400, Jeff King wrote: > > You can look for explicitly "if (oidcmp(...))" though. I don't know if > > you can catch *any* use which degrades to boolean outside of an if > > statement, but I wouldn't expect there to be too many of those? > > Yeah, that was my thought, too. And I've been trying this all afternoon > without success. Why doesn't this work: > > @@ > expression a, b; > @@ > - if (oidcmp(a, b)) > + if (!oideq(a, b)) > > I get: > > Fatal error: exception Failure("minus: parse error: \n = File > \"contrib/coccinelle/oideq.cocci\", line 21, column 0, charpos = > 221\naround = '', whole content = \n") > > If I do: > > - if (oidcmp(a, b)) { ... } > > that seems to please the parser for the minus line. But I cannot include > the "..." on the plus line. Clearly the "..." part should be context, > but I can't seem to find the right syntax. This almost works: @@ expression a, b; statement s; @@ - if (oidcmp(a, b)) s + if (!oideq(a, b)) s It generates this, for example: diff -u -p a/bisect.c b/bisect.c --- a/bisect.c +++ b/bisect.c @@ -595,7 +595,7 @@ static struct commit_list *skip_away(str for (i = 0; cur; cur = cur->next, i++) { if (i == index) { - if (oidcmp(>item->object.oid, current_bad_oid)) + if (!oideq(>item->object.oid, current_bad_oid)) return cur; if (previous) return previous; which is what we want. But it also generates this: diff -u -p a/bundle.c b/bundle.c --- a/bundle.c +++ b/bundle.c @@ -369,25 +369,11 @@ static int write_bundle_refs(int bundle_ * commit that is referenced by the tag, and not the tag * itself. */ - if (oidcmp(, >item->oid)) { - /* -* Is this the positive end of a range expressed -* in terms of a tag (e.g. v2.0 from the range -* "v1.0..v2.0")? -*/ - struct commit *one = lookup_commit_reference(the_repository, -); + if (!oideq(, >item->oid)) { + struct commit *one=lookup_commit_reference(the_repository, + ); struct object *obj; - if (e->item == &(one->object)) { - /* -* Need to include e->name as an -* independent ref to the pack-objects -* input, so that the tag is included -* in the output; otherwise we would -* end up triggering "empty bundle" -* error. -*/ obj = parse_object_or_die(, e->name); obj->flags |= SHOWN; add_pending_object(revs, obj, e->name); So I really do want some way of saying "all of the block, no matter what it is". Or of leaving it out as context. -Peff
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 04:30:10PM -0700, Jacob Keller wrote: > On Thu, Aug 23, 2018 at 9:30 AM Jeff King wrote: > > I think that audit isn't actually too bad (but definitely not something > > we should do for v2.19). The cocci patch I showed earlier hits most of > > them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not > > sure if there's a way to ask coccinelle only for oidcmp() > > used in a boolean context. > > > > You can look for explicitly "if (oidcmp(...))" though. I don't know if > you can catch *any* use which degrades to boolean outside of an if > statement, but I wouldn't expect there to be too many of those? Yeah, that was my thought, too. And I've been trying this all afternoon without success. Why doesn't this work: @@ expression a, b; @@ - if (oidcmp(a, b)) + if (!oideq(a, b)) I get: Fatal error: exception Failure("minus: parse error: \n = File \"contrib/coccinelle/oideq.cocci\", line 21, column 0, charpos = 221\naround = '', whole content = \n") If I do: - if (oidcmp(a, b)) { ... } that seems to please the parser for the minus line. But I cannot include the "..." on the plus line. Clearly the "..." part should be context, but I can't seem to find the right syntax. FWIW, I do have patches adding hasheq() and converting all of the !oidcmp() cases. I may resort to hand-investigating each of the negated ones, but I really feel like I should be able to do better with coccinelle. -Peff
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 9:30 AM Jeff King wrote: > I think that audit isn't actually too bad (but definitely not something > we should do for v2.19). The cocci patch I showed earlier hits most of > them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not > sure if there's a way to ask coccinelle only for oidcmp() > used in a boolean context. > You can look for explicitly "if (oidcmp(...))" though. I don't know if you can catch *any* use which degrades to boolean outside of an if statement, but I wouldn't expect there to be too many of those? Thanks, Jake
Re: [PATCH v3] range-diff: update stale summary of --no-dual-color
Junio C Hamano wrote: > Jonathan Nieder writes: >> Kyle Meyer wrote: >>> Subject: [PATCH v3] range-diff: update stale summary of --no-dual-color [...] >> Reviewed-by: Jonathan Nieder > > Sorry, too late. I'll revert the merge of the previous round out of > 'next' and requeue this one, but that will have to wait until the > next integration cycle. Thanks for the heads up. Sounds like a fine plan. Jonathan
Re: [PATCH v3] range-diff: update stale summary of --no-dual-color
Jonathan Nieder writes: > Kyle Meyer wrote: > >> 275267937b (range-diff: make dual-color the default mode, 2018-08-13) >> replaced --dual-color with --no-dual-color but left the option's >> summary untouched. Rewrite the summary to describe --no-dual-color >> rather than dual-color. >> >> Helped-by: Jonathan Nieder > > Ha, I don't think I deserve much credit here, but I'll take it. ;-) > >> Helped-by: Johannes Schindelin >> Signed-off-by: Kyle Meyer >> --- >> builtin/range-diff.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Reviewed-by: Jonathan Nieder Sorry, too late. I'll revert the merge of the previous round out of 'next' and requeue this one, but that will have to wait until the next integration cycle.
Re: [PATCH v3] range-diff: update stale summary of --no-dual-color
Kyle Meyer wrote: > 275267937b (range-diff: make dual-color the default mode, 2018-08-13) > replaced --dual-color with --no-dual-color but left the option's > summary untouched. Rewrite the summary to describe --no-dual-color > rather than dual-color. > > Helped-by: Jonathan Nieder Ha, I don't think I deserve much credit here, but I'll take it. ;-) > Helped-by: Johannes Schindelin > Signed-off-by: Kyle Meyer > --- > builtin/range-diff.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Jonathan Nieder
[PATCH v3] range-diff: update stale summary of --no-dual-color
275267937b (range-diff: make dual-color the default mode, 2018-08-13) replaced --dual-color with --no-dual-color but left the option's summary untouched. Rewrite the summary to describe --no-dual-color rather than dual-color. Helped-by: Jonathan Nieder Helped-by: Johannes Schindelin Signed-off-by: Kyle Meyer --- builtin/range-diff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/range-diff.c b/builtin/range-diff.c index f52d45d9d6..0aa9bed41f 100644 --- a/builtin/range-diff.c +++ b/builtin/range-diff.c @@ -25,7 +25,7 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix) OPT_INTEGER(0, "creation-factor", _factor, N_("Percentage by which creation is weighted")), OPT_BOOL(0, "no-dual-color", _color, - N_("color both diff and diff-between-diffs")), + N_("use simple diff colors")), OPT_END() }; int i, j, res = 0; -- 2.18.0
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Kyle Meyer writes: > Junio C Hamano writes: > >> Johannes Schindelin writes: > > [...] > - N_("color both diff and diff-between-diffs")), + N_("restrict coloring to outer diff markers")), >>> >>> How about "use simple diff colors" instead? > > That's certainly better than the one above, and I also prefer it to > "color only based on the diff-between-diffs" in v2. > >> I am wondering if it makes sense to remove the option altogether. >> I've been trying to view the comparison of the same ranges in both >> styles for the past few days, and I never found a reason to choose >> "no dual color" option myself. > > But I like this suggestion even better. That wasn't even a suggestion. I just did not see the point of the optional behaviour myself, and was soliciting concrete examples where the --no-dual mode would help.
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Hi Junio, On Thu, 23 Aug 2018, Junio C Hamano wrote: > Johannes Schindelin writes: > > > On Wed, 22 Aug 2018, Kyle Meyer wrote: > > > >> 275267937b (range-diff: make dual-color the default mode, 2018-08-13) > >> replaced --dual-color with --no-dual-color but left the option's > >> summary untouched. Rewrite the summary to describe --no-dual-color > >> rather than dual-color. > >> > >> Signed-off-by: Kyle Meyer > >> --- > >> builtin/range-diff.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/builtin/range-diff.c b/builtin/range-diff.c > >> index f52d45d9d6..7dc90a5ec3 100644 > >> --- a/builtin/range-diff.c > >> +++ b/builtin/range-diff.c > >> @@ -25,7 +25,7 @@ int cmd_range_diff(int argc, const char **argv, const > >> char *prefix) > >>OPT_INTEGER(0, "creation-factor", _factor, > >>N_("Percentage by which creation is weighted")), > >>OPT_BOOL(0, "no-dual-color", _color, > >> - N_("color both diff and diff-between-diffs")), > >> + N_("restrict coloring to outer diff markers")), > > > > How about "use simple diff colors" instead? > > I am wondering if it makes sense to remove the option altogether. > I've been trying to view the comparison of the same ranges in both > styles for the past few days, and I never found a reason to choose > "no dual color" option myself. We do have a track record of making decisions based on our little bubble, don't we. On IRC, there is at least on publicly viewable comment by a user who preferred the simple color diff, at least in one use case: http://colabti.org/irclogger/irclogger_log/git-devel?date=2018-07-13#l97 And I am living in my own bubble, too. I think I heard feedback regarding range-diff from some dozen people. Multiplying 6% by the download numbers of Git for Windows alone... that's a lot of people who can put --no-dual-color to good use at least in *some* situations. In short: I am hesitant to remove a feature that would help some users. Ciao, Dscho
Re: [PATCH] i18n: fix mistakes in translated strings
Jean-Noël Avila writes: > - die(_("run_command returned non-zero status while" > + die(_("run_command returned non-zero status while " > "recursing in the nested submodules of %s\n."), Obviously good. > diff --git a/config.c b/config.c > index 9a0b10d4bc..3461993f0a 100644 > --- a/config.c > +++ b/config.c > @@ -124,7 +124,7 @@ static const char include_depth_advice[] = N_( > "%s\n" > "from\n" > "%s\n" > -"Do you have circular includes?"); > +"This might be due to circular includes."); OK. > diff --git a/sequencer.c b/sequencer.c > index 65d371c746..84bf598c3e 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -720,7 +720,7 @@ static const char *read_author_ident(struct strbuf *buf) > /* dequote values and construct ident line in-place */ > for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) { > if (!skip_prefix(in, keys[i], (const char **))) { > - warning(_("could not parse '%s' (looking for '%s'"), > + warning(_("could not parse '%s' (looking for '%s')"), Good. Thanks.
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Junio C Hamano writes: > Johannes Schindelin writes: [...] >>> - N_("color both diff and diff-between-diffs")), >>> + N_("restrict coloring to outer diff markers")), >> >> How about "use simple diff colors" instead? That's certainly better than the one above, and I also prefer it to "color only based on the diff-between-diffs" in v2. > I am wondering if it makes sense to remove the option altogether. > I've been trying to view the comparison of the same ranges in both > styles for the past few days, and I never found a reason to choose > "no dual color" option myself. But I like this suggestion even better.
Re: [GSoC][PATCH v6 15/20] rebase -i: rewrite write_basic_state() in C
Hi Phillip, On Fri, 17 Aug 2018, Phillip Wood wrote: > On 10/08/2018 17:51, Alban Gruin wrote: > > > +{ > > + const char *quiet = getenv("GIT_QUIET"); > > + > > + if (head_name) > > + write_file(rebase_path_head_name(), "%s\n", head_name); > > write_file() can call die() which isn't encouraged for code in libgit. > I'm not sure how much it matters in this case. Rewriting all these as > > if (head_name && write_message(onto, strlen(onto), rebase_path_onto(), > 1)) > return -1; > > is a bit tedious. An alternative would be it leave it for now and in the > longer term move this function (and the ones above which I've just > noticed also call write_file()) to in builtin/rebase.c (assuming that > builtin/rebase--interactive.c and builtin/rebase.c get merged once > they're finalized - I'm not sure if there is a plan for that or not.) This came up in the review, and Alban said exactly what you did. I then even dragged Peff into the discussion, as it was his idea to change `write_file()` from returning an `int` to returning a `void` (instead of libifying the function so that it would not `die()` in error cases and `return 0` otherwise): https://github.com/git/git/pull/518#discussion_r200606997 Christian Couder (one of Alban's mentors) then even jumped in and *agreed* that libifying code "could be seen as unnecessary code churn and rejected." In light of these two respected community members suggesting to Alban to go and not give a flying fish about proper error handling, I have to admit that I am sympathetic to Alban simply using `write_file()` as-is. I do agree with you, of course, that the over-use of `die()` in our code base is a pretty bad thing. But that's neither Alban's fault, nor should he be punished for the advice he has been given. In short: I agree with you that `write_file()` should be libified properly, and I would suggest not to burden Alban with this (Alban, of course you should feel free to work on this if this is something you care about, too). Ciao, Dscho
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Johannes Schindelin writes: > On Wed, 22 Aug 2018, Kyle Meyer wrote: > >> 275267937b (range-diff: make dual-color the default mode, 2018-08-13) >> replaced --dual-color with --no-dual-color but left the option's >> summary untouched. Rewrite the summary to describe --no-dual-color >> rather than dual-color. >> >> Signed-off-by: Kyle Meyer >> --- >> builtin/range-diff.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/builtin/range-diff.c b/builtin/range-diff.c >> index f52d45d9d6..7dc90a5ec3 100644 >> --- a/builtin/range-diff.c >> +++ b/builtin/range-diff.c >> @@ -25,7 +25,7 @@ int cmd_range_diff(int argc, const char **argv, const char >> *prefix) >> OPT_INTEGER(0, "creation-factor", _factor, >> N_("Percentage by which creation is weighted")), >> OPT_BOOL(0, "no-dual-color", _color, >> -N_("color both diff and diff-between-diffs")), >> +N_("restrict coloring to outer diff markers")), > > How about "use simple diff colors" instead? I am wondering if it makes sense to remove the option altogether. I've been trying to view the comparison of the same ranges in both styles for the past few days, and I never found a reason to choose "no dual color" option myself.
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Hi, Johannes Schindelin wrote: > On Wed, 22 Aug 2018, Jonathan Nieder wrote: >> OPT_INTEGER(0, "creation-factor", _factor, >> N_("Percentage by which creation is weighted")), >> -OPT_BOOL(0, "no-dual-color", _color, >> -N_("color both diff and diff-between-diffs")), >> +OPT_BOOL(0, "dual-color", _color, >> +N_("color both diff and diff-between-diffs >> (default)")), > > There is one very good reason *not* to do that. And that reason is the > output of `git range-diff -h`. If anybody read that the option > `--dual-color` exists, they are prone to believe that the default is *not* > dual color. In contrast, when reading `--no-dual-color`, it is clear that > dual color mode is the default. The whole patch is about "git range-diff -h" output, and of course I tested it. Did you see the "(default)" part of the string in the patch? That said, the conversation continued and I agree with the conclusion it led to (which is better than the patch you're replying to). Thanks, Jonathan
Re: [PATCH v2] range-diff: update stale summary of --no-dual-color
Kyle Meyer wrote: > OPT_BOOL(0, "no-dual-color", _color, > - N_("color both diff and diff-between-diffs")), > + N_("color only based on the diff-between-diffs")), Reviewed-by: Jonathan Nieder Dscho's suggestion "use simple diff colors" also sounds fine to me (probably even better). Thanks, Jonathan
Re: [GSoC][PATCH v6 18/20] rebase--interactive2: rewrite the submodes of interactive rebase in C
Hi Junio, On Wed, 22 Aug 2018, Junio C Hamano wrote: > Johannes Schindelin writes: > > > I made this same mistake over and over again, myself. For some reason, > > John Keeping decided to use the singular form "revision" in 1e0dacdbdb75 > > (rebase: omit patch-identical commits with --fork-point, 2014-07-16), not > > the plural. > > Perhaps we should give a synonym to the option? Renaming it to the > plural form may keep the existing usage working as it would be the > unique abbreviation, which may be a way to reduce the mistake. The obvious plan is to switch from spawning a separate process after parsing the `git rebase` options just to execute the interactive rebase to performing both parts in the same process. In other words: this option will simply go away. I'd much rather spend time and effort on designing a nice API for calling the rebase backends in-process than on adding code that adds some plural form (which might not even be appropriate, given that you really can only pass one single negative revision to restrict the commit range). Ciao, Dscho > > > > > So you will need to squash this in: > > > > -- snipsnap -- > > diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh > > index fb0395af5b1..7600765f541 100755 > > --- a/git-legacy-rebase.sh > > +++ b/git-legacy-rebase.sh > > @@ -145,8 +145,8 @@ run_interactive () { > > test -n "$autosquash" && autosquash="--autosquash" > > test -n "$verbose" && verbose="--verbose" > > test -n "$force_rebase" && force_rebase="--no-ff" > > - test -n "$restrict_revisions" && \ > > - restrict_revisions="--restrict-revisions=^$restrict_revisions" > > + test -n "$restrict_revision" && \ > > + restrict_revision="--restrict-revision=^$restrict_revision" > > test -n "$upstream" && upstream="--upstream=$upstream" > > test -n "$onto" && onto="--onto=$onto" > > test -n "$squash_onto" && squash_onto="--squash-onto=$squash_onto" >
[PATCH] i18n: fix mistakes in translated strings
Fix typos and convert a question which does not expect to be replied to a simple advice. Signed-off-by: Jean-Noël Avila --- builtin/submodule--helper.c | 2 +- config.c| 2 +- sequencer.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 2bcc70fdfe..b56028ba9d 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -542,7 +542,7 @@ static void runcommand_in_submodule_cb(const struct cache_entry *list_item, argv_array_pushv(, info->argv); if (run_command()) - die(_("run_command returned non-zero status while" + die(_("run_command returned non-zero status while " "recursing in the nested submodules of %s\n."), displaypath); } diff --git a/config.c b/config.c index 9a0b10d4bc..3461993f0a 100644 --- a/config.c +++ b/config.c @@ -124,7 +124,7 @@ static const char include_depth_advice[] = N_( " %s\n" "from\n" " %s\n" -"Do you have circular includes?"); +"This might be due to circular includes."); static int handle_path_include(const char *path, struct config_include_data *inc) { int ret = 0; diff --git a/sequencer.c b/sequencer.c index 65d371c746..84bf598c3e 100644 --- a/sequencer.c +++ b/sequencer.c @@ -720,7 +720,7 @@ static const char *read_author_ident(struct strbuf *buf) /* dequote values and construct ident line in-place */ for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) { if (!skip_prefix(in, keys[i], (const char **))) { - warning(_("could not parse '%s' (looking for '%s'"), + warning(_("could not parse '%s' (looking for '%s')"), rebase_path_author_script(), keys[i]); return NULL; } -- 2.18.0
Re: [PATCH v2 1/1] t2024: mark a `checkout -p` test as requiring Perl
Hi Ævar, On Thu, 23 Aug 2018, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Aug 23 2018, Johannes Schindelin via GitGitGadget wrote: > > > From: Johannes Schindelin > > > > A recently-added test case tries to verify that the output of `checkout > > -p` contains a certain piece of advice. > > > > But if Git was built without Perl and therefore lacks support for `git > > add -i`, the error output contains the hint that `-p` is not even > > available instead. > > > > Let's just skip that test case altogether if Git was built with NO_PERL. > > > > Signed-off-by: Johannes Schindelin > > --- > > t/t2024-checkout-dwim.sh | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh > > index 26dc3f1fc0..29e1e25300 100755 > > --- a/t/t2024-checkout-dwim.sh > > +++ b/t/t2024-checkout-dwim.sh > > @@ -76,7 +76,8 @@ test_expect_success 'checkout of branch from multiple > > remotes fails #1' ' > > test_branch master > > ' > > > > -test_expect_success 'checkout of branch from multiple remotes fails with > > advice' ' > > +test_expect_success NO_PERL \ > > + 'checkout of branch from multiple remotes fails with advice' ' > > git checkout -B master && > > test_might_fail git branch -D foo && > > test_must_fail git checkout foo 2>stderr && > > This issue is already fixed in master as 3338e9950e ("t2024: mark test > using "checkout -p" with PERL prerequisite", 2018-08-18). Excellent, Dscho
Re: [ANNOUNCE] Git v2.19.0-rc0
On 8/23/2018 2:53 PM, Jeff King wrote: On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote: I think you can safely ignore the rest of it if you are otherwise occupied. Even if v2.19 ships without some mitigation, I don't know that it's all that big a deal, given the numbers I generated (which for some reason are less dramatic than Stolee's). My numbers may be more dramatic because my Linux environment is a virtual machine. If you have a chance, can you run p0001 on my patch (compared to 2.19-rc0, or to both v2.18 and v2.19-rc0)? It would be nice to double check that it really is fixing the problem you saw. Sure. Note: I had to create a new Linux VM on a different machine between Tuesday and today, so the absolute numbers are different. Using git/git: Test v2.18.0 v2.19.0-rc0 HEAD - 0001.2: 3.10(3.02+0.08) 3.27(3.17+0.09) +5.5% 3.14(3.02+0.11) +1.3% Using torvalds/linux: Test v2.18.0 v2.19.0-rc0 HEAD -- 0001.2: 56.08(45.91+1.50) 56.60(46.62+1.50) +0.9% 54.61(45.47+1.46) -2.6% Now here is where I get on my soapbox (and create a TODO for myself later). I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests that the results should be _more_ accurate than the default of 3. However, I then remember that we only report the *minimum* time from all the runs, which is likely to select an outlier from the distribution. To test this, I ran a few tests manually and found the variation between runs to be larger than 3%. When I choose my own metrics for performance tests, I like to run at least 10 runs, remove the largest AND smallest runs from the samples, and then take the average. I did this manually for 'git rev-list --all --objects' on git/git and got the following results: v2.18.0 v2.19.0-rc0 HEAD 3.126 s 3.308 s 3.170 s For full disclosure, here is a full table including all samples: | | v2.18.0 | v2.19.0-rc0 | HEAD | |--|-|-|-| | | 4.58 | 3.302 | 3.239 | | | 3.13 | 3.337 | 3.133 | | | 3.213 | 3.291 | 3.159 | | | 3.219 | 3.318 | 3.131 | | | 3.077 | 3.302 | 3.163 | | | 3.074 | 3.328 | 3.119 | | | 3.022 | 3.277 | 3.125 | | | 3.083 | 3.259 | 3.203 | | | 3.057 | 3.311 | 3.223 | | | 3.155 | 3.413 | 3.225 | | Max | 4.58 | 3.413 | 3.239 | | Min | 3.022 | 3.259 | 3.119 | | Avg* | 3.126 | 3.30825 | 3.17025 | (Note that the largest one was the first run, on v2.18.0, which is due to a cold disk.) I just kicked off a script that will run this test on the Linux repo while I drive home. I'll be able to report a similar table of data easily. My TODO is to consider aggregating the data this way (or with a median) instead of reporting the minimum. Thanks, -Stolee
Re: [PATCH v3 3/5] tests: use shorter here-docs in chainlint.sed for AIX sed
On Thu, Aug 23, 2018 at 4:36 PM Ævar Arnfjörð Bjarmason wrote: > As noted in [1] there's still a remaining recently introduced > portability issue also introduced in 878f988350 ("t/test-lib: teach > --chain-lint to detect broken &&-chains in subshells", 2018-07-11), so > under AIX the tests must be run with GIT_TEST_CHAIN_LINT=0. > > I don't know how to solve the other issue, and this gets us some of > the way to GIT_TEST_CHAIN_LINT=1 working again on AIX. Does unindenting the comment, as I suggested in [1], fix the remaining problem for you? [1]: https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4zh6zdt1oqchac...@mail.gmail.com/
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Hi Kyle, On Wed, 22 Aug 2018, Kyle Meyer wrote: > 275267937b (range-diff: make dual-color the default mode, 2018-08-13) > replaced --dual-color with --no-dual-color but left the option's > summary untouched. Rewrite the summary to describe --no-dual-color > rather than dual-color. > > Signed-off-by: Kyle Meyer > --- > builtin/range-diff.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/builtin/range-diff.c b/builtin/range-diff.c > index f52d45d9d6..7dc90a5ec3 100644 > --- a/builtin/range-diff.c > +++ b/builtin/range-diff.c > @@ -25,7 +25,7 @@ int cmd_range_diff(int argc, const char **argv, const char > *prefix) > OPT_INTEGER(0, "creation-factor", _factor, > N_("Percentage by which creation is weighted")), > OPT_BOOL(0, "no-dual-color", _color, > - N_("color both diff and diff-between-diffs")), > + N_("restrict coloring to outer diff markers")), How about "use simple diff colors" instead? Ciao, Johannes
Re: [PATCH] t/lib-rebase.sh: support explicit 'pick' commands in 'fake_editor.sh'
Hi Gábor, On Thu, 23 Aug 2018, SZEDER Gábor wrote: > The verbose output of the test 'reword without issues functions as > intended' in 't3423-rebase-reword.sh', added in a9279c6785 (sequencer: > do not squash 'reword' commits when we hit conflicts, 2018-06-19), > contains the following error output: > > sed: -e expression #1, char 2: extra characters after command > > This error comes from within the 'fake-editor.sh' script created by > 'lib-rebase.sh's set_fake_editor() function, and the root cause is the > FAKE_LINES="pick 1 reword 2" variable in the test in question, in > particular the "pick" word. 'fake-editor.sh' assumes 'pick' to be the > default rebase command and doesn't support an explicit 'pick' command > in FAKE_LINES. As a result, 'pick' will be used instead of a line > number when assembling the following 'sed' script: > > sed -n picks/^pick/pick/p > > which triggers the aforementioned error. > > Luckily, this didn't affect the test's correctness: the erroring 'sed' > command doesn't write anything to the todo script, and processing the > rest of FAKE_LINES generates the desired todo script, as if that > 'pick' command were not there at all. > > The minimal fix would be to remove the 'pick' word from FAKE_LINES, > but that would leave us susceptible to similar issues in the future. > > Instead, teach the fake-editor script to recognize an explicit 'pick' > command, which is still a fairly trivial change. > > In the future we might want to consider reinforcing this fake editor > script with an &&-chain and stricter parsing of the FAKE_LINES > variable (e.g. to error out when encountering unknown rebase commands > or commands and line numbers in the wrong order). > > Signed-off-by: SZEDER Gábor ACK! Thank you very much, Dscho > --- > t/lib-rebase.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh > index 25a77ee5cb..592865d019 100644 > --- a/t/lib-rebase.sh > +++ b/t/lib-rebase.sh > @@ -47,7 +47,7 @@ set_fake_editor () { > action=pick > for line in $FAKE_LINES; do > case $line in > - squash|fixup|edit|reword|drop) > + pick|squash|fixup|edit|reword|drop) > action="$line";; > exec*) > echo "$line" | sed 's/_/ /g' >> "$1";; > -- > 2.19.0.rc0.136.gd2dd172e64 > >
Re: [PATCH v3 5/5] tests: fix and add lint for non-portable grep --file
Ævar Arnfjörð Bjarmason writes: > The --file option to grep isn't in POSIX[1], but -f is[1]. Let's check > for that in the future, and fix the portability regression in > f237c8b6fe ("commit-graph: implement git-commit-graph write", > 2018-04-02) that broke e.g. AIX. > > 1. http://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html > > Signed-off-by: Ævar Arnfjörð Bjarmason > --- Thanks. > t/check-non-portable-shell.pl | 1 + > t/t5318-commit-graph.sh | 2 +- > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl > index 75f38298d7..b45bdac688 100755 > --- a/t/check-non-portable-shell.pl > +++ b/t/check-non-portable-shell.pl > @@ -43,6 +43,7 @@ sub err { > /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use > test_line_count)'; > /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes > BYTES out)'; > /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; > + /\bgrep\b.*--file\b/ and err 'grep --file FILE is not portable (use > grep -f FILE)'; > /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable > (use FOO=bar && export FOO)'; > /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and > err '"FOO=bar shell_func" assignment extends beyond > "shell_func"'; > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh > index 3c1ffad491..0c500f7ca2 100755 > --- a/t/t5318-commit-graph.sh > +++ b/t/t5318-commit-graph.sh > @@ -134,7 +134,7 @@ test_expect_success 'Add one more commit' ' > git branch commits/8 && > ls $objdir/pack | grep idx >existing-idx && > git repack && > - ls $objdir/pack| grep idx | grep -v --file=existing-idx >new-idx > + ls $objdir/pack| grep idx | grep -v -f existing-idx >new-idx > ' > > # Current graph structure:
Re: [PATCH v3 3/5] tests: use shorter here-docs in chainlint.sed for AIX sed
Ævar Arnfjörð Bjarmason writes: > Improve the portability of chainlint by using shorter here-docs. On > AIX sed will complain about: > > sed: 0602-417 The label :hereslurp is greater than eight > characters > > As noted in [1] there's still a remaining recently introduced > portability issue also introduced in 878f988350 ("t/test-lib: teach > --chain-lint to detect broken &&-chains in subshells", 2018-07-11), so > under AIX the tests must be run with GIT_TEST_CHAIN_LINT=0. > > I don't know how to solve the other issue, and this gets us some of > the way to GIT_TEST_CHAIN_LINT=1 working again on AIX. > > 1. https://public-inbox.org/git/871sapezba@evledraar.gmail.com/ > > Signed-off-by: Ævar Arnfjörð Bjarmason > --- I'll globally do s/here-doc/label/ while queueing. POSIX says "The implementation shall support label arguments recognized as unique up to at least 8 bytes", so replacing these labels to shorter strings makes perfect sense. Will queue; thanks. > t/chainlint.sed | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/t/chainlint.sed b/t/chainlint.sed > index 8544df38df..2333705b27 100644 > --- a/t/chainlint.sed > +++ b/t/chainlint.sed > @@ -97,11 +97,11 @@ > /<<[ ]*[-\\']*[A-Za-z0-9_]/ { > s/^\(.*\)<<[]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1< s/[ ]*< - :hereslurp > + :hered > N > /^<\([^>]*\)>.*\n[ ]*\1[ ]*$/!{ > s/\n.*$// > - bhereslurp > + bhered > } > s/^<[^>]*>// > s/\n.*$// > @@ -283,11 +283,11 @@ bfolded > :heredoc > s/^\(.*\)<<[ ]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1< s/[ ]*< -:hereslurpsub > +:heredsub > N > /^<\([^>]*\)>.*\n[ ]*\1[ ]*$/!{ > s/\n.*$// > - bhereslurpsub > + bheredsub > } > s/^<[^>]*>// > s/\n.*$//
[PATCH v3 1/5] tests: fix and add lint for non-portable head -c N
The "head -c BYTES" option is non-portable (not in POSIX[1]). Change such invocations to use the test_copy_bytes wrapper added in 48860819e8 ("t9300: factor out portable "head -c" replacement", 2016-06-30). This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check mmap reads", 2018-06-14), which has been breaking t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports already have a similar workaround after their upgrade to 2.18.0[2]. I have not tested this on IRIX, but according to 4de0bbd898 ("t9300: use perl "head -c" clone in place of "dd bs=1 count=16000" kluge", 2010-12-13) this invocation would have broken things there too. Also, change a valgrind-specific codepath in test-lib.sh to use this wrapper. Given where valgrind runs I don't think this would ever become a portability issue in practice, but it's easier to just use the wrapper than introduce some exception for the "make test-lint" check being added here. 1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html 2. https://github.com/openbsd/ports/commit/08d5d82eaefe5cf2f125ecc0c6a57df9cf91350c#diff-f7d3c4fabeed1691620d608f1534f5e5 Signed-off-by: Ævar Arnfjörð Bjarmason --- t/check-non-portable-shell.pl | 1 + t/t5310-pack-bitmaps.sh | 2 +- t/test-lib.sh | 4 ++-- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index d5823f71d8..c8f10d40a1 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -41,6 +41,7 @@ sub err { /^\s*[^#]\s*which\s/ and err 'which is not portable (use type)'; /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)'; + /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)'; /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and err '"FOO=bar shell_func" assignment extends beyond "shell_func"'; diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 557bd0d0c0..7bff7923f2 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' ' git rev-list --use-bitmap-index --count --all >expect && bitmap=$(ls .git/objects/pack/*.bitmap) && test_when_finished "rm -f $bitmap" && - head -c 512 <$bitmap >$bitmap.tmp && + test_copy_bytes 512 <$bitmap >$bitmap.tmp && mv -f $bitmap.tmp $bitmap && git rev-list --use-bitmap-index --count --all >actual 2>stderr && test_cmp expect actual && diff --git a/t/test-lib.sh b/t/test-lib.sh index 8bb0f4348e..44288cbb59 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -867,7 +867,7 @@ then # handle only executables, unless they are shell libraries that # need to be in the exec-path. test -x "$1" || - test "# " = "$(head -c 2 <"$1")" || + test "# " = "$(test_copy_bytes 2 <"$1")" || return; base=$(basename "$1") @@ -882,7 +882,7 @@ then # do not override scripts if test -x "$symlink_target" && test ! -d "$symlink_target" && - test "#!" != "$(head -c 2 < "$symlink_target")" + test "#!" != "$(test_copy_bytes 2 <"$symlink_target")" then symlink_target=../valgrind.sh fi -- 2.18.0.865.gffc8e1a3cd6
[PATCH v3 3/5] tests: use shorter here-docs in chainlint.sed for AIX sed
Improve the portability of chainlint by using shorter here-docs. On AIX sed will complain about: sed: 0602-417 The label :hereslurp is greater than eight characters As noted in [1] there's still a remaining recently introduced portability issue also introduced in 878f988350 ("t/test-lib: teach --chain-lint to detect broken &&-chains in subshells", 2018-07-11), so under AIX the tests must be run with GIT_TEST_CHAIN_LINT=0. I don't know how to solve the other issue, and this gets us some of the way to GIT_TEST_CHAIN_LINT=1 working again on AIX. 1. https://public-inbox.org/git/871sapezba@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason --- t/chainlint.sed | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/t/chainlint.sed b/t/chainlint.sed index 8544df38df..2333705b27 100644 --- a/t/chainlint.sed +++ b/t/chainlint.sed @@ -97,11 +97,11 @@ /<<[ ]*[-\\']*[A-Za-z0-9_]/ { s/^\(.*\)<<[]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1<]*\)>.*\n[ ]*\1[ ]*$/!{ s/\n.*$// - bhereslurp + bhered } s/^<[^>]*>// s/\n.*$// @@ -283,11 +283,11 @@ bfolded :heredoc s/^\(.*\)<<[ ]*[-\\']*\([A-Za-z0-9_][A-Za-z0-9_]*\)'*/<\2>\1<]*\)>.*\n[ ]*\1[ ]*$/!{ s/\n.*$// - bhereslurpsub + bheredsub } s/^<[^>]*>// s/\n.*$// -- 2.18.0.865.gffc8e1a3cd6
[PATCH v3 4/5] tests: fix version-specific portability issue in Perl JSON
The test guarded by PERLJSON added in 75459410ed ("json_writer: new routines to create JSON data", 2018-07-13) assumed that a JSON boolean value like "true" or "false" would be represented as "1" or "0" in Perl. This behavior can't be relied upon, e.g. with JSON.pm 2.50 and JSON::PP A JSON::PP::Boolean object will be represented as "true" or "false". To work around this let's check if we have any refs left after we check for hashes and arrays, assume those are JSON objects, and coerce them to a known boolean value. The behavior of this test still looks odd to me. Why implement our own ad-hoc encoder just for some one-off test, as opposed to say Perl's own Data::Dumper with Sortkeys et al? But with this change it works, so let's leave it be. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t0019/parse_json.perl | 3 +++ 1 file changed, 3 insertions(+) diff --git a/t/t0019/parse_json.perl b/t/t0019/parse_json.perl index ca4e5bfa78..fea87fb81b 100644 --- a/t/t0019/parse_json.perl +++ b/t/t0019/parse_json.perl @@ -34,6 +34,9 @@ sub dump_item { } elsif (ref($value) eq 'HASH') { print "$label_in hash\n"; dump_hash($label_in, $value); +} elsif (ref $value) { + my $bool = $value ? 1 : 0; + print "$label_in $bool\n"; } elsif (defined $value) { print "$label_in $value\n"; } else { -- 2.18.0.865.gffc8e1a3cd6
[PATCH v3 2/5] tests: fix and add lint for non-portable seq
The seq command is not in POSIX, and doesn't exist on e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests: Introduce test_seq", 2012-08-04), but use of it keeps coming back, e.g. in the recently added "fetch negotiator" tests being added here. So let's also add a check to "make test-lint". The regex is aiming to capture the likes of $(seq ..) and "seq" as a stand-alone command, without capturing some existing cases where we e.g. have files called "seq", as \bseq\b would do. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/check-non-portable-shell.pl| 1 + t/t5552-skipping-fetch-negotiator.sh | 12 ++-- t/t5703-upload-pack-ref-in-want.sh | 4 ++-- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index c8f10d40a1..75f38298d7 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -42,6 +42,7 @@ sub err { /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)'; /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; + /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)'; /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and err '"FOO=bar shell_func" assignment extends beyond "shell_func"'; diff --git a/t/t5552-skipping-fetch-negotiator.sh b/t/t5552-skipping-fetch-negotiator.sh index 5ad5bece55..30857b84a8 100755 --- a/t/t5552-skipping-fetch-negotiator.sh +++ b/t/t5552-skipping-fetch-negotiator.sh @@ -46,7 +46,7 @@ test_expect_success 'commits with no parents are sent regardless of skip distanc test_commit -C server to_fetch && git init client && - for i in $(seq 7) + for i in $(test_seq 7) do test_commit -C client c$i done && @@ -89,7 +89,7 @@ test_expect_success 'when two skips collide, favor the larger one' ' test_commit -C server to_fetch && git init client && - for i in $(seq 11) + for i in $(test_seq 11) do test_commit -C client c$i done && @@ -168,14 +168,14 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC test_commit -C server to_fetch && git init client && - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout --orphan b$i && test_commit -C client b$i.c0 done && - for j in $(seq 19) + for j in $(test_seq 19) do - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout b$i && test_commit -C client b$i.c$j @@ -205,7 +205,7 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC # fetch-pack should thus not send any more commits in the b1 branch, but # should still send the others (in this test, just check b2). - for i in $(seq 0 8) + for i in $(test_seq 0 8) do have_not_sent b1.c$i done && diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index a73c55a47e..d1ccc22331 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -176,7 +176,7 @@ test_expect_success 'setup repos for change-while-negotiating test' ' git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo; "$LOCAL_PRISTINE" && cd "$LOCAL_PRISTINE" && git checkout -b side && - for i in $(seq 1 33); do test_commit s$i; done && + for i in $(test_seq 1 33); do test_commit s$i; done && # Add novel commits to upstream git checkout master && @@ -289,7 +289,7 @@ test_expect_success 'setup repos for fetching with ref-in-want tests' ' git clone "file://$REPO" "$LOCAL_PRISTINE" && cd "$LOCAL_PRISTINE" && git checkout -b side && - for i in $(seq 1 33); do test_commit s$i; done && + for i in $(test_seq 1 33); do test_commit s$i; done && # Add novel commits to upstream git checkout master && -- 2.18.0.865.gffc8e1a3cd6
[PATCH v3 0/5] OpenBSD & AIX etc. portability fixes
This grew a bit more. I'm going to stop poking at this for now. The tests are still broken on OpenBSD (3-5 broken) and on AIX something like 20-30 are broken, but this makes it slightly better. Ævar Arnfjörð Bjarmason (5): tests: fix and add lint for non-portable head -c N tests: fix and add lint for non-portable seq tests: use shorter here-docs in chainlint.sed for AIX sed tests: fix version-specific portability issue in Perl JSON tests: fix and add lint for non-portable grep --file t/chainlint.sed | 8 t/check-non-portable-shell.pl| 3 +++ t/t0019/parse_json.perl | 3 +++ t/t5310-pack-bitmaps.sh | 2 +- t/t5318-commit-graph.sh | 2 +- t/t5552-skipping-fetch-negotiator.sh | 12 ++-- t/t5703-upload-pack-ref-in-want.sh | 4 ++-- t/test-lib.sh| 4 ++-- 8 files changed, 22 insertions(+), 16 deletions(-) -- 2.18.0.865.gffc8e1a3cd6
[PATCH v3 5/5] tests: fix and add lint for non-portable grep --file
The --file option to grep isn't in POSIX[1], but -f is[1]. Let's check for that in the future, and fix the portability regression in f237c8b6fe ("commit-graph: implement git-commit-graph write", 2018-04-02) that broke e.g. AIX. 1. http://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html Signed-off-by: Ævar Arnfjörð Bjarmason --- t/check-non-portable-shell.pl | 1 + t/t5318-commit-graph.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index 75f38298d7..b45bdac688 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -43,6 +43,7 @@ sub err { /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)'; /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; + /\bgrep\b.*--file\b/ and err 'grep --file FILE is not portable (use grep -f FILE)'; /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)'; /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and err '"FOO=bar shell_func" assignment extends beyond "shell_func"'; diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 3c1ffad491..0c500f7ca2 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -134,7 +134,7 @@ test_expect_success 'Add one more commit' ' git branch commits/8 && ls $objdir/pack | grep idx >existing-idx && git repack && - ls $objdir/pack| grep idx | grep -v --file=existing-idx >new-idx + ls $objdir/pack| grep idx | grep -v -f existing-idx >new-idx ' # Current graph structure: -- 2.18.0.865.gffc8e1a3cd6
Re: [PATCH v1] read-cache: speed up index load through parallelization
On 8/23/2018 2:06 PM, Junio C Hamano wrote: Ben Peart writes: This patch helps address the CPU cost of loading the index by creating multiple threads to divide the work of loading and converting the cache entries across all available CPU cores. Nice. +int git_config_get_fast_index(void) +{ + int val; + + if (!git_config_get_maybe_bool("core.fastindex", )) + return val; + + if (getenv("GIT_FASTINDEX_TEST")) + return 1; It probably makes sense to use git_env_bool() to be consistent, which allows GIT_FASTINDEX_TEST=0 to turn it off after this becomes the default. diff --git a/read-cache.c b/read-cache.c index 7b1354d759..0fa7e1a04c 100644 --- a/read-cache.c +++ b/read-cache.c @@ -24,6 +24,10 @@ #include "utf8.h" #include "fsmonitor.h" +#ifndef min +#define min(a,b) (((a) < (b)) ? (a) : (b)) +#endif Let's lose this, which is used only once, even though it could be used elsewhere but not used (e.g. threads vs cpus near the beginning of load_cache_entries()). I didn't have it, then added it to make it trivial to see what was actually happening. I can switch back. +static unsigned long load_cache_entry_block(struct index_state *istate, struct mem_pool *ce_mem_pool, int offset, int nr, void *mmap, unsigned long start_offset, struct strbuf *previous_name) Wrap and possibly add comment before the function to describe what it does and what its parameters mean? +{ + int i; + unsigned long src_offset = start_offset; + + for (i = offset; i < offset + nr; i++) { + struct ondisk_cache_entry *disk_ce; + struct cache_entry *ce; + unsigned long consumed; + + disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset); + ce = create_from_disk(ce_mem_pool, disk_ce, , previous_name); + set_index_entry(istate, i, ce); + + src_offset += consumed; + } + return src_offset - start_offset; +} OK. +static unsigned long load_all_cache_entries(struct index_state *istate, void *mmap, size_t mmap_size, unsigned long src_offset) +{ (following aloud) This "all" variant is "one thread does all", iow, unthreaded version. Makes sense. + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; + unsigned long consumed; + + if (istate->version == 4) { + previous_name = _name_buf; + mem_pool_init(>ce_mem_pool, + estimate_cache_size_from_compressed(istate->cache_nr)); + } else { + previous_name = NULL; + mem_pool_init(>ce_mem_pool, + estimate_cache_size(mmap_size, istate->cache_nr)); + } I count there are three instances of "if version 4 use the strbuf for name-buf, otherwise..." in this patch, which made me wonder if we can make them shared more and/or if it makes sense to attempt to do so. Actually, they are all different and all required. One sets it up for the "do it all on one thread" path. One sets it up for each thread. The last one is used by the primary thread when scanning for blocks to hand off to the child threads. + consumed = load_cache_entry_block(istate, istate->ce_mem_pool, 0, istate->cache_nr, mmap, src_offset, previous_name); + strbuf_release(_name_buf); + return consumed; +} + +#ifdef NO_PTHREADS + +#define load_cache_entries load_all_cache_entries + +#else + +#include "thread-utils.h" + +/* +* Mostly randomly chosen maximum thread counts: we +* cap the parallelism to online_cpus() threads, and we want +* to have at least 7500 cache entries per thread for it to +* be worth starting a thread. +*/ +#define THREAD_COST(7500) + +struct load_cache_entries_thread_data +{ + pthread_t pthread; + struct index_state *istate; + struct mem_pool *ce_mem_pool; + int offset, nr; + void *mmap; + unsigned long start_offset; + struct strbuf previous_name_buf; + struct strbuf *previous_name; + unsigned long consumed; /* return # of bytes in index file processed */ +}; + +/* +* A thread proc to run the load_cache_entries() computation +* across multiple background threads. +*/ +static void *load_cache_entries_thread(void *_data) +{ + struct load_cache_entries_thread_data *p = _data; + + p->consumed += load_cache_entry_block(p->istate, p->ce_mem_pool, p->offset, p->nr, p->mmap, p->start_offset, p->previous_name); + return NULL; +} (following aloud) And the threaded version chews the block of ce's given to each thread. Makes sense. +static unsigned long load_cache_entries(struct index_state *istate, void *mmap, size_t mmap_size, unsigned long src_offset) +{ + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; + struct load_cache_entries_thread_data *data; + int threads, cpus, thread_nr; + unsigned long consumed; +
Re: [PATCH v1] read-cache: speed up index load through parallelization
On 8/23/2018 1:31 PM, Stefan Beller wrote: On Thu, Aug 23, 2018 at 8:45 AM Ben Peart wrote: This patch helps address the CPU cost of loading the index by creating multiple threads to divide the work of loading and converting the cache entries across all available CPU cores. It accomplishes this by having the primary thread loop across the index file tracking the offset and (for V4 indexes) expanding the name. It creates a thread to process each block of entries as it comes to them. Once the threads are complete and the cache entries are loaded, the rest of the extensions can be loaded and processed normally on the primary thread. Performance impact: read cache .git/index times on a synthetic repo with: 100,000 entries FALSE TRUESavings %Savings 0.014798767 0.009580433 0.005218333 35.26% 1,000,000 entries FALSE TRUESavings %Savings 0.240896533 0.1751243 0.065772233 27.30% read cache .git/index times on an actual repo with: ~3M entries FALSE TRUESavings %Savings 0.59898098 0.4513169 0.14766408 24.65% Signed-off-by: Ben Peart --- Notes: Base Ref: master Web-Diff: https://github.com/benpeart/git/commit/67a700419b Checkout: git fetch https://github.com/benpeart/git read-index-multithread-v1 && git checkout 67a700419b Documentation/config.txt | 8 ++ config.c | 13 +++ config.h | 1 + read-cache.c | 218 ++- 4 files changed, 216 insertions(+), 24 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1c42364988..3344685cc4 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do the index comparison to the filesystem data in parallel, allowing overlapping IO's. Defaults to true. +core.fastIndex:: + Enable parallel index loading ++ +This can speed up operations like 'git diff' and 'git status' especially +when the index is very large. When enabled, Git will do the index +loading from the on disk format to the in-memory format in parallel. +Defaults to true. "fast" is a non-descriptive word as we try to be fast in any operation? Maybe core.parallelIndexReading as that just describes what it turns on/off, without second guessing its effects? (Are there still computers with just a single CPU, where this would not make it faster? ;-)) How about core.parallelReadIndex? Slightly shorter and matches the function names better. +int git_config_get_fast_index(void) +{ + int val; + + if (!git_config_get_maybe_bool("core.fastindex", )) + return val; + + if (getenv("GIT_FASTINDEX_TEST")) + return 1; We look at this env value just before calling this function, can be write it to only look at the evn variable once? Sure, I didn't like the fact that it was called twice but didn't get around to cleaning it up. +++ b/config.h @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void); extern int git_config_get_split_index(void); extern int git_config_get_max_percent_split_change(void); extern int git_config_get_fsmonitor(void); +extern int git_config_get_fast_index(void); Oh. nd/no-extern did not cover config.h +#ifndef min +#define min(a,b) (((a) < (b)) ? (a) : (b)) +#endif We do not have a minimum function in the tree, except for xdiff/xmacros.h:29: XDL_MIN. I wonder what the rationale is for not having a MIN() definition, I think we discussed that on the list a couple times but the rationale escaped me. If we introduce a min/max macro, can we put it somewhere more prominent? (I would find it useful elsewhere) I'll avoid that particular rabbit hole and just remove the min macro definition. ;-) +/* +* Mostly randomly chosen maximum thread counts: we +* cap the parallelism to online_cpus() threads, and we want +* to have at least 7500 cache entries per thread for it to +* be worth starting a thread. +*/ +#define THREAD_COST(7500) This reads very similar to preload-index.c THREAD_COST + /* loop through index entries starting a thread for every thread_nr entries */ + consumed = thread = 0; + for (i = 0; ; i++) { + struct ondisk_cache_entry *ondisk; + const char *name; + unsigned int flags; + + /* we've reached the begining of a block of cache entries, kick off a thread to process them */ beginning Thanks Thanks, Stefan
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote: > > I think you can safely > > ignore the rest of it if you are otherwise occupied. Even if v2.19 ships > > without some mitigation, I don't know that it's all that big a deal, > > given the numbers I generated (which for some reason are less dramatic > > than Stolee's). > My numbers may be more dramatic because my Linux environment is a virtual > machine. If you have a chance, can you run p0001 on my patch (compared to 2.19-rc0, or to both v2.18 and v2.19-rc0)? It would be nice to double check that it really is fixing the problem you saw. -Peff
Re: [PATCH 0/9] trailer-parsing false positives
Jeff King writes: > So this turned into a bit of a rabbit hole. Here's what I have so far > (which I think is not quite ready, but I wanted to start discussing). > > Issues (2) and (3) are actually the same issue. The caller that does the > bogus appending for (2) is always append_signoff(). But it always has > just a commit message (modulo some complications, which I'll get to in a > minute), and so fixing it with respect to (3) magically solves (2). > I.e., the code was simply not prepared for the case of "end of string is > not end of trailers". But since that case cannot exist, we do not have > to deal with it. :) > > Now here's the tricky part. I think patches 1-8 are mostly sensible. Yeah, nothing that made me go "Huh?" in these 8 patches. Thanks. > So I think there may be further opportunities for cleanup here. I'm not > sure if we'd need to retain this behavior for git-interpret-trailers. > AFAICT it is not documented, and I suspect is mostly historical > accident, and not anything anybody ever wanted. I tend to think the behaviour was not designed but it "just happens to work that way". > If we do keep it by default, then the "--no-divider" option I added in > patch 4 should probably get a more generic name and cover this. > Something like "--verbatim-input". Perhaps. Even if this is not covered, --verbatim-input would be a good name for the option ;-) > I'm going to sleep on it and see how I feel tomorrow. > > [1/9]: trailer: use size_t for string offsets > [2/9]: trailer: use size_t for iterating trailer list > [3/9]: trailer: pass process_trailer_opts to trailer_info_get() > [4/9]: interpret-trailers: tighten check for "---" patch boundary > [5/9]: interpret-trailers: allow suppressing "---" divider > [6/9]: pretty, ref-filter: format %(trailers) with no_divider option > [7/9]: sequencer: ignore "---" divider when parsing trailers > [8/9]: append_signoff: use size_t for string offsets > [9/9]: sequencer: handle ignore_footer when parsing trailers > > Documentation/git-interpret-trailers.txt | 10 +++- > builtin/interpret-trailers.c | 1 + > commit.c | 6 +-- > commit.h | 2 +- > pretty.c | 3 ++ > ref-filter.c | 2 + > sequencer.c | 20 ++-- > sequencer.h | 9 +++- > t/t4205-log-pretty-formats.sh| 23 + > t/t6300-for-each-ref.sh | 23 + > t/t7501-commit.sh| 16 ++ > t/t7513-interpret-trailers.sh| 42 > trailer.c| 62 +--- > trailer.h| 4 +- > 14 files changed, 184 insertions(+), 39 deletions(-) > > -Peff
Re: [PATCH v2 01/10] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
On Thu, Aug 23, 2018 at 2:02 PM Ævar Arnfjörð Bjarmason wrote: > Found in some 2.19 testing on AIX: Thanks for reporting these issues. > > + /^[ ]*EOF[ ]*$/!bhereslurp > > AIX sed doesn't like this, and will yell: > :hereslurp is greater than eight characters > This on top fixes it: Fix make sense, and checking POSIX[1] , I see that it says that behavior is unspecified for labels longer than 8 bytes. [1]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html > > +# bare "(" line? > > +/^[ ]*([]*$/ { > > + # stash for later printing > > AIX sed doesn't like this either, and prints: > sed:# stash for later printing is not a recognized function. > I have no idea what the fix is for that one. That's the only indented comment in the entire script, so it's almost certainly that. How about we move the indented comment up to the comment for the enclosing block, so it reads: # bare "(" line? -- stash for later printing /^[ ]*([]*$/ { h bnextline } I could prepare such a patch, although perhaps it would be better for you to do so since you are in a position to test it (and because you discovered the problems)?
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Tue, Aug 21, 2018 at 4:43 PM Jeff King wrote: > On Tue, Aug 21, 2018 at 04:22:08PM -0400, Eric Sunshine wrote: > > On Tue, Aug 21, 2018 at 3:36 PM Jeff King wrote: > > How about using "git clone --shared" instead? > > That seems even more dangerous to me, since the created clone can become > corrupt when the parent prunes. Probably not huge for a single > operation, but you may be surprised when you run the script a few days > later and it barfs horribly due to a missing object. Okay. I had thought that doc-diff was never doing anything other than read-only operations on the checked-out worktree after the initial creation, but, looking more closely at the script, I now see that it can perform other Git-based operations, so what you say makes sense. > > In the case that you've already blown away the directory, then having > > "git worktree add" prune away the old worktree bookkeeping would make > > sense and wouldn't lose anything (you've already thrown it away > > manually). However, it could be lossy for the case when the directory > > is only temporarily missing (because it's on removable media or a > > network share). > > I think the removable ones already suffer from that problem (an auto-gc > can prune them). And they should already be marked with "git worktree > lock". That said, people don't always do what they should, and I'd > rather not make the problem worse. :) Hmph. I thought that "git worktree prune" had a sensible "expire" default to protect against such cases of removable media for which "git worktree lock" wasn't invoked, but, looking at the code, I see that the default is TIME_MAX. > > In this case, it might make sense for "git worktree add" to refuse to > > operate if an existing worktree entry still points at the directory > > that you're trying to add. That should prevent those duplicate > > worktree entries you saw. > > Yes, but then what's the next step for my script? I can't "remove" since > the worktree isn't there. I can't blow away any directory that I know > about, since there isn't one. I was thinking that "worktree add" could start respecting the --force option as an escape hatch. > I need to somehow know that an existing > "$GIT_DIR/worktrees/foo" is the problem. But "foo" is not even > deterministic. Looking at the duplicates, it seems to be the basename of > the working tree, but then mutated to avoid collisions with other > worktrees. If the worktree directory still existed, "git -C rev-parse --git-dir" inside the worktree would give you the proper path of $GIT_DIR/worktrees/foo, but the directory doesn't exist, so... nothing. > What about refusing by default, but forcing an overwrite with "-f"? My thought, also.
Re: [PATCH v1] read-cache: speed up index load through parallelization
Ben Peart writes: > This patch helps address the CPU cost of loading the index by creating > multiple threads to divide the work of loading and converting the cache > entries across all available CPU cores. Nice. > +int git_config_get_fast_index(void) > +{ > + int val; > + > + if (!git_config_get_maybe_bool("core.fastindex", )) > + return val; > + > + if (getenv("GIT_FASTINDEX_TEST")) > + return 1; It probably makes sense to use git_env_bool() to be consistent, which allows GIT_FASTINDEX_TEST=0 to turn it off after this becomes the default. > diff --git a/read-cache.c b/read-cache.c > index 7b1354d759..0fa7e1a04c 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -24,6 +24,10 @@ > #include "utf8.h" > #include "fsmonitor.h" > > +#ifndef min > +#define min(a,b) (((a) < (b)) ? (a) : (b)) > +#endif Let's lose this, which is used only once, even though it could be used elsewhere but not used (e.g. threads vs cpus near the beginning of load_cache_entries()). > +static unsigned long load_cache_entry_block(struct index_state *istate, > struct mem_pool *ce_mem_pool, int offset, int nr, void *mmap, unsigned long > start_offset, struct strbuf *previous_name) Wrap and possibly add comment before the function to describe what it does and what its parameters mean? > +{ > + int i; > + unsigned long src_offset = start_offset; > + > + for (i = offset; i < offset + nr; i++) { > + struct ondisk_cache_entry *disk_ce; > + struct cache_entry *ce; > + unsigned long consumed; > + > + disk_ce = (struct ondisk_cache_entry *)((char *)mmap + > src_offset); > + ce = create_from_disk(ce_mem_pool, disk_ce, , > previous_name); > + set_index_entry(istate, i, ce); > + > + src_offset += consumed; > + } > + return src_offset - start_offset; > +} OK. > +static unsigned long load_all_cache_entries(struct index_state *istate, void > *mmap, size_t mmap_size, unsigned long src_offset) > +{ (following aloud) This "all" variant is "one thread does all", iow, unthreaded version. Makes sense. > + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; > + unsigned long consumed; > + > + if (istate->version == 4) { > + previous_name = _name_buf; > + mem_pool_init(>ce_mem_pool, > + > estimate_cache_size_from_compressed(istate->cache_nr)); > + } else { > + previous_name = NULL; > + mem_pool_init(>ce_mem_pool, > + estimate_cache_size(mmap_size, istate->cache_nr)); > + } I count there are three instances of "if version 4 use the strbuf for name-buf, otherwise..." in this patch, which made me wonder if we can make them shared more and/or if it makes sense to attempt to do so. > + consumed = load_cache_entry_block(istate, istate->ce_mem_pool, 0, > istate->cache_nr, mmap, src_offset, previous_name); > + strbuf_release(_name_buf); > + return consumed; > +} > + > +#ifdef NO_PTHREADS > + > +#define load_cache_entries load_all_cache_entries > + > +#else > + > +#include "thread-utils.h" > + > +/* > +* Mostly randomly chosen maximum thread counts: we > +* cap the parallelism to online_cpus() threads, and we want > +* to have at least 7500 cache entries per thread for it to > +* be worth starting a thread. > +*/ > +#define THREAD_COST (7500) > + > +struct load_cache_entries_thread_data > +{ > + pthread_t pthread; > + struct index_state *istate; > + struct mem_pool *ce_mem_pool; > + int offset, nr; > + void *mmap; > + unsigned long start_offset; > + struct strbuf previous_name_buf; > + struct strbuf *previous_name; > + unsigned long consumed; /* return # of bytes in index file processed */ > +}; > + > +/* > +* A thread proc to run the load_cache_entries() computation > +* across multiple background threads. > +*/ > +static void *load_cache_entries_thread(void *_data) > +{ > + struct load_cache_entries_thread_data *p = _data; > + > + p->consumed += load_cache_entry_block(p->istate, p->ce_mem_pool, > p->offset, p->nr, p->mmap, p->start_offset, p->previous_name); > + return NULL; > +} (following aloud) And the threaded version chews the block of ce's given to each thread. Makes sense. > +static unsigned long load_cache_entries(struct index_state *istate, void > *mmap, size_t mmap_size, unsigned long src_offset) > +{ > + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; > + struct load_cache_entries_thread_data *data; > + int threads, cpus, thread_nr; > + unsigned long consumed; > + int i, thread; > + > + cpus = online_cpus(); > + threads = istate->cache_nr / THREAD_COST; > + if (threads > cpus) > + threads = cpus; No other caller of online_cpus() is prepared to deal with faulty return from the function (e.g. 0 or negative), so it is perfectly
Re: [PATCH v2 01/10] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
On Wed, Jul 11 2018, Eric Sunshine wrote: Found in some 2.19 testing on AIX: > +# here-doc -- swallow it to avoid false hits within its body (but keep the > +# command to which it was attached) > +/<<[ ]*[-\\]*EOF[]*/ { > + s/[ ]*<<[ ]*[-\\]*EOF// > + h > + :hereslurp > + N > + s/.*\n// > + /^[ ]*EOF[ ]*$/!bhereslurp > + x > +} AIX sed doesn't like this, and will yell: :hereslurp is greater than eight characters This on top fixes it: diff --git a/t/chainlint.sed b/t/chainlint.sed index 8544df38df..2333705b27 100644 --- a/t/chainlint.sed +++ b/t/chainlint.sed @@ -100 +100 @@ - :hereslurp + :hered @@ -104 +104 @@ - bhereslurp + bhered @@ -286 +286 @@ s/[ ]*< +:subshell > +# bare "(" line? > +/^[ ]*([]*$/ { > + # stash for later printing > + h > + bnextline > +} > +# "(..." line -- split off and stash "(", then process "..." as its own line AIX sed doesn't like this either, and prints: sed:# stash for later printing is not a recognized function. I have no idea what the fix is for that one.
Re: [PATCH v1] read-cache: speed up index load through parallelization
On Thu, Aug 23, 2018 at 8:45 AM Ben Peart wrote: > > This patch helps address the CPU cost of loading the index by creating > multiple threads to divide the work of loading and converting the cache > entries across all available CPU cores. > > It accomplishes this by having the primary thread loop across the index file > tracking the offset and (for V4 indexes) expanding the name. It creates a > thread to process each block of entries as it comes to them. Once the > threads are complete and the cache entries are loaded, the rest of the > extensions can be loaded and processed normally on the primary thread. > > Performance impact: > > read cache .git/index times on a synthetic repo with: > > 100,000 entries > FALSE TRUESavings %Savings > 0.014798767 0.009580433 0.005218333 35.26% > > 1,000,000 entries > FALSE TRUESavings %Savings > 0.240896533 0.1751243 0.065772233 27.30% > > read cache .git/index times on an actual repo with: > > ~3M entries > FALSE TRUESavings %Savings > 0.59898098 0.4513169 0.14766408 24.65% > > Signed-off-by: Ben Peart > --- > > Notes: > Base Ref: master > Web-Diff: https://github.com/benpeart/git/commit/67a700419b > Checkout: git fetch https://github.com/benpeart/git > read-index-multithread-v1 && git checkout 67a700419b > > Documentation/config.txt | 8 ++ > config.c | 13 +++ > config.h | 1 + > read-cache.c | 218 ++- > 4 files changed, 216 insertions(+), 24 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1c42364988..3344685cc4 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do > the > index comparison to the filesystem data in parallel, allowing > overlapping IO's. Defaults to true. > > +core.fastIndex:: > + Enable parallel index loading > ++ > +This can speed up operations like 'git diff' and 'git status' especially > +when the index is very large. When enabled, Git will do the index > +loading from the on disk format to the in-memory format in parallel. > +Defaults to true. "fast" is a non-descriptive word as we try to be fast in any operation? Maybe core.parallelIndexReading as that just describes what it turns on/off, without second guessing its effects? (Are there still computers with just a single CPU, where this would not make it faster? ;-)) > +int git_config_get_fast_index(void) > +{ > + int val; > + > + if (!git_config_get_maybe_bool("core.fastindex", )) > + return val; > + > + if (getenv("GIT_FASTINDEX_TEST")) > + return 1; We look at this env value just before calling this function, can be write it to only look at the evn variable once? > +++ b/config.h > @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void); > extern int git_config_get_split_index(void); > extern int git_config_get_max_percent_split_change(void); > extern int git_config_get_fsmonitor(void); > +extern int git_config_get_fast_index(void); Oh. nd/no-extern did not cover config.h > > +#ifndef min > +#define min(a,b) (((a) < (b)) ? (a) : (b)) > +#endif We do not have a minimum function in the tree, except for xdiff/xmacros.h:29: XDL_MIN. I wonder what the rationale is for not having a MIN() definition, I think we discussed that on the list a couple times but the rationale escaped me. If we introduce a min/max macro, can we put it somewhere more prominent? (I would find it useful elsewhere) > +/* > +* Mostly randomly chosen maximum thread counts: we > +* cap the parallelism to online_cpus() threads, and we want > +* to have at least 7500 cache entries per thread for it to > +* be worth starting a thread. > +*/ > +#define THREAD_COST(7500) This reads very similar to preload-index.c THREAD_COST > + /* loop through index entries starting a thread for every thread_nr > entries */ > + consumed = thread = 0; > + for (i = 0; ; i++) { > + struct ondisk_cache_entry *ondisk; > + const char *name; > + unsigned int flags; > + > + /* we've reached the begining of a block of cache entries, > kick off a thread to process them */ beginning Thanks, Stefan
Re: [PATCH] tests: fix and add lint for non-portable head -c N
On Thu, Aug 23 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> Junio: Even though this isn't a 2.19.0-rc0 regression I think it makes >> sense for 2.19.0. The fix is trivial, and it'll unbreak (at least some >> of) the tests on stock git on OpenBSD. > > I would have been a bit more sympathetic if this were recent > regression (say, 2.18 or so), even if it is not new in this cycle, > and the patch applied to the target track cleanly (say, maint or > maint-2.17), Yeah it's not a big deal if it's not in 2.19. > but seeing that some of these are quite old, dating back to 2009, I am > not sure I am enthused. Note that those parts are the "while I'm at it" valgrind-specific parts, i.e. parts of the code that likely never ran on anything except a GNU toolchain. > The changes certainly look trivial. Let's apply. > > Thanks. > >> >> OpenBSD guys: If you CC the git mailing list when you find you need to >> apply patches like these, we're happy to fix this more pro-actively. I >> just happened to be testing the upcoming 2.19.0 on OpenBSD and spotted >> this. >> >> t/check-non-portable-shell.pl | 1 + >> t/t5310-pack-bitmaps.sh | 2 +- >> t/test-lib.sh | 4 ++-- >> 3 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl >> index d5823f71d8..94a7e6165e 100755 >> --- a/t/check-non-portable-shell.pl >> +++ b/t/check-non-portable-shell.pl >> @@ -35,6 +35,7 @@ sub err { >> chomp; >> } >> >> +/\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes >> BYTES out)'; >> /\bsed\s+-i/ and err 'sed -i is not portable'; >> /\becho\s+-[neE]/ and err 'echo with option is not portable (use >> printf)'; >> /^\s*declare\s+/ and err 'arrays/declare not portable'; >> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh >> index 557bd0d0c0..7bff7923f2 100755 >> --- a/t/t5310-pack-bitmaps.sh >> +++ b/t/t5310-pack-bitmaps.sh >> @@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' ' >> git rev-list --use-bitmap-index --count --all >expect && >> bitmap=$(ls .git/objects/pack/*.bitmap) && >> test_when_finished "rm -f $bitmap" && >> -head -c 512 <$bitmap >$bitmap.tmp && >> +test_copy_bytes 512 <$bitmap >$bitmap.tmp && >> mv -f $bitmap.tmp $bitmap && >> git rev-list --use-bitmap-index --count --all >actual 2>stderr && >> test_cmp expect actual && >> diff --git a/t/test-lib.sh b/t/test-lib.sh >> index 8bb0f4348e..44288cbb59 100644 >> --- a/t/test-lib.sh >> +++ b/t/test-lib.sh >> @@ -867,7 +867,7 @@ then >> # handle only executables, unless they are shell libraries that >> # need to be in the exec-path. >> test -x "$1" || >> -test "# " = "$(head -c 2 <"$1")" || >> +test "# " = "$(test_copy_bytes 2 <"$1")" || >> return; >> >> base=$(basename "$1") >> @@ -882,7 +882,7 @@ then >> # do not override scripts >> if test -x "$symlink_target" && >> test ! -d "$symlink_target" && >> -test "#!" != "$(head -c 2 < "$symlink_target")" >> +test "#!" != "$(test_copy_bytes 2 <"$symlink_target")" >> then >> symlink_target=../valgrind.sh >> fi
Re: [PATCH v2 2/2] tests: fix and add lint for non-portable seq
On Thu, Aug 23 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> GNU seq is not a POSIX command, and doesn't exist on > > s/GNU //; the command did not even originate there, but came from V8 > and/or Plan9 IIRC. > >> e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests: >> Introduce test_seq", 2012-08-04), but use of it keeps coming back, >> e.g. in the recently added "fetch negotiator" tests being added here. >> >> So let's also add a check to "make test-lint". The regex is aiming to >> capture the likes of $(seq ..) and "seq" as a stand-alone command, >> without capturing some existing cases where we e.g. have files called >> "seq". >> >> Signed-off-by: Ævar Arnfjörð Bjarmason >> --- >> >> Now with a fix & check in v2 for the seq issue mentioned in >> https://public-inbox.org/git/87a7pdfltp@evledraar.gmail.com/ > > Thanks. > >> t/check-non-portable-shell.pl| 1 + >> t/t5552-skipping-fetch-negotiator.sh | 12 ++-- >> t/t5703-upload-pack-ref-in-want.sh | 4 ++-- >> 3 files changed, 9 insertions(+), 8 deletions(-) >> >> diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl >> index c8f10d40a1..75f38298d7 100755 >> --- a/t/check-non-portable-shell.pl >> +++ b/t/check-non-portable-shell.pl >> @@ -42,6 +42,7 @@ sub err { >> /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; >> /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use >> test_line_count)'; >> /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes >> BYTES out)'; >> +/(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; > > Looking at $(wc -l) thing a few lines above, I am not sure if this > deviation is a good idea. Wouldn't /\bseq\s/ be sufficient? As alluded to in the commit message that will find cases like: $ git grep -P '\bseq\b' -- t | grep -P -v '(?:\$\(seq|^\s*seq\b)' t/perf/p3400-rebase.sh:16: seq 1000 >>unrelated-file$i && t/perf/p3400-rebase.sh:21: seq 1000 | tac >>unrelated-file$i && t/t3404-rebase-interactive.sh:1227: test_seq 5 | sed "s/$double/&&/" >seq && t/t3404-rebase-interactive.sh:1228: git add seq && t/t3404-rebase-interactive.sh:1230: git commit -m seq-$double t/t3404-rebase-interactive.sh:1232: git tag seq-onto && t/t3404-rebase-interactive.sh:1234: git cherry-pick seq-onto && t/t3404-rebase-interactive.sh:1236: test_must_fail env FAKE_LINES= git rebase -i seq-onto && t/t3404-rebase-interactive.sh:1239: git diff --exit-code seq-onto && t/t4022-diff-rewrite.sh:69: test_seq 1 99 >seq && t/t4022-diff-rewrite.sh:70: printf 100 >>seq && t/t4022-diff-rewrite.sh:71: git add seq && t/t4022-diff-rewrite.sh:72: git commit seq -m seq t/t4022-diff-rewrite.sh:76: test_seq 1 5 >seq && t/t4022-diff-rewrite.sh:77: test_seq 9331 9420 >>seq && t/t4022-diff-rewrite.sh:78: test_seq 96 100 >>seq t/t4022-diff-rewrite.sh:82: git diff -B seq >res && t/t4022-diff-rewrite.sh:87: git diff seq >res && t/t4022-diff-rewrite.sh:93: git diff -B seq >res && t/t4150-am.sh:933: git format-patch -2 --stdout >seq.patch && t/t4150-am.sh:945: test_must_fail git am -3 seq.patch && t/t4150-am.sh:955: test_must_fail git am -3 seq.patch && t/t5520-pull.sh:295:git checkout -b seq && t/t5520-pull.sh:296:test_seq 5 >seq.txt && t/t5520-pull.sh:297:git add seq.txt && t/t5520-pull.sh:299:git commit -m "Add seq.txt" && t/t5520-pull.sh:300:echo 6 >>seq.txt && t/t5520-pull.sh:302:git commit -m "Append to seq.txt" seq.txt && t/t5520-pull.sh:304:echo conflicting >>seq.txt && t/t5520-pull.sh:306:git commit -m "Create conflict" seq.txt && t/t5520-pull.sh:307:test_must_fail git pull --rebase . seq 2>err >out &&
Re: [PATCH v2] config: fix commit-graph related config docs
Derrick Stolee writes: > The core.commitGraph config setting was accidentally removed from > the config documentation. In that same patch, the config setting > that writes a commit-graph during garbage collection was incorrectly > written to the doc as "gc.commitGraph" instead of "gc.writeCommitGraph". > > Reported-by: Szeder Gábor > Signed-off-by: Derrick Stolee > --- > Documentation/config.txt | 17 +++-- > 1 file changed, 11 insertions(+), 6 deletions(-) Thanks; that's an appropriate update before the release to tie the final loose ends. Will apply. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1c42364988..6ee1890984 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -917,12 +917,10 @@ core.notesRef:: > This setting defaults to "refs/notes/commits", and it can be overridden by > the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. > > -gc.commitGraph:: > - If true, then gc will rewrite the commit-graph file when > - linkgit:git-gc[1] is run. When using linkgit:git-gc[1] > - '--auto' the commit-graph will be updated if housekeeping is > - required. Default is false. See linkgit:git-commit-graph[1] > - for details. > +core.commitGraph:: > + If true, then git will read the commit-graph file (if it exists) > + to parse the graph structure of commits. Defaults to false. See > + linkgit:git-commit-graph[1] for more information. > > core.useReplaceRefs:: > If set to `false`, behave as if the `--no-replace-objects` > @@ -1763,6 +1761,13 @@ this configuration variable is ignored, all packs > except the base pack > will be repacked. After this the number of packs should go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > +gc.writeCommitGraph:: > + If true, then gc will rewrite the commit-graph file when > + linkgit:git-gc[1] is run. When using linkgit:git-gc[1] > + '--auto' the commit-graph will be updated if housekeeping is > + required. Default is false. See linkgit:git-commit-graph[1] > + for details. > + > gc.logExpiry:: > If the file gc.log exists, then `git gc --auto` won't run > unless that file is more than 'gc.logExpiry' old. Default is
wide t/perf output, was Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 06:20:26AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > Here are numbers for p0001.2 run against linux.git on a few > > versions. This is using -O2 with gcc 8.2.0. > > > > Test v2.18.0 v2.19.0-rc0 HEAD > > > > -- > > 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) > > -1.0% > > I see what you did to the formatting here, which is a topic of > another thread ;-). Do you happen to have a link? I missed that one, and digging turned up nothing. A while ago I wrote the patch below. I don't recall why I never sent it in (and it doesn't apply cleanly these days, though I'm sure it could be forward-ported). -- >8 -- Date: Wed, 20 Jan 2016 23:54:14 -0500 Subject: [PATCH] t/perf: add "tall" output format When aggregating results, we usually show a list of tests, one per line, with the tested revisions in columns across. Like: $ ./aggregate.perl 348d4f2^ 348d4f2 p7000-filter-branch.sh Test 348d4f2^ 348d4f2 --- 7000.2: noop filter 295.32(269.61+14.36) 7.92(0.85+0.72) -97.3% This is useful if you have a lot of tests to show, but few revisions; you're effectively comparing the two items on each line. But sometimes you have the opposite: few tests, but a large number of revisions. In this case, the lines get very long, and it's hard to compare values. This patch introduces a "tall" format that shows the same data in a more vertical manner: $ ./aggregate.perl --tall \ 348d4f2^ 348d4f2 \ jk/filter-branch-empty^ \ jk/filter-branch-empty \ p7000-filter-branch.sh Test: p7000-filter-branch.2 348d4f2^ 295.32(269.61+14.36) 348d4f27.92(0.85+0.72) -97.3% jk/filter-branch-empty^9.37(0.87+0.80) -96.8% jk/filter-branch-empty 7.71(0.92+0.62) -97.4% Signed-off-by: Jeff King --- t/perf/aggregate.perl | 124 ++ 1 file changed, 88 insertions(+), 36 deletions(-) diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl index e401208488..d108a02ccd 100755 --- a/t/perf/aggregate.perl +++ b/t/perf/aggregate.perl @@ -17,29 +17,41 @@ sub get_times { return ($rt, $4, $5); } -sub format_times { +sub format_times_list { my ($r, $u, $s, $firstr) = @_; if (!defined $r) { return ""; } my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s; + my $pct; if (defined $firstr) { if ($firstr > 0) { - $out .= sprintf " %+.1f%%", 100.0*($r-$firstr)/$firstr; + $pct = sprintf "%+.1f%%", 100.0*($r-$firstr)/$firstr; } elsif ($r == 0) { - $out .= " ="; + $pct = "="; } else { - $out .= " +inf"; + $pct = "+inf"; } } - return $out; + return ($out, $pct); +} + +sub format_times { + my ($times, $pct) = format_times_list(@_); + return defined $pct ? "$times $pct" : $times; } my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests); +my ($tall_format); while (scalar @ARGV) { my $arg = $ARGV[0]; my $dir; last if -f $arg or $arg eq "--"; + if ($arg eq "--tall") { + $tall_format = 1; + shift @ARGV; + next; + } if (! -d $arg) { my $rev = Git::command_oneline(qw(rev-parse --verify), $arg); $dir = "build/".$rev; @@ -122,6 +134,11 @@ sub have_slash { return 0; } +sub printable_dir { + my ($d) = @_; + return exists $dirabbrevs{$d} ? $dirabbrevs{$d} : $dirnames{$d}; +} + my %newdirabbrevs = %dirabbrevs; while (!have_duplicate(values %newdirabbrevs)) { %dirabbrevs = %newdirabbrevs; @@ -132,44 +149,79 @@ sub have_slash { } } -my %times; -my @colwidth = ((0)x@dirs); -for my $i (0..$#dirs) { - my $d = $dirs[$i]; - my $w = length (exists $dirabbrevs{$d} ? $dirabbrevs{$d} : $dirnames{$d}); - $colwidth[$i] = $w if $w > $colwidth[$i]; -} -for my $t (@subtests) { - my $firstr; +binmode STDOUT, ":utf8" or die "PANIC on binmode: $!"; + +if (!$tall_format) { + my %times; + my @colwidth = ((0)x@dirs); for my $i (0..$#dirs) { my $d = $dirs[$i]; - $times{$prefixes{$d}.$t} = [get_times("$resultsdir/$prefixes{$d}$t.times")]; - my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}}; - my $w = length format_times($r,$u,$s,$firstr); + my $w = length(printable_dir($d)); $colwidth[$i] = $w if $w > $colwidth[$i]; - $firstr = $r unless defined $firstr; } -} -my
Re: [PATCH] t/lib-rebase.sh: support explicit 'pick' commands in 'fake_editor.sh'
SZEDER Gábor writes: > diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh > index 25a77ee5cb..592865d019 100644 > --- a/t/lib-rebase.sh > +++ b/t/lib-rebase.sh > @@ -47,7 +47,7 @@ set_fake_editor () { > action=pick > for line in $FAKE_LINES; do > case $line in > - squash|fixup|edit|reword|drop) > + pick|squash|fixup|edit|reword|drop) > action="$line";; > exec*) > echo "$line" | sed 's/_/ /g' >> "$1";; There is a large comment before the function that says ... # The following word combinations are possible: # # "" -- add a "pick" line with the SHA1 taken from the # specified line. # # " " -- add a line with the specified command # ("squash", "fixup", "edit", "reword" or "drop") and the SHA1 taken # from the specified line. # # "exec_cmd_with_args" -- add an "exec cmd with args" line. ... which probably needs an update, too.
Re: [ANNOUNCE] Git v2.19.0-rc0
On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote: > Around the time that my proposed approaches were getting vetoed for > alignment issues, I figured I was out of my depth here. I reached out to > Daniel Lemire (of EWAH bitmap fame) on Twitter [1]. His blog is full of > posts of word-based approaches to different problems, so I thought he might > know something off the top of his head that would be applicable. His > conclusion (after looking only a short time) was to take a 'hasheq' approach > [2] like Peff suggested [3]. Since that requires auditing all callers of > hashcmp to see if hasheq is appropriate, it is not a good solution for 2.19 > but (in my opinion) should be evaluated as part of the 2.20 cycle. I think that audit isn't actually too bad (but definitely not something we should do for v2.19). The cocci patch I showed earlier hits most of them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not sure if there's a way to ask coccinelle only for oidcmp() used in a boolean context. Just skimming the grep output, it's basically every call except the ones we use for binary search. -Peff
Re: Questions about the hash function transition
Ævar Arnfjörð Bjarmason writes: > On Thu, Aug 23 2018, Junio C Hamano wrote: > >> Ævar Arnfjörð Bjarmason writes: >> - The trailer consists of the following: - A copy of the 20-byte SHA-256 checksum at the end of the corresponding packfile. - 20-byte SHA-256 checksum of all of the above. >>> >>> We need to update both of these to 32 byte, right? Or are we planning to >>> truncate the checksums? >> >> https://public-inbox.org/git/ca+55afwc7uq61ebnj36pfu_abcxgya4jut-tvppj21hkhre...@mail.gmail.com/ > > Thanks. > > Yeah for this checksum purpose even 10 or 5 characters would do, but > since we'll need a new pack format anyway for SHA-256 why not just use > the full length of the SHA-256 here? We're using the full length of the > SHA-1. > > I don't see it mattering for security / corruption detection purposes, > but just to avoid confusion. We'll have this one place left where > something looks like a SHA-1, but is actually a trunctated SHA-256. I would prefer to see us at least explore if the gain in throughput is sufficiently big if we switch to weaker checksum, like crc32. If does not give us sufficient gain, I'd agree with you that consistently using full hash everywhere would conceptually be cleaner.
Re: [PATCH v2 1/2] tests: fix and add lint for non-portable head -c N
On Thu, Aug 23, 2018 at 03:25:01PM +, Ævar Arnfjörð Bjarmason wrote: > The "head -c BYTES" option is non-portable (not in POSIX[1]). Change > such invocations to use the test_copy_bytes wrapper added in > 48860819e8 ("t9300: factor out portable "head -c" replacement", > 2016-06-30). > > This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check > mmap reads", 2018-06-14), which has been breaking > t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports > already have a similar workaround after their upgrade to 2.18.0[2]. Heh, I even considered using this when writing that test. But the reason I introduced test_copy_bytes is not because the target platform did not have "head -c" at all, but because some tests need very specific buffering guarantees when reading from a shared pipe. That said, if OpenBSD's "head" doesn't have "-c" at all, I'm fine with this as a fix (and it sounds like we know that IRIX lacks it, too). > Also, change a valgrind-specific codepath in test-lib.sh to use this > wrapper. Given where valgrind runs I don't think this would ever > become a portability issue in practice, but it's easier to just use > the wrapper than introduce some exception for the "make test-lint" > check being added here. When working on 9d2e330b17, I recall finding these other "head -c" invocations in the test suite when I did 9d2e330b17 and took them as evidence that it was OK to use in vanilla cases. So even if these sites don't affect any platforms in practice, I think it's worth it to ban "head -c" completely. > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > t/check-non-portable-shell.pl | 1 + > t/t5310-pack-bitmaps.sh | 2 +- > t/test-lib.sh | 4 ++-- > 3 files changed, 4 insertions(+), 3 deletions(-) Patch itself looks good to me. Thanks. -Peff
Re: [PATCH v2 2/2] tests: fix and add lint for non-portable seq
Ævar Arnfjörð Bjarmason writes: > GNU seq is not a POSIX command, and doesn't exist on s/GNU //; the command did not even originate there, but came from V8 and/or Plan9 IIRC. > e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests: > Introduce test_seq", 2012-08-04), but use of it keeps coming back, > e.g. in the recently added "fetch negotiator" tests being added here. > > So let's also add a check to "make test-lint". The regex is aiming to > capture the likes of $(seq ..) and "seq" as a stand-alone command, > without capturing some existing cases where we e.g. have files called > "seq". > > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > > Now with a fix & check in v2 for the seq issue mentioned in > https://public-inbox.org/git/87a7pdfltp@evledraar.gmail.com/ Thanks. > t/check-non-portable-shell.pl| 1 + > t/t5552-skipping-fetch-negotiator.sh | 12 ++-- > t/t5703-upload-pack-ref-in-want.sh | 4 ++-- > 3 files changed, 9 insertions(+), 8 deletions(-) > > diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl > index c8f10d40a1..75f38298d7 100755 > --- a/t/check-non-portable-shell.pl > +++ b/t/check-non-portable-shell.pl > @@ -42,6 +42,7 @@ sub err { > /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; > /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use > test_line_count)'; > /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes > BYTES out)'; > + /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; Looking at $(wc -l) thing a few lines above, I am not sure if this deviation is a good idea. Wouldn't /\bseq\s/ be sufficient?
Re: Test failures on OpenBSD
Ævar Arnfjörð Bjarmason writes: > + echo Error (-1) reading configuration file a-directory. > + > expect > + mkdir a-directory > + test_expect_code 2 test-tool config configset_get_value foo.bar > a-directory > + 2> output > Value not found for "foo.bar" > test_expect_code: command exited with 1, we wanted 2 test-tool config > configset_get_value foo.bar a-directory > error: last command exited with $?=1 > not ok 23 - proper error on directory "files" A blind guess. is this "BSD allows opening a FILE* on a directory and reading bytes from it" aka FREAD_READS_DIRECTORIES? I wonder why FreeBSD sets it but not OpenBSD in config.mak.uname. > + make_dir extract > + tar xf subtree-all.tar -C extract > tar: Cannot identify format. Searching... > tar: End of archive volume 1 reached > tar: Sorry, unable to determine archive format. > error: last command exited with $?=1 Perhaps modern tar elements like extended headers are not supported on the platform-supplied "tar"? "make TAR=gnutar"? > + seq 11 > ./t5552-skipping-fetch-negotiator.sh: seq: not found This must use test_seq. t/t5552-skipping-fetch-negotiator.sh | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/t/t5552-skipping-fetch-negotiator.sh b/t/t5552-skipping-fetch-negotiator.sh index 5ad5bece55..30857b84a8 100755 --- a/t/t5552-skipping-fetch-negotiator.sh +++ b/t/t5552-skipping-fetch-negotiator.sh @@ -46,7 +46,7 @@ test_expect_success 'commits with no parents are sent regardless of skip distanc test_commit -C server to_fetch && git init client && - for i in $(seq 7) + for i in $(test_seq 7) do test_commit -C client c$i done && @@ -89,7 +89,7 @@ test_expect_success 'when two skips collide, favor the larger one' ' test_commit -C server to_fetch && git init client && - for i in $(seq 11) + for i in $(test_seq 11) do test_commit -C client c$i done && @@ -168,14 +168,14 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC test_commit -C server to_fetch && git init client && - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout --orphan b$i && test_commit -C client b$i.c0 done && - for j in $(seq 19) + for j in $(test_seq 19) do - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout b$i && test_commit -C client b$i.c$j @@ -205,7 +205,7 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC # fetch-pack should thus not send any more commits in the b1 branch, but # should still send the others (in this test, just check b2). - for i in $(seq 0 8) + for i in $(test_seq 0 8) do have_not_sent b1.c$i done &&
[PATCH v2] config: fix commit-graph related config docs
The core.commitGraph config setting was accidentally removed from the config documentation. In that same patch, the config setting that writes a commit-graph during garbage collection was incorrectly written to the doc as "gc.commitGraph" instead of "gc.writeCommitGraph". Reported-by: Szeder Gábor Signed-off-by: Derrick Stolee --- Documentation/config.txt | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1c42364988..6ee1890984 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -917,12 +917,10 @@ core.notesRef:: This setting defaults to "refs/notes/commits", and it can be overridden by the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. -gc.commitGraph:: - If true, then gc will rewrite the commit-graph file when - linkgit:git-gc[1] is run. When using linkgit:git-gc[1] - '--auto' the commit-graph will be updated if housekeeping is - required. Default is false. See linkgit:git-commit-graph[1] - for details. +core.commitGraph:: + If true, then git will read the commit-graph file (if it exists) + to parse the graph structure of commits. Defaults to false. See + linkgit:git-commit-graph[1] for more information. core.useReplaceRefs:: If set to `false`, behave as if the `--no-replace-objects` @@ -1763,6 +1761,13 @@ this configuration variable is ignored, all packs except the base pack will be repacked. After this the number of packs should go below gc.autoPackLimit and gc.bigPackThreshold should be respected again. +gc.writeCommitGraph:: + If true, then gc will rewrite the commit-graph file when + linkgit:git-gc[1] is run. When using linkgit:git-gc[1] + '--auto' the commit-graph will be updated if housekeeping is + required. Default is false. See linkgit:git-commit-graph[1] + for details. + gc.logExpiry:: If the file gc.log exists, then `git gc --auto` won't run unless that file is more than 'gc.logExpiry' old. Default is -- 2.19.0.rc0
[PATCH v1] read-cache: speed up index load through parallelization
This patch helps address the CPU cost of loading the index by creating multiple threads to divide the work of loading and converting the cache entries across all available CPU cores. It accomplishes this by having the primary thread loop across the index file tracking the offset and (for V4 indexes) expanding the name. It creates a thread to process each block of entries as it comes to them. Once the threads are complete and the cache entries are loaded, the rest of the extensions can be loaded and processed normally on the primary thread. Performance impact: read cache .git/index times on a synthetic repo with: 100,000 entries FALSE TRUESavings %Savings 0.014798767 0.009580433 0.005218333 35.26% 1,000,000 entries FALSE TRUESavings %Savings 0.240896533 0.1751243 0.065772233 27.30% read cache .git/index times on an actual repo with: ~3M entries FALSE TRUESavings %Savings 0.59898098 0.4513169 0.14766408 24.65% Signed-off-by: Ben Peart --- Notes: Base Ref: master Web-Diff: https://github.com/benpeart/git/commit/67a700419b Checkout: git fetch https://github.com/benpeart/git read-index-multithread-v1 && git checkout 67a700419b Documentation/config.txt | 8 ++ config.c | 13 +++ config.h | 1 + read-cache.c | 218 ++- 4 files changed, 216 insertions(+), 24 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1c42364988..3344685cc4 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do the index comparison to the filesystem data in parallel, allowing overlapping IO's. Defaults to true. +core.fastIndex:: + Enable parallel index loading ++ +This can speed up operations like 'git diff' and 'git status' especially +when the index is very large. When enabled, Git will do the index +loading from the on disk format to the in-memory format in parallel. +Defaults to true. + core.createObject:: You can set this to 'link', in which case a hardlink followed by a delete of the source are used to make sure that object creation diff --git a/config.c b/config.c index 9a0b10d4bc..883092fdd3 100644 --- a/config.c +++ b/config.c @@ -2289,6 +2289,19 @@ int git_config_get_fsmonitor(void) return 0; } +int git_config_get_fast_index(void) +{ + int val; + + if (!git_config_get_maybe_bool("core.fastindex", )) + return val; + + if (getenv("GIT_FASTINDEX_TEST")) + return 1; + + return -1; /* default value */ +} + NORETURN void git_die_config_linenr(const char *key, const char *filename, int linenr) { diff --git a/config.h b/config.h index ab46e0165d..74ca4e7db5 100644 --- a/config.h +++ b/config.h @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void); extern int git_config_get_split_index(void); extern int git_config_get_max_percent_split_change(void); extern int git_config_get_fsmonitor(void); +extern int git_config_get_fast_index(void); /* This dies if the configured or default date is in the future */ extern int git_config_get_expiry(const char *key, const char **output); diff --git a/read-cache.c b/read-cache.c index 7b1354d759..0fa7e1a04c 100644 --- a/read-cache.c +++ b/read-cache.c @@ -24,6 +24,10 @@ #include "utf8.h" #include "fsmonitor.h" +#ifndef min +#define min(a,b) (((a) < (b)) ? (a) : (b)) +#endif + /* Mask for the name length in ce_flags in the on-disk index */ #define CE_NAMEMASK (0x0fff) @@ -1889,16 +1893,203 @@ static size_t estimate_cache_size(size_t ondisk_size, unsigned int entries) return ondisk_size + entries * per_entry; } +static unsigned long load_cache_entry_block(struct index_state *istate, struct mem_pool *ce_mem_pool, int offset, int nr, void *mmap, unsigned long start_offset, struct strbuf *previous_name) +{ + int i; + unsigned long src_offset = start_offset; + + for (i = offset; i < offset + nr; i++) { + struct ondisk_cache_entry *disk_ce; + struct cache_entry *ce; + unsigned long consumed; + + disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset); + ce = create_from_disk(ce_mem_pool, disk_ce, , previous_name); + set_index_entry(istate, i, ce); + + src_offset += consumed; + } + return src_offset - start_offset; +} + +static unsigned long load_all_cache_entries(struct index_state *istate, void *mmap, size_t mmap_size, unsigned long src_offset) +{ + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; + unsigned long consumed; + + if (istate->version == 4) { + previous_name = _name_buf; + mem_pool_init(>ce_mem_pool, +
Re: [PATCH] tests: fix and add lint for non-portable head -c N
Ævar Arnfjörð Bjarmason writes: > Junio: Even though this isn't a 2.19.0-rc0 regression I think it makes > sense for 2.19.0. The fix is trivial, and it'll unbreak (at least some > of) the tests on stock git on OpenBSD. I would have been a bit more sympathetic if this were recent regression (say, 2.18 or so), even if it is not new in this cycle, and the patch applied to the target track cleanly (say, maint or maint-2.17), but seeing that some of these are quite old, dating back to 2009, I am not sure I am enthused. The changes certainly look trivial. Let's apply. Thanks. > > OpenBSD guys: If you CC the git mailing list when you find you need to > apply patches like these, we're happy to fix this more pro-actively. I > just happened to be testing the upcoming 2.19.0 on OpenBSD and spotted > this. > > t/check-non-portable-shell.pl | 1 + > t/t5310-pack-bitmaps.sh | 2 +- > t/test-lib.sh | 4 ++-- > 3 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl > index d5823f71d8..94a7e6165e 100755 > --- a/t/check-non-portable-shell.pl > +++ b/t/check-non-portable-shell.pl > @@ -35,6 +35,7 @@ sub err { > chomp; > } > > + /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes > BYTES out)'; > /\bsed\s+-i/ and err 'sed -i is not portable'; > /\becho\s+-[neE]/ and err 'echo with option is not portable (use > printf)'; > /^\s*declare\s+/ and err 'arrays/declare not portable'; > diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh > index 557bd0d0c0..7bff7923f2 100755 > --- a/t/t5310-pack-bitmaps.sh > +++ b/t/t5310-pack-bitmaps.sh > @@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' ' > git rev-list --use-bitmap-index --count --all >expect && > bitmap=$(ls .git/objects/pack/*.bitmap) && > test_when_finished "rm -f $bitmap" && > - head -c 512 <$bitmap >$bitmap.tmp && > + test_copy_bytes 512 <$bitmap >$bitmap.tmp && > mv -f $bitmap.tmp $bitmap && > git rev-list --use-bitmap-index --count --all >actual 2>stderr && > test_cmp expect actual && > diff --git a/t/test-lib.sh b/t/test-lib.sh > index 8bb0f4348e..44288cbb59 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -867,7 +867,7 @@ then > # handle only executables, unless they are shell libraries that > # need to be in the exec-path. > test -x "$1" || > - test "# " = "$(head -c 2 <"$1")" || > + test "# " = "$(test_copy_bytes 2 <"$1")" || > return; > > base=$(basename "$1") > @@ -882,7 +882,7 @@ then > # do not override scripts > if test -x "$symlink_target" && > test ! -d "$symlink_target" && > - test "#!" != "$(head -c 2 < "$symlink_target")" > + test "#!" != "$(test_copy_bytes 2 <"$symlink_target")" > then > symlink_target=../valgrind.sh > fi
Re: [PATCH/RFC] commit: new option to abort -a something is already staged
Duy Nguyen writes: > On Thu, Aug 23, 2018 at 4:15 AM Jonathan Nieder wrote: >> > The remaining question becomes scripts. A script might do >> > >> > ... modify old-file and create new-file ... >> > git add new-file >> > git commit -m "some great message" >> > >> > which would trip this error. For that matter, humans might do that, >> > too. Could the check detect this case (where the only changes in the >> > index are additions of new files) and treat it as non-destructive? >> >> (where the only changes in the index are additions of new files *that >> match the worktree*) > > I don't think my change would affect this case. If "git commit" does > not change the index by itself, there's no point in stopping it. I think the above example forgets "-a" on the final "git commit" step. With it added, I can understand the concern (and I am sure you would, too). The user is trying to add everything done in the working tree, and "commit -a" would catch all changes to paths that were already tracked, but a separate "add" is necessary for newly created paths. But adding a new path means the index no longer matches HEAD, and the "commit -a" at the final step sweeps the changes to already tracked paths---failing that because there "already is something staged" will break the workflow.
[PATCH v2 1/2] tests: fix and add lint for non-portable head -c N
The "head -c BYTES" option is non-portable (not in POSIX[1]). Change such invocations to use the test_copy_bytes wrapper added in 48860819e8 ("t9300: factor out portable "head -c" replacement", 2016-06-30). This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check mmap reads", 2018-06-14), which has been breaking t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports already have a similar workaround after their upgrade to 2.18.0[2]. I have not tested this on IRIX, but according to 4de0bbd898 ("t9300: use perl "head -c" clone in place of "dd bs=1 count=16000" kluge", 2010-12-13) this invocation would have broken things there too. Also, change a valgrind-specific codepath in test-lib.sh to use this wrapper. Given where valgrind runs I don't think this would ever become a portability issue in practice, but it's easier to just use the wrapper than introduce some exception for the "make test-lint" check being added here. 1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html 2. https://github.com/openbsd/ports/commit/08d5d82eaefe5cf2f125ecc0c6a57df9cf91350c#diff-f7d3c4fabeed1691620d608f1534f5e5 Signed-off-by: Ævar Arnfjörð Bjarmason --- t/check-non-portable-shell.pl | 1 + t/t5310-pack-bitmaps.sh | 2 +- t/test-lib.sh | 4 ++-- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index d5823f71d8..c8f10d40a1 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -41,6 +41,7 @@ sub err { /^\s*[^#]\s*which\s/ and err 'which is not portable (use type)'; /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)'; + /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)'; /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and err '"FOO=bar shell_func" assignment extends beyond "shell_func"'; diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 557bd0d0c0..7bff7923f2 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' ' git rev-list --use-bitmap-index --count --all >expect && bitmap=$(ls .git/objects/pack/*.bitmap) && test_when_finished "rm -f $bitmap" && - head -c 512 <$bitmap >$bitmap.tmp && + test_copy_bytes 512 <$bitmap >$bitmap.tmp && mv -f $bitmap.tmp $bitmap && git rev-list --use-bitmap-index --count --all >actual 2>stderr && test_cmp expect actual && diff --git a/t/test-lib.sh b/t/test-lib.sh index 8bb0f4348e..44288cbb59 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -867,7 +867,7 @@ then # handle only executables, unless they are shell libraries that # need to be in the exec-path. test -x "$1" || - test "# " = "$(head -c 2 <"$1")" || + test "# " = "$(test_copy_bytes 2 <"$1")" || return; base=$(basename "$1") @@ -882,7 +882,7 @@ then # do not override scripts if test -x "$symlink_target" && test ! -d "$symlink_target" && - test "#!" != "$(head -c 2 < "$symlink_target")" + test "#!" != "$(test_copy_bytes 2 <"$symlink_target")" then symlink_target=../valgrind.sh fi -- 2.18.0.865.gffc8e1a3cd6
[PATCH v2 2/2] tests: fix and add lint for non-portable seq
GNU seq is not a POSIX command, and doesn't exist on e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests: Introduce test_seq", 2012-08-04), but use of it keeps coming back, e.g. in the recently added "fetch negotiator" tests being added here. So let's also add a check to "make test-lint". The regex is aiming to capture the likes of $(seq ..) and "seq" as a stand-alone command, without capturing some existing cases where we e.g. have files called "seq". Signed-off-by: Ævar Arnfjörð Bjarmason --- Now with a fix & check in v2 for the seq issue mentioned in https://public-inbox.org/git/87a7pdfltp@evledraar.gmail.com/ t/check-non-portable-shell.pl| 1 + t/t5552-skipping-fetch-negotiator.sh | 12 ++-- t/t5703-upload-pack-ref-in-want.sh | 4 ++-- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index c8f10d40a1..75f38298d7 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -42,6 +42,7 @@ sub err { /\btest\s+[^=]*==/ and err '"test a == b" is not portable (use =)'; /\bwc -l.*"\s*=/ and err '`"$(wc -l)"` is not portable (use test_line_count)'; /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; + /(?:\$\(seq|^\s*seq\b)/ and err 'seq is not portable (use test_seq)'; /\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)'; /^\s*([A-Z0-9_]+=(\w+|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and err '"FOO=bar shell_func" assignment extends beyond "shell_func"'; diff --git a/t/t5552-skipping-fetch-negotiator.sh b/t/t5552-skipping-fetch-negotiator.sh index 5ad5bece55..30857b84a8 100755 --- a/t/t5552-skipping-fetch-negotiator.sh +++ b/t/t5552-skipping-fetch-negotiator.sh @@ -46,7 +46,7 @@ test_expect_success 'commits with no parents are sent regardless of skip distanc test_commit -C server to_fetch && git init client && - for i in $(seq 7) + for i in $(test_seq 7) do test_commit -C client c$i done && @@ -89,7 +89,7 @@ test_expect_success 'when two skips collide, favor the larger one' ' test_commit -C server to_fetch && git init client && - for i in $(seq 11) + for i in $(test_seq 11) do test_commit -C client c$i done && @@ -168,14 +168,14 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC test_commit -C server to_fetch && git init client && - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout --orphan b$i && test_commit -C client b$i.c0 done && - for j in $(seq 19) + for j in $(test_seq 19) do - for i in $(seq 8) + for i in $(test_seq 8) do git -C client checkout b$i && test_commit -C client b$i.c$j @@ -205,7 +205,7 @@ test_expect_success 'do not send "have" with ancestors of commits that server AC # fetch-pack should thus not send any more commits in the b1 branch, but # should still send the others (in this test, just check b2). - for i in $(seq 0 8) + for i in $(test_seq 0 8) do have_not_sent b1.c$i done && diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh index a73c55a47e..d1ccc22331 100755 --- a/t/t5703-upload-pack-ref-in-want.sh +++ b/t/t5703-upload-pack-ref-in-want.sh @@ -176,7 +176,7 @@ test_expect_success 'setup repos for change-while-negotiating test' ' git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo; "$LOCAL_PRISTINE" && cd "$LOCAL_PRISTINE" && git checkout -b side && - for i in $(seq 1 33); do test_commit s$i; done && + for i in $(test_seq 1 33); do test_commit s$i; done && # Add novel commits to upstream git checkout master && @@ -289,7 +289,7 @@ test_expect_success 'setup repos for fetching with ref-in-want tests' ' git clone "file://$REPO" "$LOCAL_PRISTINE" && cd "$LOCAL_PRISTINE" && git checkout -b side && - for i in $(seq 1 33); do test_commit s$i; done && + for i in $(test_seq 1 33); do test_commit s$i; done && # Add novel commits to upstream git checkout master && -- 2.18.0.865.gffc8e1a3cd6
Re: Questions about the hash function transition
On Thu, Aug 23 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >>> - The trailer consists of the following: >>> - A copy of the 20-byte SHA-256 checksum at the end of the >>> corresponding packfile. >>> >>> - 20-byte SHA-256 checksum of all of the above. >> >> We need to update both of these to 32 byte, right? Or are we planning to >> truncate the checksums? > > https://public-inbox.org/git/ca+55afwc7uq61ebnj36pfu_abcxgya4jut-tvppj21hkhre...@mail.gmail.com/ Thanks. Yeah for this checksum purpose even 10 or 5 characters would do, but since we'll need a new pack format anyway for SHA-256 why not just use the full length of the SHA-256 here? We're using the full length of the SHA-1. I don't see it mattering for security / corruption detection purposes, but just to avoid confusion. We'll have this one place left where something looks like a SHA-1, but is actually a trunctated SHA-256.
Re: [PATCH v2 1/1] t2024: mark a `checkout -p` test as requiring Perl
On Thu, Aug 23 2018, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin > > A recently-added test case tries to verify that the output of `checkout > -p` contains a certain piece of advice. > > But if Git was built without Perl and therefore lacks support for `git > add -i`, the error output contains the hint that `-p` is not even > available instead. > > Let's just skip that test case altogether if Git was built with NO_PERL. > > Signed-off-by: Johannes Schindelin > --- > t/t2024-checkout-dwim.sh | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh > index 26dc3f1fc0..29e1e25300 100755 > --- a/t/t2024-checkout-dwim.sh > +++ b/t/t2024-checkout-dwim.sh > @@ -76,7 +76,8 @@ test_expect_success 'checkout of branch from multiple > remotes fails #1' ' > test_branch master > ' > > -test_expect_success 'checkout of branch from multiple remotes fails with > advice' ' > +test_expect_success NO_PERL \ > + 'checkout of branch from multiple remotes fails with advice' ' > git checkout -B master && > test_might_fail git branch -D foo && > test_must_fail git checkout foo 2>stderr && This issue is already fixed in master as 3338e9950e ("t2024: mark test using "checkout -p" with PERL prerequisite", 2018-08-18).
Re: [PATCH/RFC] commit: new option to abort -a something is already staged
On Thu, Aug 23, 2018 at 4:15 AM Jonathan Nieder wrote: > > The remaining question becomes scripts. A script might do > > > > ... modify old-file and create new-file ... > > git add new-file > > git commit -m "some great message" > > > > which would trip this error. For that matter, humans might do that, > > too. Could the check detect this case (where the only changes in the > > index are additions of new files) and treat it as non-destructive? > > (where the only changes in the index are additions of new files *that > match the worktree*) I don't think my change would affect this case. If "git commit" does not change the index by itself, there's no point in stopping it. -- Duy
[PATCH v2 0/1] Make t2024 NO_PERL-safe
While trying to run the build & test with NO_PERL, I noticed that t2024 had a failing test case. This patch works around that failing test case by skipping it when we know that the error message looks different than that test case would expect. Changes since v1 (which did not make it to the list due to https://github.com/gitgitgadget/gitgitgadget/issues/29): * reworded the commit message slightly. Johannes Schindelin (1): t2024: mark a `checkout -p` test as requiring Perl t/t2024-checkout-dwim.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) base-commit: 8d7b558baebe3abbbad4973ce1e1f87a7da17f47 Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-20%2Fdscho%2Fcheckout-default-remote-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-20/dscho/checkout-default-remote-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/20 Range-diff vs v1: 1: 619d7bcc31 ! 1: 8d46b31f5a t2024: mark a `checkout -p` test as requiring Perl @@ -3,9 +3,11 @@ t2024: mark a `checkout -p` test as requiring Perl A recently-added test case tries to verify that the output of `checkout --p` contains a certain piece of advice. But if Git was built without -Perl and therefore lacks support for `git add -i`, the error output -contains the hint that `-p` is not even available instead. +-p` contains a certain piece of advice. + +But if Git was built without Perl and therefore lacks support for `git +add -i`, the error output contains the hint that `-p` is not even +available instead. Let's just skip that test case altogether if Git was built with NO_PERL. -- gitgitgadget
[PATCH v2 1/1] t2024: mark a `checkout -p` test as requiring Perl
From: Johannes Schindelin A recently-added test case tries to verify that the output of `checkout -p` contains a certain piece of advice. But if Git was built without Perl and therefore lacks support for `git add -i`, the error output contains the hint that `-p` is not even available instead. Let's just skip that test case altogether if Git was built with NO_PERL. Signed-off-by: Johannes Schindelin --- t/t2024-checkout-dwim.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh index 26dc3f1fc0..29e1e25300 100755 --- a/t/t2024-checkout-dwim.sh +++ b/t/t2024-checkout-dwim.sh @@ -76,7 +76,8 @@ test_expect_success 'checkout of branch from multiple remotes fails #1' ' test_branch master ' -test_expect_success 'checkout of branch from multiple remotes fails with advice' ' +test_expect_success NO_PERL \ + 'checkout of branch from multiple remotes fails with advice' ' git checkout -B master && test_might_fail git branch -D foo && test_must_fail git checkout foo 2>stderr && -- gitgitgadget
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Hi Jonathan, On Wed, 22 Aug 2018, Jonathan Nieder wrote: > OPT_INTEGER(0, "creation-factor", _factor, > N_("Percentage by which creation is weighted")), > - OPT_BOOL(0, "no-dual-color", _color, > - N_("color both diff and diff-between-diffs")), > + OPT_BOOL(0, "dual-color", _color, > + N_("color both diff and diff-between-diffs > (default)")), There is one very good reason *not* to do that. And that reason is the output of `git range-diff -h`. If anybody read that the option `--dual-color` exists, they are prone to believe that the default is *not* dual color. In contrast, when reading `--no-dual-color`, it is clear that dual color mode is the default. Ciao, Dscho > OPT_END() > }; > int i, j, res = 0; > @@ -63,8 +63,8 @@ int cmd_range_diff(int argc, const char **argv, const char > *prefix) >options + ARRAY_SIZE(options) - 1, /* OPT_END */ >builtin_range_diff_usage, 0); > > - if (simple_color < 1) { > - if (!simple_color) > + if (dual_color != 0) { > + if (dual_color > 0) > /* force color when --dual-color was used */ > diffopt.use_color = 1; > diffopt.flags.dual_color_diffed_diffs = 1; >
Re: What's cooking in git.git (Aug 2018, #05; Mon, 20)
Hi Junio, On Wed, 22 Aug 2018, Junio C Hamano wrote: > Johannes Schindelin writes: > > > FWIW I am a lot more bold about these builtins, and want to get them > > into Git for Windows v2.19.0, either as full replacements, or like I > > did with the difftool: by offering them as experimental options in the > > installer. > > > > Maybe this will remain a dream, but maybe, just maybe, I will manage > > to do this, with your help. > > Heh, I do not think you need much help from me to cut GfW releases. I know. And that's why I asked for your help with the three massive patch sets in flight that convert scripted things to builtins instead. > In any case, thanks for a quick "last mile" patch, which unblocked > 'pu' that has been in sorry state for about a week or so. I've > inserted a new topic branch between -5 and -6 that (1) merges the > other track into -5 and (2) applies your "call it in a new way" > patch, and then rebased -6 on top of the result. If you have a > chance, it is appreciated if you can give a quick eyeballing near > the tip of 'pu' today. It looks good. Thanks. Dscho
Re: Questions about the hash function transition
Ævar Arnfjörð Bjarmason writes: >> - The trailer consists of the following: >> - A copy of the 20-byte SHA-256 checksum at the end of the >> corresponding packfile. >> >> - 20-byte SHA-256 checksum of all of the above. > > We need to update both of these to 32 byte, right? Or are we planning to > truncate the checksums? https://public-inbox.org/git/ca+55afwc7uq61ebnj36pfu_abcxgya4jut-tvppj21hkhre...@mail.gmail.com/
Re: [PATCH] diff.c: pass sign_index to emit_line_ws_markup
Hi Stefan, On Wed, 22 Aug 2018, Stefan Beller wrote: > Instead of passing the sign directly to emit_line_ws_markup, pass only the > index to lookup the sign in diff_options->output_indicators. > > Signed-off-by: Stefan Beller Looks good to me! > --- > diff.c | 12 +--- > 1 file changed, 5 insertions(+), 7 deletions(-) > > So something like this on top of sb/range-diff-colors ? If a resend is > needed I'll squash this in (or carry it as a cleanup patch early in the > series), otherwise we could put this on top. I'd leave this as a separate commit. Ciao, Dscho > > Thanks, > Stefan > > > diff --git a/diff.c b/diff.c > index 03486c35b75..17e33d506a1 100644 > --- a/diff.c > +++ b/diff.c > @@ -1199,10 +1199,11 @@ static void dim_moved_lines(struct diff_options *o) > static void emit_line_ws_markup(struct diff_options *o, > const char *set_sign, const char *set, > const char *reset, > - char sign, const char *line, int len, > + int sign_index, const char *line, int len, > unsigned ws_rule, int blank_at_eof) > { > const char *ws = NULL; > + int sign = o->output_indicators[sign_index]; > > if (o->ws_error_highlight & ws_rule) { > ws = diff_get_color_opt(o, DIFF_WHITESPACE); > @@ -1282,8 +1283,7 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > set = diff_get_color_opt(o, DIFF_FILE_OLD); > } > emit_line_ws_markup(o, set_sign, set, reset, > - > o->output_indicators[OUTPUT_INDICATOR_CONTEXT], > - line, len, > + OUTPUT_INDICATOR_CONTEXT, line, len, > flags & (DIFF_SYMBOL_CONTENT_WS_MASK), 0); > break; > case DIFF_SYMBOL_PLUS: > @@ -1327,8 +1327,7 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > flags &= ~DIFF_SYMBOL_CONTENT_WS_MASK; > } > emit_line_ws_markup(o, set_sign, set, reset, > - o->output_indicators[OUTPUT_INDICATOR_NEW], > - line, len, > + OUTPUT_INDICATOR_NEW, line, len, > flags & DIFF_SYMBOL_CONTENT_WS_MASK, > flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF); > break; > @@ -1372,8 +1371,7 @@ static void emit_diff_symbol_from_struct(struct > diff_options *o, > set = diff_get_color_opt(o, DIFF_CONTEXT_DIM); > } > emit_line_ws_markup(o, set_sign, set, reset, > - o->output_indicators[OUTPUT_INDICATOR_OLD], > - line, len, > + OUTPUT_INDICATOR_OLD, line, len, > flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0); > break; > case DIFF_SYMBOL_WORDS_PORCELAIN: > -- > 2.18.0.265.g16de1b435c9.dirty > >
Questions about the hash function transition
I wanted to send another series to clarify things in hash-function-transition.txt, but for some of the issues I don't know the answer, and I had some questions after giving this another read. So let's discuss that here first. Quoting from the document (available at https://github.com/git/git/blob/v2.19.0-rc0/Documentation/technical/hash-function-transition.txt) > Git hash function transition > > > Objective > - > Migrate Git from SHA-1 to a stronger hash function. Should way say "Migrate Git from SHA-1 to SHA-256" here instead? Maybe it's overly specific, i.e. really we're also describnig how /any/ hash function transition might happen, but having just read this now from start to finish it takes us a really long time to mention (and at first, only offhand) that SHA-256 is the new hash. > [...] > Goals > - > 1. The transition to SHA-256 can be done one local repository at a time. >a. Requiring no action by any other party. >b. A SHA-256 repository can communicate with SHA-1 Git servers > (push/fetch). >c. Users can use SHA-1 and SHA-256 identifiers for objects > interchangeably (see "Object names on the command line", below). >d. New signed objects make use of a stronger hash function than > SHA-1 for their security guarantees. > 2. Allow a complete transition away from SHA-1. >a. Local metadata for SHA-1 compatibility can be removed from a > repository if compatibility with SHA-1 is no longer needed. > 3. Maintainability throughout the process. >a. The object format is kept simple and consistent. >b. Creation of a generalized repository conversion tool. > > Non-Goals > - > 1. Add SHA-256 support to Git protocol. This is valuable and the >logical next step but it is out of scope for this initial design. This is a non-goal according to the docs, but now that we have protocol v2 in git, perhaps we could start specifying or describing how this protocol extension will work? > [...] > 3. Intermixing objects using multiple hash functions in a single >repository. But isn't that the goal now per "Translation table" & writing both SHA-1 and SHA-256 versions of objects? > [...] > Pack index > ~~ > Pack index (.idx) files use a new v3 format that supports multiple > hash functions. They have the following format (all integers are in > network byte order): > > - A header appears at the beginning and consists of the following: > - The 4-byte pack index signature: '\377t0c' > - 4-byte version number: 3 > - 4-byte length of the header section, including the signature and > version number > - 4-byte number of objects contained in the pack > - 4-byte number of object formats in this pack index: 2 > - For each object format: > - 4-byte format identifier (e.g., 'sha1' for SHA-1) So, given that we have 4-byte limit and have decided on SHA-256 are we just going to call this 'sha2'? That might be confusingly ambiguous since SHA2 is a standard with more than just SHA-256, maybe 's256', or maybe we should give this 8 bytes with trailing \0s so we can have "SHA-1\0\0\0" and "SHA-256\0"? > [...] > - The trailer consists of the following: > - A copy of the 20-byte SHA-256 checksum at the end of the > corresponding packfile. > > - 20-byte SHA-256 checksum of all of the above. We need to update both of these to 32 byte, right? Or are we planning to truncate the checksums? This seems like just a mistake when we did s/NewHash/SHA-256/g, but then again it was originally "20-byte NewHash checksum" ever since 752414ae43 ("technical doc: add a design doc for hash function transition", 2017-09-27), so what do we mean here? > Loose object index > ~~ > A new file $GIT_OBJECT_DIR/loose-object-idx contains information about > all loose objects. Its format is > > # loose-object-idx > (sha256-name SP sha1-name LF)* > > where the object names are in hexadecimal format. The file is not > sorted. > > The loose object index is protected against concurrent writes by a > lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose > object: > > 1. Write the loose object to a temporary file, like today. > 2. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock. > 3. Rename the loose object into place. > 4. Open loose-object-idx with O_APPEND and write the new object > 5. Unlink loose-object-idx.lock to release the lock. > > To remove entries (e.g. in "git pack-refs" or "git-prune"): > > 1. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the >lock. > 2. Write the new content to loose-object-idx.lock. > 3. Unlink any loose objects being removed. > 4. Rename to replace loose-object-idx, releasing the lock. Do we expect multiple concurrent writers to poll the lock if they can't aquire it right away? I.e. concurrent "git commit" would block? Has this overall approach been benchmarked somewhere? I wonder if some lock-less variant of this would
Re: [ANNOUNCE] Git v2.19.0-rc0
Jeff King writes: > Here are numbers for p0001.2 run against linux.git on a few > versions. This is using -O2 with gcc 8.2.0. > > Test v2.18.0 v2.19.0-rc0 HEAD > > -- > 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) > -1.0% I see what you did to the formatting here, which is a topic of another thread ;-). Thanks, as Derrick also noted, I agree this is an appropriate workaround for the upcoming release and we may want to explore hasheq() and other solutions as part of the effort during the next cycle (which can start now if people are bored---after all working on the codebase is the best way to hunt for recent bugs). Thanks, will queue. > diff --git a/cache.h b/cache.h > index b1fd3d58ab..4d014541ab 100644 > --- a/cache.h > +++ b/cache.h > @@ -1023,6 +1023,16 @@ extern const struct object_id null_oid; > > static inline int hashcmp(const unsigned char *sha1, const unsigned char > *sha2) > { > + /* > + * This is a temporary optimization hack. By asserting the size here, > + * we let the compiler know that it's always going to be 20, which lets > + * it turn this fixed-size memcmp into a few inline instructions. > + * > + * This will need to be extended or ripped out when we learn about > + * hashes of different sizes. > + */ > + if (the_hash_algo->rawsz != 20) > + BUG("hash size not yet supported by hashcmp"); > return memcmp(sha1, sha2, the_hash_algo->rawsz); > }
Re: [ANNOUNCE] Git v2.19.0-rc0
Derrick Stolee writes: > I was thinking that having a mitigation for 2.19 is best, and then we > can focus as part of the 2.20 cycle how we can properly avoid this > cost, especially when 32 is a valid option. > ... > ... to > take a 'hasheq' approach [2] like Peff suggested [3]. Since that > requires auditing all callers of hashcmp to see if hasheq is > appropriate, it is not a good solution for 2.19 but (in my opinion) > should be evaluated as part of the 2.20 cycle. Thanks for thoughtful comments. I think it makes sense to go with the "tell compiler hashcmp() currently is only about 20-byte array" for now, as it is trivial to see why it cannot break things. During 2.20 cycle, we may find out that hasheq() abstraction performs better than hashcmp() with multiple lengths, and end up doing that "audit and replace to use hasheq()". But as you said, it probably isnot a good idea to rush it in this cycle.
[PATCH v2] range-diff: update stale summary of --no-dual-color
275267937b (range-diff: make dual-color the default mode, 2018-08-13) replaced --dual-color with --no-dual-color but left the option's summary untouched. Rewrite the summary to describe --no-dual-color rather than dual-color. Helped-by: Jonathan Nieder Signed-off-by: Kyle Meyer --- builtin/range-diff.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/range-diff.c b/builtin/range-diff.c index f52d45d9d6..057afdbf46 100644 --- a/builtin/range-diff.c +++ b/builtin/range-diff.c @@ -25,7 +25,7 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix) OPT_INTEGER(0, "creation-factor", _factor, N_("Percentage by which creation is weighted")), OPT_BOOL(0, "no-dual-color", _color, - N_("color both diff and diff-between-diffs")), + N_("color only based on the diff-between-diffs")), OPT_END() }; int i, j, res = 0; -- 2.18.0
Re: [PATCH v3 7/7] submodule: support reading .gitmodules even when it's not checked out
On Wed, 22 Aug 2018 08:29:25 -0700 Junio C Hamano wrote: > Antonio Ospite writes: > [...] > >> > > + else if (get_oid(GITMODULES_HEAD, ) >= 0) > >> > > + config_source.blob = GITMODULES_HEAD; > >> > > >> Would using ":.gitmodules" instead of "HEAD:.gitmodules" be enough? > > Yeah, either "instead of", or "in addition" (i.e. "try the index > version in addition, before falling further back to the HEAD > version"), would be more consistent with the remainder of the system > (or, at least where the remainder of the system wants to go). > OK, I now tested with both "rm .gitmodules" and "git rm .gitmodules" and I see why one would want to try _both_ ":.gitmodules" and "HEAD:.gitmodules". I'll go with "in addition" then, adding tests for both the scenarios. > >> If so, what name should I use instead of GITMODULES_HEAD? > >> GITMODULES_BLOB is already taken for something different, maybe > >> GITMODULES_REF or GITMODULES_OBJECT? > > I do not know why you want to refrain from spelling them out as > "HEAD:.gitmodules" and ":.gitmodules"; at least to me the extra > layer of names do not look like they are making the code easier > to understand that much. > This is in the spirit of commit 4c0eeafe47 (cache.h: add GITMODULES_FILE macro, 2017-08-02), IIRC this was done mainly to get help from the preprocessor to spot typos: I caught myself writing ".gitmdoules" several times; GITMDOULES_FILE would not compile. If this makes sense I'll use GITMODULES_INDEX and GITMODULES_HEAD. Thanks, Antonio -- Antonio Ospite https://ao2.it https://twitter.com/ao2it A: Because it messes up the order in which people normally read text. See http://en.wikipedia.org/wiki/Posting_style Q: Why is top-posting such a bad thing?
Re: [feature] how to output absolute paths in git diff? => --show-abs-path
> Wanting such a feature seems sensible happy to hear that! > So this seems to work for --no-index, or if it doesn't what situations > doesn't it work in? cases where path arguments are already absolute (as I showed earlier) > Is this a mistake, or would you only like --show-abs-paths to implicitly > supply --src-prefix, but not --dst-prefix? If so, why? notice I didn't use `--show-abs-paths` in that example; I'm showing what `git diff` currently outputs (the `could show` meant depending on your use case; eg when `get_file1` returns an absolute path and `get_file2` returns a relative one) > Ah, so it's about supplying both the prefix *and* absolute paths, whereas I > see without --no-index we seem to handle this sort of thing just fine: indeed, without `--no-index` things work just fine as I noted in https://stackoverflow.com/questions/22698505/how-to-show-full-paths-in-git-diff. The problem is with `--no-index` I tried messing around with `--src-prefix` and `--dst-prefix` to remedy this but as I showed, it can't work currently. > without --no-index we seem to handle this sort of thing just fine: you also need `--relative` in case you're not at repo root in your snippet (that' what I'm using, without `--no-index`) > This is because the default prefixes are a/ and b/, respectively that seems buggy: with default options I get: --- a/Users/timothee/help0.txt +++ b/help1.txt with `--src-prefix=FOO ` and `--dst-prefix=FOO ` I get: --- FOOUsers/timothee/help0.txt +++ FOOhelp1.txt this seems buggy because there's not good option for FOO: when FOO = /, relative paths become a broken absolute path (/help1.txt) when FOO = ./, absolute paths become a broken relative path (./Users/timothee/help0.txt) I propose instead to show: with `--src-prefix=FOO ` and `--dst-prefix=FOO ` I get: --- join(FOO,path1) +++ join(FOO,path2) where join(prefix, path) simply appends prefix to path, taking care of avoiding a double `//` in case prefix ends in / and path starts with /, that way, the defauls (with a/, b/) are unchanged and we can have: with `--src-prefix=` and `--dst-prefix=` (empty FOO): --- /Users/timothee/help0.txt +++ help1.txt => the paths are not broken ## summary: * `--show-abs-paths` would be useful * `--src-prefix=FOO ` and `--dst-prefix=FOO ` could use join(FOO,path) instead of the currently used join(FOO,path1.removeLeadingSlash) On Thu, Aug 23, 2018 at 2:42 AM Ævar Arnfjörð Bjarmason wrote: > > On Thu, Aug 23, 2018 at 11:16 AM Timothee Cour > wrote: > > > > This has all the context: > > https://stackoverflow.com/questions/22698505/how-to-show-full-paths-in-git-diff > > It's helpful to copy it anyway, so we can discuss it here: > > QUOTE > > How do I show full paths in git diff? One can use '--dst-prefix=$PWD' > and '--src-prefix=$PWD' but this is fragile as it won't work in many > cases, eg with --no-index, or when running the commond from a > subdirectory without using --relative=realpath_to_cwd > > END QUOTE > > Wanting such a feature seems sensible. But I'm unclear on the details. > > You say that --{src,dst}-prefix is fragile and doesn't work for > --no-index. But if I do this: > > ( > cd /tmp && > echo foo >a && > echo bar >b && > git --no-pager diff --src-prefix=$PWD/ --dst-prefix=$PWD/ a b > ) > > I get this diff: > > diff --git /tmp/a /tmp/b > new file mode 100644 > index 257cc56..5716ca5 100644 > --- /tmp/a > +++ /tmp/b > @@ -1 +1 @@ > -foo > +bar > > So this seems to work for --no-index, or if it doesn't what situations > doesn't it work in? > > > I'd like `--show-abs-path` to show absolute paths in: > > git diff --show-abs-path args... > > > > eg: > > git diff --no-index `get_file1` `get_file2` > > could show: > > --- a/Users/timothee/temp/ripgrep/help0.txt > > +++ b/help1.txt > > Is this a mistake, or would you only like --show-abs-paths to > implicitly supply --src-prefix, but not --dst-prefix? If so, why? > > > * passing '--dst-prefix=$PWD' and '--src-prefix=$PWD' doesn't help > > because path arguments could be absolute, so it'll create > > $PWD/Users/timothee/temp/ripgrep/help0.txt (wrong) > > Ah, so it's about supplying both the prefix *and* absolute paths, > whereas I see without --no-index we seem to handle this sort of thing > just fine: > > git diff --src-prefix=$PWD/ --dst-prefix=$PWD HEAD~.. $PWD/some-file > > > * passing '--dst-prefix=.' will behave weirdly, replacing leading `/` > > by `.` (seems wrong) > > diff --git .Users/timothee/temp/ripgrep/help0.txt b/help1.txt > > This is because the default prefixes are a/ and b/, respectively, and > the option allows you to entirely replace them. E.g. imagine needing > "../some-relative-path/" > > > NOTE: I'm invoking the `git diff` command via a more complicated case > > (with multiple arguments including git diff flags and git diff files), > > so it's awkward for me to parse which arguments correspond to a file > > vs a flag (ie prevents easily converting
Re: [ANNOUNCE] Git v2.19.0-rc0
On 8/23/2018 1:04 AM, Jeff King wrote: On Thu, Aug 23, 2018 at 03:47:07AM +, brian m. carlson wrote: I expect that's going to be the case as well. I have patches that wire up actual SHA-256 support in my hash-impl branch. However, having said that, I'm happy to defer to whatever everyone else thinks is best for 2.19. The assert solution would be fine with me in this situation, and if we need to pull it out in the future, that's okay with me. I don't really have a strong opinion on this either way, so if someone else does, please say so. I have somewhat more limited availability over the next couple days, as I'm travelling on business, but I'm happy to review a patch (and it seems like Peff has one minus the actual commit message). I just posted the patch elsewhere in the thread. Thank you for that! I think you can safely ignore the rest of it if you are otherwise occupied. Even if v2.19 ships without some mitigation, I don't know that it's all that big a deal, given the numbers I generated (which for some reason are less dramatic than Stolee's). My numbers may be more dramatic because my Linux environment is a virtual machine. I was thinking that having a mitigation for 2.19 is best, and then we can focus as part of the 2.20 cycle how we can properly avoid this cost, especially when 32 is a valid option. Around the time that my proposed approaches were getting vetoed for alignment issues, I figured I was out of my depth here. I reached out to Daniel Lemire (of EWAH bitmap fame) on Twitter [1]. His blog is full of posts of word-based approaches to different problems, so I thought he might know something off the top of his head that would be applicable. His conclusion (after looking only a short time) was to take a 'hasheq' approach [2] like Peff suggested [3]. Since that requires auditing all callers of hashcmp to see if hasheq is appropriate, it is not a good solution for 2.19 but (in my opinion) should be evaluated as part of the 2.20 cycle. Of course, if someone with knowledge of word-alignment issues across the platforms we support knows how to enforce an alignment for object_id, then something word-based like [4] could be reconsidered. Thanks, everyone! -Stolee [1] https://twitter.com/stolee/status/1032312965754748930 [2] https://lemire.me/blog/2018/08/22/avoid-lexicographical-comparisons-when-testing-for-string-equality/ [3] https://public-inbox.org/git/20180822030344.ga14...@sigill.intra.peff.net/ [4] https://public-inbox.org/git/7ea416cf-b043-1274-e161-85a8780b8...@gmail.com/
[PATCH] t/lib-rebase.sh: support explicit 'pick' commands in 'fake_editor.sh'
The verbose output of the test 'reword without issues functions as intended' in 't3423-rebase-reword.sh', added in a9279c6785 (sequencer: do not squash 'reword' commits when we hit conflicts, 2018-06-19), contains the following error output: sed: -e expression #1, char 2: extra characters after command This error comes from within the 'fake-editor.sh' script created by 'lib-rebase.sh's set_fake_editor() function, and the root cause is the FAKE_LINES="pick 1 reword 2" variable in the test in question, in particular the "pick" word. 'fake-editor.sh' assumes 'pick' to be the default rebase command and doesn't support an explicit 'pick' command in FAKE_LINES. As a result, 'pick' will be used instead of a line number when assembling the following 'sed' script: sed -n picks/^pick/pick/p which triggers the aforementioned error. Luckily, this didn't affect the test's correctness: the erroring 'sed' command doesn't write anything to the todo script, and processing the rest of FAKE_LINES generates the desired todo script, as if that 'pick' command were not there at all. The minimal fix would be to remove the 'pick' word from FAKE_LINES, but that would leave us susceptible to similar issues in the future. Instead, teach the fake-editor script to recognize an explicit 'pick' command, which is still a fairly trivial change. In the future we might want to consider reinforcing this fake editor script with an &&-chain and stricter parsing of the FAKE_LINES variable (e.g. to error out when encountering unknown rebase commands or commands and line numbers in the wrong order). Signed-off-by: SZEDER Gábor --- t/lib-rebase.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh index 25a77ee5cb..592865d019 100644 --- a/t/lib-rebase.sh +++ b/t/lib-rebase.sh @@ -47,7 +47,7 @@ set_fake_editor () { action=pick for line in $FAKE_LINES; do case $line in - squash|fixup|edit|reword|drop) + pick|squash|fixup|edit|reword|drop) action="$line";; exec*) echo "$line" | sed 's/_/ /g' >> "$1";; -- 2.19.0.rc0.136.gd2dd172e64
Test failures on OpenBSD
This is on OpenBSD 6.2 amd64 with my "tests: fix and add lint for non-portable head -c N" patch (which fixes one failure). $ for t in t1305-config-include.sh t1308-config-set.sh t5004-archive-corner-cases.sh t5552-skipping-fetch-negotiator.sh; do ./$t --no-color -v -x 2>&1 | grep -B10 "^not ok"; done + cd bar + echo [includeIf "gitdir:bar/"]path=bar7 + >> .git/config + echo [test]seven=7 + > .git/bar7 + echo 7 + > expect + git config test.seven + > actual error: last command exited with $?=1 not ok 27 - conditional include, gitdir matching symlink -- + cd bar + echo [includeIf "gitdir/i:BAR/"]path=bar8 + >> .git/config + echo [test]eight=8 + > .git/bar8 + echo 8 + > expect + git config test.eight + > actual error: last command exited with $?=1 not ok 28 - conditional include, gitdir matching symlink, icase test_cmp expect actual + echo Error (-1) reading configuration file a-directory. + > expect + mkdir a-directory + test_expect_code 2 test-tool config configset_get_value foo.bar a-directory + 2> output Value not found for "foo.bar" test_expect_code: command exited with 1, we wanted 2 test-tool config configset_get_value foo.bar a-directory error: last command exited with $?=1 not ok 23 - proper error on directory "files" + make_dir extract + tar xf subtree-all.tar -C extract tar: Cannot identify format. Searching... tar: End of archive volume 1 reached tar: Sorry, unable to determine archive format. error: last command exited with $?=1 + rm -rf extract + exit 1 + eval_ret=1 + : not ok 9 - archive empty subtree with no pathspec -- + make_dir extract + tar xf subtree-path.tar -C extract tar: Cannot identify format. Searching... tar: End of archive volume 1 reached tar: Sorry, unable to determine archive format. error: last command exited with $?=1 + rm -rf extract + exit 1 + eval_ret=1 + : not ok 10 - archive empty subtree by direct pathspec 'git [...] -- [...]' fatal: ambiguous argument 'c7': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' No have c7 (c7) error: last command exited with $?=1 + test_unconfig -C client fetch.negotiationalgorithm + exit 1 + eval_ret=1 + : not ok 1 - commits with no parents are sent regardless of skip distance -- Author: A U Thor 1 file changed, 1 insertion(+) create mode 100644 to_fetch.t + git init client Initialized empty Git repository in /home/avar/g/git/t/trash directory.t5552-skipping-fetch-negotiator/client/.git/ + seq 11 ./t5552-skipping-fetch-negotiator.sh: seq: not found + git -C client checkout c5 error: pathspec 'c5' did not match any file(s) known to git error: last command exited with $?=1 not ok 3 - when two skips collide, favor the larger one -- + git init client Initialized empty Git repository in /home/avar/g/git/t/trash directory.t5552-skipping-fetch-negotiator/client/.git/ + seq 8 ./t5552-skipping-fetch-negotiator.sh: seq: not found + seq 19 ./t5552-skipping-fetch-negotiator.sh: seq: not found + pwd + git -C server fetch --no-tags /home/avar/g/git/t/trash directory.t5552-skipping-fetch-negotiator/client b1:refs/heads/b1 fatal: Couldn't find remote ref b1 error: last command exited with $?=128 not ok 6 - do not send "have" with ancestors of commits that server ACKed Full output at https://gitlab.com/snippets/1747801 Some of this, like the t5552-skipping-fetch-negotiator.sh failures are new in 2.19.0 (those go away with s/seq/test_seq). But some are older. E.g. the archive corner cases failure is becuse that test wants unzip & GNU tar, which OpenBSD upstream has patched already: https://github.com/openbsd/ports/blob/master/devel/git/patches/patch-t_test-lib_sh
Re: [feature] how to output absolute paths in git diff? => --show-abs-path
On Thu, Aug 23, 2018 at 11:16 AM Timothee Cour wrote: > > This has all the context: > https://stackoverflow.com/questions/22698505/how-to-show-full-paths-in-git-diff It's helpful to copy it anyway, so we can discuss it here: QUOTE How do I show full paths in git diff? One can use '--dst-prefix=$PWD' and '--src-prefix=$PWD' but this is fragile as it won't work in many cases, eg with --no-index, or when running the commond from a subdirectory without using --relative=realpath_to_cwd END QUOTE Wanting such a feature seems sensible. But I'm unclear on the details. You say that --{src,dst}-prefix is fragile and doesn't work for --no-index. But if I do this: ( cd /tmp && echo foo >a && echo bar >b && git --no-pager diff --src-prefix=$PWD/ --dst-prefix=$PWD/ a b ) I get this diff: diff --git /tmp/a /tmp/b new file mode 100644 index 257cc56..5716ca5 100644 --- /tmp/a +++ /tmp/b @@ -1 +1 @@ -foo +bar So this seems to work for --no-index, or if it doesn't what situations doesn't it work in? > I'd like `--show-abs-path` to show absolute paths in: > git diff --show-abs-path args... > > eg: > git diff --no-index `get_file1` `get_file2` > could show: > --- a/Users/timothee/temp/ripgrep/help0.txt > +++ b/help1.txt Is this a mistake, or would you only like --show-abs-paths to implicitly supply --src-prefix, but not --dst-prefix? If so, why? > * passing '--dst-prefix=$PWD' and '--src-prefix=$PWD' doesn't help > because path arguments could be absolute, so it'll create > $PWD/Users/timothee/temp/ripgrep/help0.txt (wrong) Ah, so it's about supplying both the prefix *and* absolute paths, whereas I see without --no-index we seem to handle this sort of thing just fine: git diff --src-prefix=$PWD/ --dst-prefix=$PWD HEAD~.. $PWD/some-file > * passing '--dst-prefix=.' will behave weirdly, replacing leading `/` > by `.` (seems wrong) > diff --git .Users/timothee/temp/ripgrep/help0.txt b/help1.txt This is because the default prefixes are a/ and b/, respectively, and the option allows you to entirely replace them. E.g. imagine needing "../some-relative-path/" > NOTE: I'm invoking the `git diff` command via a more complicated case > (with multiple arguments including git diff flags and git diff files), > so it's awkward for me to parse which arguments correspond to a file > vs a flag (ie prevents easily converting input file arguments to > absolute paths), but `git` could do it easily via a flag, eg > `--show-abs-path`
[feature] how to output absolute paths in git diff? => --show-abs-path
This has all the context: https://stackoverflow.com/questions/22698505/how-to-show-full-paths-in-git-diff I'd like `--show-abs-path` to show absolute paths in: git diff --show-abs-path args... eg: git diff --no-index `get_file1` `get_file2` could show: --- a/Users/timothee/temp/ripgrep/help0.txt +++ b/help1.txt * passing '--dst-prefix=$PWD' and '--src-prefix=$PWD' doesn't help because path arguments could be absolute, so it'll create $PWD/Users/timothee/temp/ripgrep/help0.txt (wrong) * passing '--dst-prefix=.' will behave weirdly, replacing leading `/` by `.` (seems wrong) diff --git .Users/timothee/temp/ripgrep/help0.txt b/help1.txt NOTE: I'm invoking the `git diff` command via a more complicated case (with multiple arguments including git diff flags and git diff files), so it's awkward for me to parse which arguments correspond to a file vs a flag (ie prevents easily converting input file arguments to absolute paths), but `git` could do it easily via a flag, eg `--show-abs-path`
[PATCH] tests: fix and add lint for non-portable head -c N
The "head -c BYTES" option is non-portable (not in POSIX[1]). Change such invocations to use the test_copy_bytes wrapper added in 48860819e8 ("t9300: factor out portable "head -c" replacement", 2016-06-30). This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check mmap reads", 2018-06-14), which has been breaking t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports already have a similar workaround after their upgrade to 2.18.0[2]. I have not tested this on IRIX, but according to 4de0bbd898 ("t9300: use perl "head -c" clone in place of "dd bs=1 count=16000" kluge", 2010-12-13) this invocation would have broken things there too. Also, change a valgrind-specific codepath in test-lib.sh to use this wrapper. Given where valgrind runs I don't think this would ever become a portability issue in practice, but it's easier to just use the wrapper than introduce some exception for the "make test-lint" check being added here. 1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html 2. https://github.com/openbsd/ports/commit/08d5d82eaefe5cf2f125ecc0c6a57df9cf91350c#diff-f7d3c4fabeed1691620d608f1534f5e5 Signed-off-by: Ævar Arnfjörð Bjarmason --- Junio: Even though this isn't a 2.19.0-rc0 regression I think it makes sense for 2.19.0. The fix is trivial, and it'll unbreak (at least some of) the tests on stock git on OpenBSD. OpenBSD guys: If you CC the git mailing list when you find you need to apply patches like these, we're happy to fix this more pro-actively. I just happened to be testing the upcoming 2.19.0 on OpenBSD and spotted this. t/check-non-portable-shell.pl | 1 + t/t5310-pack-bitmaps.sh | 2 +- t/test-lib.sh | 4 ++-- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl index d5823f71d8..94a7e6165e 100755 --- a/t/check-non-portable-shell.pl +++ b/t/check-non-portable-shell.pl @@ -35,6 +35,7 @@ sub err { chomp; } + /\bhead\s+-c\b/ and err 'head -c is not portable (use test_copy_bytes BYTES out)'; /\bsed\s+-i/ and err 'sed -i is not portable'; /\becho\s+-[neE]/ and err 'echo with option is not portable (use printf)'; /^\s*declare\s+/ and err 'arrays/declare not portable'; diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 557bd0d0c0..7bff7923f2 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -335,7 +335,7 @@ test_expect_success 'truncated bitmap fails gracefully' ' git rev-list --use-bitmap-index --count --all >expect && bitmap=$(ls .git/objects/pack/*.bitmap) && test_when_finished "rm -f $bitmap" && - head -c 512 <$bitmap >$bitmap.tmp && + test_copy_bytes 512 <$bitmap >$bitmap.tmp && mv -f $bitmap.tmp $bitmap && git rev-list --use-bitmap-index --count --all >actual 2>stderr && test_cmp expect actual && diff --git a/t/test-lib.sh b/t/test-lib.sh index 8bb0f4348e..44288cbb59 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -867,7 +867,7 @@ then # handle only executables, unless they are shell libraries that # need to be in the exec-path. test -x "$1" || - test "# " = "$(head -c 2 <"$1")" || + test "# " = "$(test_copy_bytes 2 <"$1")" || return; base=$(basename "$1") @@ -882,7 +882,7 @@ then # do not override scripts if test -x "$symlink_target" && test ! -d "$symlink_target" && - test "#!" != "$(head -c 2 < "$symlink_target")" + test "#!" != "$(test_copy_bytes 2 <"$symlink_target")" then symlink_target=../valgrind.sh fi -- 2.18.0.865.gffc8e1a3cd6
how to output absolute paths in git diff?
This has all the context: https://stackoverflow.com/questions/22698505/how-to-show-full-paths-in-git-diff I'd like `--show-abs-path` to show absolute paths in: git diff --show-abs-path args... eg: git diff --no-index `get_file1` `get_file2` could show: --- a/Users/timothee/temp/ripgrep/help0.txt +++ b/help1.txt * passing '--dst-prefix=$PWD' and '--src-prefix=$PWD' doesn't help because path arguments could be absolute, so it'll create $PWD/Users/timothee/temp/ripgrep/help0.txt (wrong) * passing '--dst-prefix=.' will behave weirdly, replacing leading `/` by `.` (seems wrong) diff --git .Users/timothee/temp/ripgrep/help0.txt b/help1.txt NOTE: I'm invoking the `git diff` command via a more complicated case (with multiple arguments including git diff flags and git diff files), so it's awkward for me to parse which arguments correspond to a file vs a flag (ie prevents easily converting input file arguments to absolute paths), but `git` could do it easily via a flag, eg `--show-abs-path`
Re: [PATCH] range-diff: update stale summary of --no-dual-color
Kyle Meyer wrote: > Jonathan Nieder writes: >> I think what you mean is something like "color only based on the >> diff-between-diffs". > > Yeah, I that sounds OK to me. Thanks. Want to prepare a patch along those lines to finish this off? I'm off to sleep now, can look at this again tomorrow morning. Thanks, Jonathan