Re: [PATCH v1] Configure Git contribution guidelines for github.com
On Mon, Jun 12, 2017 at 5:52 PM, Junio C Hamano wrote: > "Philip Oakley" writes: > >> From: "Lars Schneider" >>> Many open source projects use github.com for their contribution process. >>> Although we mirror the Git core repository to github.com [1] we do not >>> use any other github.com service. This is unknown/unexpected to a >>> number of (potential) contributors and consequently they create Pull >>> Requests against our mirror with their contributions. These Pull >>> Requests become stall [2]. This is frustrating to them as they think we >>> ignore them and it is also unsatisfactory for us as we miss potential >>> code improvements and/or new contributors. >>> >>> GitHub offers a way to notify Pull Request contributors about the >>> contribution guidelines for a project [3]. Let's make use of this! >>> >>> [1] https://github.com/git/git >>> [2] https://github.com/git/git/pulls >>> [3] >>> https://help.github.com/articles/creating-a-pull-request-template-for-your-repository/ >>> >>> Signed-off-by: Lars Schneider >>> --- [The latest patch in this thread looks good to me to, thanks Lars] >> I see there are currently 84 open PRs (13 in the last 14 days), so it >> is real. > > Not so insignificant fraction of these are done purely for the > purpose of using submitgit, though. In other words, if submitgit > were improved not to require a pull request against [1] (instead, it > could be pointed at a branch in a contributor's repository and do > the fromatting), these numbers will go down. There are things we get out of this that would regress if submitGit were changed this way: * Now when you submit a pull request you get a Travis build for git/git, I don't get this if I push to any random branch in my avar/git, and although I could probably set it up, it's extra work etc. * I like being able to "git fetch" patches I'm reviewing. Now I can just set "+refs/pull/*:refs/remotes/origin/pull/*" in the refspec for git/git, if it were pulling from target repos I'd need to "git remote add" for each one, not a big deal, but less convenient. * There would be no single place to list submitGit requests, which you can do now through the pull page, although I think this is largely stale now because nothing auto-closes them if they're merged (by which point they'll have different sha1s), but that could be improved with some bot... There's probably ways to do this without git/git pull requests, but I don't see what problem would be solved by moving this off git/git, even if there's open requests submitGit is the only thing using these, and any confusion about that pull UI would remain if it wasn't (AFAIK there's no way to completely disable pull requests on github, but I may be wrong).
Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch
On Mon, 2017-06-12 at 17:37 -0400, Jeff King wrote: > On Mon, Jun 12, 2017 at 02:31:25PM -0700, Junio C Hamano wrote: > > > I think "staged for commit" still makes perfect sense even when > > > we are > > > just asking "what's the current status" and not "what would it > > > look like > > > if I were to commit". > > > > > > And avoiding the word "index" is worth-while here, I think. I am > > > not in > > > general of the "let's hide the index" camp" but it is a technical > > > term. > > > If we can say the same thing in a way that is understood both by > > > people > > > who know what the index is and people who do not, that seems like > > > a win. > > > > I do not mind "Changes not staged yet:". The point was not about > > getting rid of "stage" but about not mentioning "commit", because > > stepping back a bit, if the readers are prepared to accept these > > messages in the mindset that they are guiding them toward their > > next > > commit, "I find 'Initial commit' confusing" would not have been an > > issue in the first place. > > I think the difference is that "Initial commit" is talking about a > _specific_ commit. If we're not working on one, it doesn't make much > sense. But "staged for commit" is not necessarily talking about a > specific commit; we are talking about the purpose of staging > something > in general. You could equally well say "staged for committing" > (though I > think the shorter word sounds more natural). > > Likewise with "to be committed". > > > If we can get rid of 'yet' and 'already' from the above two, that > > would be even better. The point of the exercise is to be > > understood > > by those who do not think in terms of 'preparing for the next > > commit', > > so 'yet', 'already', 'to be committed' are all counter-productive > > for that goal. Those who accept the 'description of the current > > state in the context of preparing for the next commit' are not the > > ones we are trying to help with the suggested three changes. > > I agree that is the goal. My point was that the existing messages are > OK > even if you aren't thinking of preparing for the next commit. Saying > "this is in the index" and "this is staged" are synonyms. Saying > "this > is staged for commit" is likewise a synonym, unless there is some > other > reason we stage things. > > -Peff What about, "not making any assumptions" about what the user would think when he views the output of `git status` ? Why not try some general messages like, * Staged changes * Unstaged changes instead of the messages such as * Changes to be committed, Changes already in the index * Changes not staged for commit, Changes not yet in the index which seem to make assumptions about the user who view the output ? A typical patch could be found below, diff --git a/builtin/commit.c b/builtin/commit.c index 1d805f5da..3ed8e40bc 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1364,6 +1364,7 @@ int cmd_status(int argc, const char **argv, const char *prefix) usage_with_options(builtin_status_usage, builtin_status_options); status_init_config(&s, git_status_config); + s.commit_template = 1; argc = parse_options(argc, argv, prefix, builtin_status_options, builtin_status_usage, 0); diff --git a/wt-status.c b/wt-status.c index 037548496..55a7bd757 100644 --- a/wt-status.c +++ b/wt-status.c @@ -196,9 +196,14 @@ static void wt_longstatus_print_cached_header(struct wt_status *s) { const char *c = color(WT_STATUS_HEADER, s); - status_printf_ln(s, c, _("Changes to be committed:")); + if (s->commit_template) + status_printf_ln(s, c, _("Changes to be committed:")); + else + status_printf_ln(s, c, _("Staged changes:")); + if (!s->hints) return; + if (s->whence != FROM_COMMIT) ; /* NEEDSWORK: use "git reset --unresolve"??? */ else if (!s->is_initial) @@ -214,7 +219,11 @@ static void wt_longstatus_print_dirty_header(struct wt_status *s, { const char *c = color(WT_STATUS_HEADER, s); - status_printf_ln(s, c, _("Changes not staged for commit:")); + if (s->commit_template) + status_printf_ln(s, c, _("Changes not staged for commit:")); + else + status_printf_ln(s, c, _("Unstaged changes:")); + if (!s->hints) return; if (!has_deleted) @@ -1576,7 +1585,10 @@ static void wt_longstatus_print(struct wt_status *s) if (s->is_initial) { status_printf_ln(s, color(WT_STATUS_HEADER, s), "%s", ""); - status_printf_ln(s, color(WT_STATUS_HEADER, s), _("Initial commit")); + status_printf_ln(s, color(WT_STATUS_HEADER, s), + s->commit_template + ? _("Initial commit") + : _("No commits yet on the branch")
Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch
On Thu, Jun 15, 2017 at 01:49:20PM +0530, Kaartic Sivaraam wrote: > What about, "not making any assumptions" about what the user would > think when he views the output of `git status` ? Why not try some > general messages like, > > * Staged changes > * Unstaged changes > > instead of the messages such as > > * Changes to be committed, Changes already in the index > * Changes not staged for commit, Changes not yet in the index > > which seem to make assumptions about the user who view the output ? Saying just "staged changes" is definitely accurate. I don't know if some users would find that too terse, too. The phrase "not staged for commit" gives more information if you don't know what "staged" means in the Git world. -Peff
[PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself
There is no portable way to pass timezone information to strftime. Add parameters for timezone offset and name to strbuf_addftime and let it handle the timezone-related format specifiers %z and %Z internally. Callers can opt out for %Z by passing NULL as timezone name. %z is always handled internally -- this helps on Windows, where strftime would expand it to a timezone name (same as %Z), in violation of POSIX. Modifiers are not handled, e.g. %Ez is still passed to strftime. Use an empty string as timezone name in show_date (the only current caller) for now because we only have the timezone offset in non-local mode. POSIX allows %Z to resolve to an empty string in case of missing information. Helped-by: Ulrich Mueller Helped-by: Jeff King Signed-off-by: Rene Scharfe --- Changes from v1: - Always handle %z internally. - Move tests from t6006 to t0006, as that's a more appropriate place. - Changed tests to only use %%, %z and %Z to avoid incompatibilities. - Tested on mingw (applies there with patch and some fuzz). date.c | 2 +- strbuf.c| 41 + strbuf.h| 11 --- t/t0006-date.sh | 6 ++ 4 files changed, 52 insertions(+), 8 deletions(-) diff --git a/date.c b/date.c index 63fa99685e..5580577334 100644 --- a/date.c +++ b/date.c @@ -246,7 +246,7 @@ const char *show_date(timestamp_t time, int tz, const struct date_mode *mode) month_names[tm->tm_mon], tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec, tz); else if (mode->type == DATE_STRFTIME) - strbuf_addftime(&timebuf, mode->strftime_fmt, tm); + strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, ""); else strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d", weekday_names[tm->tm_wday], diff --git a/strbuf.c b/strbuf.c index 00457940cf..be3b9e37b1 100644 --- a/strbuf.c +++ b/strbuf.c @@ -785,14 +785,48 @@ char *xstrfmt(const char *fmt, ...) return ret; } -void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm) +void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm, +int tz_offset, const char *tz_name) { + struct strbuf munged_fmt = STRBUF_INIT; size_t hint = 128; size_t len; if (!*fmt) return; + /* +* There is no portable way to pass timezone information to +* strftime, so we handle %z and %Z here. +*/ + for (;;) { + const char *percent = strchrnul(fmt, '%'); + strbuf_add(&munged_fmt, fmt, percent - fmt); + if (!*percent) + break; + fmt = percent + 1; + switch (*fmt) { + case '%': + strbuf_addstr(&munged_fmt, "%%"); + fmt++; + break; + case 'z': + strbuf_addf(&munged_fmt, "%+05d", tz_offset); + fmt++; + break; + case 'Z': + if (tz_name) { + strbuf_addstr(&munged_fmt, tz_name); + fmt++; + break; + } + /* FALLTHROUGH */ + default: + strbuf_addch(&munged_fmt, '%'); + } + } + fmt = munged_fmt.buf; + strbuf_grow(sb, hint); len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm); @@ -804,17 +838,16 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm) * output contains at least one character, and then drop the extra * character before returning. */ - struct strbuf munged_fmt = STRBUF_INIT; - strbuf_addf(&munged_fmt, "%s ", fmt); + strbuf_addch(&munged_fmt, ' '); while (!len) { hint *= 2; strbuf_grow(sb, hint); len = strftime(sb->buf + sb->len, sb->alloc - sb->len, munged_fmt.buf, tm); } - strbuf_release(&munged_fmt); len--; /* drop munged space */ } + strbuf_release(&munged_fmt); strbuf_setlen(sb, sb->len + len); } diff --git a/strbuf.h b/strbuf.h index 80047b1bb7..39d5836abd 100644 --- a/strbuf.h +++ b/strbuf.h @@ -339,9 +339,14 @@ __attribute__((format (printf,2,0))) extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap); /** - * Add the time specified by `tm`, as formatted by `strftime`. - */ -extern void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm); + * Add the time specified by `tm`, as formatte
Re: rs/strbuf-addftime-zZ, was Re: What's cooking in git.git (Jun 2017, #04; Tue, 13)
Am 15.06.2017 um 07:42 schrieb Jeff King: On Thu, Jun 15, 2017 at 01:03:29AM +0200, René Scharfe wrote: But there's more. strftime on Windows doesn't support common POSIX- defined tokens like %F (%Y-%m-%d) and %T (%H:%M:%S). We could handle them as well. Do we want that? At least we'd have to update the added test that uses them.. Here's the full list of tokens in POSIX [1], but not supported by Windows [2]: %C, %D, %F, %G, %R, %T, %V, %e, %g, %h, %n, %r, %t, %u plus the modifiers %E and %O. I don't have a real opinion on that. The point of adding strftime was always to give the user access to whatever their system supports. In particular "%c" which we cannot emulate ourselves. If people want support for those other things on platforms that don't have it, I have no real objection. But I also don't know that it's worth spending time on if nobody is asking for it. Agreed; let's make the tests more focused (i.e. not exercise %F and %T needlessly). René
Re:
I would need your partnership in a transaction and details will disclose to you once i receive your reply. Thanks Saif.
[BUG] git cherry-pick segfaults with local changes in working directory
Hi, This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with the packed checkout here: https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz $ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz $ cd gst-plugins-good $ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba Segmentation fault (core dumped) Without local changes (e.g. after "git stash") it works fine. Also note that the commit that is cherry-picked does not contain any changes to files that have local changes. Backtrace below: Thread 1 "git" received signal SIGSEGV, Segmentation fault. add_index_entry_with_check (option=0, ce=0x0, istate=0x55993900 ) at read-cache.c:1012 1012read-cache.c: No such file or directory. (gdb) bt #0 add_index_entry_with_check (option=0, ce=0x0, istate=0x55993900 ) at read-cache.c:1012 #1 add_index_entry (istate=0x55993900 , ce=0x0, option=0) at read-cache.c:1061 #2 0x55646df6 in merge_content (o=o@entry=0x7fffd550, path=path@entry=0x55a461c0 "docs/plugins/inspect/plugin-shout2send.xml", o_oid=, o_oid@entry=0x55a42adc, o_mode=o_mode@entry=33188, a_oid=a_oid@entry=0x55a42af4, a_mode=a_mode@entry=33188, b_oid=0x55a42b0c, b_mode=33188, rename_conflict_info=0x55a45fa0) at merge-recursive.c:1727 #3 0x55647a1d in process_entry (entry=, path=, o=0x7fffd550) at merge-recursive.c:1885 #4 merge_trees (o=o@entry=0x7fffd550, head=head@entry=0x559b7c98, merge=, merge@entry=0x559b7c70, common=, common@entry=0x559b7cc0, result=result@entry=0x7fffd530) at merge-recursive.c:1948 #5 0x5568a248 in do_recursive_merge (opts=0x7fffdcc0, msgbuf=0x7fffd510, head=0x7fffd620 "\246\256՝s8J\273ݵ\241y\303#j\277\211\266\\\n\377\177", next_label=, base_label=, next=, base=) at sequencer.c:389 #6 do_pick_commit (command=TODO_PICK, commit=commit@entry=0x559a7c60, opts=opts@entry=0x7fffdcc0) at sequencer.c:757 #7 0x5568baa3 in single_pick (opts=0x7fffdcc0, cmit=0x559a7c60) at sequencer.c:1329 #8 sequencer_pick_revisions (opts=0x7fffdcc0) at sequencer.c:1378 #9 0x555d34cf in run_sequencer (argc=1, argc@entry=2, argv=argv@entry=0x7fffe0c0, opts=, opts@entry=0x7fffdcc0) at builtin/revert.c:178 #10 0x555d3928 in cmd_cherry_pick (argc=2, argv=0x7fffe0c0, prefix=) at builtin/revert.c:203 #11 0x55566205 in run_builtin (argv=, argc=, p=) at git.c:373 #12 handle_builtin (argc=2, argv=0x7fffe0c0) at git.c:572 #13 0x555665a2 in run_argv (argv=0x7fffde60, argcp=0x7fffde6c) at git.c:630 #14 cmd_main (argc=, argv=) at git.c:707 #15 0x555655d2 in main (argc=3, argv=0x7fffe0b8) at common-main.c:40 -- Sebastian Dröge, Centricular Ltd · http://www.centricular.com signature.asc Description: This is a digitally signed message part
Which hash function to use, was Re: RFC: Another proposed hash function transition plan
Hi, I thought it better to revive this old thread rather than start a new thread, so as to automatically reach everybody who chimed in originally. On Mon, 6 Mar 2017, Brandon Williams wrote: > On 03/06, brian m. carlson wrote: > > > On Sat, Mar 04, 2017 at 06:35:38PM -0800, Linus Torvalds wrote: > > > > > Btw, I do think the particular choice of hash should still be on the > > > table. sha-256 may be the obvious first choice, but there are > > > definitely a few reasons to consider alternatives, especially if > > > it's a complete switch-over like this. > > > > > > One is large-file behavior - a parallel (or tree) mode could improve > > > on that noticeably. BLAKE2 does have special support for that, for > > > example. And SHA-256 does have known attacks compared to SHA-3-256 > > > or BLAKE2 - whether that is due to age or due to more effort, I > > > can't really judge. But if we're switching away from SHA1 due to > > > known attacks, it does feel like we should be careful. > > > > I agree with Linus on this. SHA-256 is the slowest option, and it's > > the one with the most advanced cryptanalysis. SHA-3-256 is faster on > > 64-bit machines (which, as we've seen on the list, is the overwhelming > > majority of machines using Git), and even BLAKE2b-256 is stronger. > > > > Doing this all over again in another couple years should also be a > > non-goal. > > I agree that when we decide to move to a new algorithm that we should > select one which we plan on using for as long as possible (much longer > than a couple years). While writing the document we simply used > "sha256" because it was more tangible and easier to reference. The SHA-1 transition *requires* a knob telling Git that the current repository uses a hash function different from SHA-1. It would make *a whole of a lot of sense* to make that knob *not* Boolean, but to specify *which* hash function is in use. That way, it will be easier to switch another time when it becomes necessary. And it will also make it easier for interested parties to use a different hash function in their infrastructure if they want. And it lifts part of that burden that we have to consider *very carefully* which function to pick. We still should be more careful than in 2005, when Git was born, and when, incidentally, when the first attacks on SHA-1 became known, of course. We were just lucky for almost 12 years. Now, with Dunning-Kruger in mind, I feel that my degree in mathematics equips me with *just enough* competence to know just how little *even I* know about cryptography. The smart thing to do, hence, was to get involved in this discussion and act as Lt Tawney Madison between us Git developers and experts in cryptography. It just so happens that I work at a company with access to excellent cryptographers, and as we own the largest Git repository on the planet, we have a vested interest in ensuring Git's continued success. After a couple of conversations with a couple of experts who I cannot thank enough for their time and patience, let alone their knowledge about this matter, it would appear that we may not have had a complete enough picture yet to even start to make the decision on the hash function to use. >From what I read, pretty much everybody who participated in the discussion was aware that the essential question is: performance vs security. It turns out that we can have essentially both. SHA-256 is most likely the best-studied hash function we currently know about (*maybe* SHA3-256 has been studied slightly more, but only slightly). All the experts in the field banged on it with multiple sticks and other weapons. And so far, they only found one weakness that does not even apply to Git's usage [*1*]. For cryptography experts, this is the ultimate measure of security: if something has been attacked that intensely, by that many experts, for that long, with that little effect, it is the best we got at the time. And since SHA-256 has become the standard, and more importantly: since SHA-256 was explicitly designed to allow for relatively inexpensive hardware acceleration, this is what we will soon have: hardware support in the form of, say, special CPU instructions. (That is what I meant by: we can have performance *and* security.) This is a rather important point to stress, by the way: BLAKE's design is apparently *not* friendly to CPU instruction implementations. Meaning that SHA-256 will be faster than BLAKE (and even than BLAKE2) once the Intel and AMD CPUs with hardware support for SHA-256 become common. I also heard something really worrisome about BLAKE2 that makes me want to stay away from it (in addition to the difficulty it poses for hardware acceleration): to compete in the SHA-3 contest, BLAKE added complexity so that it would be roughly on par with its competitors. To allow for faster execution in software, this complexity was *removed* from BLAKE to create BLAKE2, making it weaker than SHA-256. Another important point to conside
Re: [BUG] git cherry-pick segfaults with local changes in working directory
On Thu, Jun 15, 2017 at 12:11:50PM +0300, Sebastian Dröge wrote: > This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with > the packed checkout here: > > https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz > > $ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz > $ cd gst-plugins-good > $ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba > Segmentation fault (core dumped) Note that the tarball doesn't have all the necessary objects. Its .git/objects/info/alternates points to another full clone of git://anongit.freedesktop.org/gstreamer/gst-plugins-good. The segfault was fixed in 55e9f0e5c (merge-recursive: handle NULL in add_cacheinfo() correctly, 2016-11-26), which is in v2.11.1. Curiously, after the fix we print: error: addinfo_cache failed for path 'docs/plugins/inspect/plugin-shout2send.xml' but the cherry-pick succeeds anyway. I'm not sure if that's a bug or not. -Peff
Re: preserve untracked cache, was Re: What's cooking in git.git (Jun 2017, #01; Thu, 1)
Hi Junio, On Sat, 3 Jun 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > On Fri, 2 Jun 2017, Junio C Hamano wrote: > > > >> Samuel Lijin writes: > >> > >> >> What is holding this topic up? Anything Ben or I can do to move this > >> >> closer to `next` or even `master`? > >> > > >> > It's in `next` right now (3196d093d6). > >> > >> Thanks for pinging and checking ;-) > >> > >> I think the topic was merged to 'next' on the 23rd of last month and > >> graduated to 'master' in the past few days, together with other > >> topics. > > > > Okay. I never saw any "Will merge to" message, so I got worried. > > Well, I cannot quite help if you are not reading them ;-) Well, I cannot quite help when I am expected to read long "What's cooking" mails with out-of-context information as opposed to replies to the threads discussing the actual patch series. :-P I understand that it would be hard on you if I expected you to untangle your What's cooking discussions into the mail threads with the corresponding patches, of course. That just proves my point about mailing lists being an inadequate means to perform code review... > Issue #06 of May marked it to be merged to 'next': > https://public-inbox.org/git/ > > Issue #07 of May marked it for 'master': > https://public-inbox.org/git/ > > Issue #08 of May kept it (i.e. no issues discovered in the > meantime): > https://public-inbox.org/git/ > > Issue #01 of June reports it in 'master': > https://public-inbox.org/git/ Thank you, Dscho
[PATCH] docs/pretty-formats: stress that %- removes all preceding line-feeds
Signed-off-by: SZEDER Gábor --- A mere plural "line-feeds" was too subtle for me to grasp on first (and second...) reading. Documentation/pretty-formats.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt index 38040e95b..a48d267e2 100644 --- a/Documentation/pretty-formats.txt +++ b/Documentation/pretty-formats.txt @@ -213,8 +213,8 @@ If you add a `+` (plus sign) after '%' of a placeholder, a line-feed is inserted immediately before the expansion if and only if the placeholder expands to a non-empty string. -If you add a `-` (minus sign) after '%' of a placeholder, line-feeds that -immediately precede the expansion are deleted if and only if the +If you add a `-` (minus sign) after '%' of a placeholder, all consecutive +line-feeds immediately preceding the expansion are deleted if and only if the placeholder expands to an empty string. If you add a ` ` (space) after '%' of a placeholder, a space -- 2.13.1.505.g7cc9fcafb
Re: [BUG] git cherry-pick segfaults with local changes in working directory
Hi Jeff, Thanks for the fast reply! On Thu, 2017-06-15 at 06:32 -0400, Jeff King wrote: > On Thu, Jun 15, 2017 at 12:11:50PM +0300, Sebastian Dröge wrote: > > > This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with > > the packed checkout here: > > > > https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz > > > > $ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz > > $ cd gst-plugins-good > > $ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba > > Segmentation fault (core dumped) > > Note that the tarball doesn't have all the necessary objects. Its > .git/objects/info/alternates points to another full clone of > git://anongit.freedesktop.org/gstreamer/gst-plugins-good. Ah good to know, I thought this only happens if you clone with --reference and not otherwise. > The segfault was fixed in 55e9f0e5c (merge-recursive: handle NULL in > add_cacheinfo() correctly, 2016-11-26), which is in v2.11.1. I can confirm that this also fixes my specific problem. Thanks! -- Sebastian Dröge, Centricular Ltd · http://www.centricular.com signature.asc Description: This is a digitally signed message part
Re: pushing for a new hash, was Re: [PATCH 2/3] rebase: Add tests for console output
Hi Ævar, On Fri, 2 Jun 2017, Ævar Arnfjörð Bjarmason wrote: > On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder wrote: > > > > Johannes Schindelin wrote: > >> On Thu, 1 Jun 2017, Stefan Beller wrote: > > > >>> We had a discussion off list how much of the test suite is in bad > >>> shape, and "$ git grep ^index" points out a lot of places as well. > >> > >> Maybe we should call out a specific month (or even a longer period) > >> during which we try to push toward that new hash function, and focus > >> more on those tasks (and on critical bug fixes, if any) than anything > >> else. > > > > Thanks for offering. ;-) > > > > Here's a rough list of some useful tasks, in no particular order: > > > > 1. bc/object-id: This patch series continues, eliminating assumptions > >about the size of object ids by encapsulating them in a struct. > >One straightforward way to find code that still needs to be > >converted is to grep for "sha" --- often the conversion patches > >change function and variable names to refer to oid_ where they used > >to use sha1_, making the stragglers easier to spot. > > > > 2. Hard-coded object ids in tests: As Stefan hinted, many tests beyond > >t00* make assumptions about the exact values of object ids. That's > >bad for maintainability for other reasons beyond the hash function > >transition, too. > > > >It should be possible to suss them out by patching git's sha1 > >routine to use the ones-complement of sha1 (~sha1) instead and > >seeing which tests fail. > > I just hacked this up locally. Would you mind turning that into a patch? I figure this could be a compile-time switch and applied to Git's `master` so that it is easier to find those assumptions, as well as verify fixes on multiple platforms. > > 4. When choosing a hash function, people may argue about performance. > >It would be useful for run some benchmarks for git (running > >the test suite, t/perf tests, etc) using a variety of hash > >functions as input to such a discussion. > > To the extent that such benchmarks matter, it seems prudent to heavily > weigh them in favor of whatever seems to be likely to be the more > common hash function going forward, since those are likely to get > faster through future hardware acceleration. Exactly. As I just mentioned elsewhere, the cryptographers I talked to expect SHA-256 and SHA3-256 to see the most focus of all hash functions, by far. > E.g. Intel announced Goldmont last year which according to one SHA-1 > implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They > only have acceleration for SHA-1 and SHA-256[2] > > 1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385 > > 2. https://en.wikipedia.org/wiki/Goldmont Thanks for digging that up! Very valuable information. Ciao, Dscho
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote: > Footnote *1*: SHA-256, as all hash functions whose output is essentially > the entire internal state, are susceptible to a so-called "length > extension attack", where the hash of a secret+message can be used to > generate the hash of secret+message+piggyback without knowing the secret. > This is not the case for Git: only visible data are hashed. The type of > attacks Git has to worry about is very different from the length extension > attacks, and it is highly unlikely that that weakness of SHA-256 leads to, > say, a collision attack. What do the experts think or SHA512/256, which completely removes the concerns over length extension attack? (which I'd argue is better than sweeping them under the carpet) Mike
Re: [PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself
> On Thu, 15 Jun 2017, René Scharfe wrote: > Callers can opt out for %Z by passing NULL as timezone name. %z is > always handled internally -- this helps on Windows, where strftime would > expand it to a timezone name (same as %Z), in violation of POSIX. > Modifiers are not handled, e.g. %Ez is still passed to strftime. POSIX would also allow other things, like a field width: http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html $ date '+%8z' +200 (But I believe that's not very useful, and supporting it might require duplicating much of strftime's code.) > Changes from v1: > - Always handle %z internally. Minor nitpick: Shouldn't the comment in strbuf.h be updated to reflect that change? > + * Add the time specified by `tm`, as formatted by `strftime`. `tz_offset` > + * and `tz_name` are used to expand %z and %Z internally, unless `tz_name` > + * is NULL. `tz_offset` is in decimal hhmm format, e.g. -600 means six > + * hours west of Greenwich. Ulrich
[PATCH] checkout: don't write merge results into the object database
Merge results need to be written to the worktree, of course, but we don't necessarily need object entries for them, especially if they contain conflict markers. Use pretend_sha1_file() to create fake blobs to pass to make_cache_entry() and checkout_entry() instead. Signed-off-by: Rene Scharfe --- builtin/checkout.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 1624eed7e7..f51c15af9f 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -215,7 +215,7 @@ static int checkout_merged(int pos, const struct checkout *state) /* * NEEDSWORK: -* There is absolutely no reason to write this as a blob object +* There is absolutely no reason to build a fake blob object * and create a phony cache entry. This hack is primarily to get * to the write_entry() machinery that massages the contents to * work-tree format and writes out which only allows it for a @@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct checkout *state) * (it also writes the merge result to the object database even * when it may contain conflicts). */ - if (write_sha1_file(result_buf.ptr, result_buf.size, - blob_type, oid.hash)) + if (pretend_sha1_file(result_buf.ptr, result_buf.size, + OBJ_BLOB, oid.hash)) die(_("Unable to add merge result for '%s'"), path); free(result_buf.ptr); ce = make_cache_entry(mode, oid.hash, path, 2, 0); -- 2.13.0
Re: [BUG] add_again() off-by-one error in custom format
Am 15.06.2017 um 07:56 schrieb Jeff King: One interesting thing is that the cost of finding short hashes very much depends on your loose object setup. I timed: git log --format=%H >/dev/null versus git log --format=%h >/dev/null on git.git. It went from about 400ms to about 800ms. But then I noticed I had a lot of loose object directories, and ran "git gc --prune=now". Afterwards, my timings were more like 380ms and 460ms. The difference is that in the "before" case, we actually opened each directory and ran getdents(). But after gc, the directories are gone totally and open() fails. We also have to do a linear walk through the objects in each directory, since the contents are sorted. Do you mean "unsorted"? So I wonder if it is worth trying to optimize the short-sha1 computation in the first place. Double-%h aside, that would make _everything_ faster, including --oneline. Right. I'm not really sure how, though, short of caching the directory contents. That opens up questions of whether and when to invalidate the cache. If the cache were _just_ about short hashes, it might be OK to just assume that it remains valid through the length of the program (so worst case, a simultaneous write might mean that we generate a sha1 which just became ambiguous, but that's generally going to be racy anyway). The other downside of course is that we'd spend RAM on it. We could bound the size of the cache, I suppose. You mean like an in-memory pack index for loose objects? In order to avoid the readdir call in sha1_name.c::find_short_object_filename()? We'd only need to keep the hashes of found objects. An oid_array would be quite compact. Non-racy writes inside a process should not be ignored (write, read later) -- e.g. checkout does something like that. Can we trust object directory time stamps for cache invalidation? René
Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch
On Thu, Jun 15, 2017 at 4:42 AM, Jeff King wrote: > > On Thu, Jun 15, 2017 at 01:49:20PM +0530, Kaartic Sivaraam wrote: > > > What about, "not making any assumptions" about what the user would > > think when he views the output of `git status` ? Why not try some > > general messages like, > > > > * Staged changes > > * Unstaged changes > > > > instead of the messages such as > > > > * Changes to be committed, Changes already in the index > > * Changes not staged for commit, Changes not yet in the index > > > > which seem to make assumptions about the user who view the output ? > > Saying just "staged changes" is definitely accurate. I don't know if > some users would find that too terse, too. The phrase "not staged for > commit" gives more information if you don't know what "staged" means in > the Git world. Perhaps there should be a message pointing people at documentation explaining the index and staging terminology? Offhand, this is something I was wondering about the other day - has there ever been a discussion of what level of proficiency Git expects of its users? > -Peff
[BUG] GITK don't show unstaged changes
Details: https://github.com/git-for-windows/git/issues/1203 Version with bug: 2.13.1 Normal: 2.13.0 CCFelix
Re: [PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself
Am 15.06.2017 um 13:27 schrieb Ulrich Mueller: On Thu, 15 Jun 2017, René Scharfe wrote: Callers can opt out for %Z by passing NULL as timezone name. %z is always handled internally -- this helps on Windows, where strftime would expand it to a timezone name (same as %Z), in violation of POSIX. Modifiers are not handled, e.g. %Ez is still passed to strftime. POSIX would also allow other things, like a field width: http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html $ date '+%8z' +200 (But I believe that's not very useful, and supporting it might require duplicating much of strftime's code.) Windows doesn't support that (unsurprisingly), but it accepts %#z, which does the same as %z. Let's wait for someone to request support for modifiers and just document the behavior for now. Changes from v1: - Always handle %z internally. Minor nitpick: Shouldn't the comment in strbuf.h be updated to reflect that change? + * Add the time specified by `tm`, as formatted by `strftime`. `tz_offset` + * and `tz_name` are used to expand %z and %Z internally, unless `tz_name` + * is NULL. `tz_offset` is in decimal hhmm format, e.g. -600 means six + * hours west of Greenwich. Yes, it should. Thanks for paying attention! :) René
[PATCH v3] strbuf: let strbuf_addftime handle %z and %Z itself
There is no portable way to pass timezone information to strftime. Add parameters for timezone offset and name to strbuf_addftime and let it handle the timezone-related format specifiers %z and %Z internally. Callers can opt out for %Z by passing NULL as timezone name. %z is always handled internally -- this helps on Windows, where strftime would expand it to a timezone name (same as %Z), in violation of POSIX. Modifiers are not handled, e.g. %Ez is still passed to strftime. Use an empty string as timezone name in show_date (the only current caller) for now because we only have the timezone offset in non-local mode. POSIX allows %Z to resolve to an empty string in case of missing information. Helped-by: Ulrich Mueller Helped-by: Jeff King Signed-off-by: Rene Scharfe --- Changes from v3: - Updated developer documentation in strbuf.h. - Added short note to user documentation. Documentation/rev-list-options.txt | 3 ++- date.c | 2 +- strbuf.c | 41 ++ strbuf.h | 10 -- t/t0006-date.sh| 6 ++ 5 files changed, 54 insertions(+), 8 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index a46f70c2b1..34ae2553f1 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -768,7 +768,8 @@ timezone value. 1970). As with `--raw`, this is always in UTC and therefore `-local` has no effect. + -`--date=format:...` feeds the format `...` to your system `strftime`. +`--date=format:...` feeds the format `...` to your system `strftime`, +except for %z and %Z, which are handled internally. Use `--date=format:%c` to show the date in your system locale's preferred format. See the `strftime` manual for a complete list of format placeholders. When using `-local`, the correct syntax is diff --git a/date.c b/date.c index 63fa99685e..5580577334 100644 --- a/date.c +++ b/date.c @@ -246,7 +246,7 @@ const char *show_date(timestamp_t time, int tz, const struct date_mode *mode) month_names[tm->tm_mon], tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec, tz); else if (mode->type == DATE_STRFTIME) - strbuf_addftime(&timebuf, mode->strftime_fmt, tm); + strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, ""); else strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d", weekday_names[tm->tm_wday], diff --git a/strbuf.c b/strbuf.c index 00457940cf..be3b9e37b1 100644 --- a/strbuf.c +++ b/strbuf.c @@ -785,14 +785,48 @@ char *xstrfmt(const char *fmt, ...) return ret; } -void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm) +void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm, +int tz_offset, const char *tz_name) { + struct strbuf munged_fmt = STRBUF_INIT; size_t hint = 128; size_t len; if (!*fmt) return; + /* +* There is no portable way to pass timezone information to +* strftime, so we handle %z and %Z here. +*/ + for (;;) { + const char *percent = strchrnul(fmt, '%'); + strbuf_add(&munged_fmt, fmt, percent - fmt); + if (!*percent) + break; + fmt = percent + 1; + switch (*fmt) { + case '%': + strbuf_addstr(&munged_fmt, "%%"); + fmt++; + break; + case 'z': + strbuf_addf(&munged_fmt, "%+05d", tz_offset); + fmt++; + break; + case 'Z': + if (tz_name) { + strbuf_addstr(&munged_fmt, tz_name); + fmt++; + break; + } + /* FALLTHROUGH */ + default: + strbuf_addch(&munged_fmt, '%'); + } + } + fmt = munged_fmt.buf; + strbuf_grow(sb, hint); len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm); @@ -804,17 +838,16 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm) * output contains at least one character, and then drop the extra * character before returning. */ - struct strbuf munged_fmt = STRBUF_INIT; - strbuf_addf(&munged_fmt, "%s ", fmt); + strbuf_addch(&munged_fmt, ' '); while (!len) { hint *= 2; strbuf_grow(sb, hint); len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote: > On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote: > > Footnote *1*: SHA-256, as all hash functions whose output is essentially > > the entire internal state, are susceptible to a so-called "length > > extension attack", where the hash of a secret+message can be used to > > generate the hash of secret+message+piggyback without knowing the secret. > > This is not the case for Git: only visible data are hashed. The type of > > attacks Git has to worry about is very different from the length extension > > attacks, and it is highly unlikely that that weakness of SHA-256 leads to, > > say, a collision attack. > > What do the experts think or SHA512/256, which completely removes the > concerns over length extension attack? (which I'd argue is better than > sweeping them under the carpet) I don't think it's sweeping them under the carpet. Git does not use the hash as a MAC, so length extension attacks aren't a thing (and even if we later wanted to use the same algorithm as a MAC, the HMAC construction is a well-studied technique for dealing with it). That said, SHA-512 is typically a little faster than SHA-256 on 64-bit platforms. I don't know if that will change with the advent of hardware instructions oriented towards SHA-256. -Peff
Re: [BUG] git cherry-pick segfaults with local changes in working directory
On Thu, Jun 15, 2017 at 01:37:36PM +0300, Sebastian Dröge wrote: > > Note that the tarball doesn't have all the necessary objects. Its > > .git/objects/info/alternates points to another full clone of > > git://anongit.freedesktop.org/gstreamer/gst-plugins-good. > > Ah good to know, I thought this only happens if you clone with > --reference and not otherwise. If you do a local-filesystem clone of a repository with alternates, the clone will have the same alternates. So I'm guessing you may have done such a clone of your --reference repository as part of preparing the tarball. -Peff
Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch
On Thu, Jun 15, 2017 at 07:43:17AM -0400, Samuel Lijin wrote: > > Saying just "staged changes" is definitely accurate. I don't know if > > some users would find that too terse, too. The phrase "not staged for > > commit" gives more information if you don't know what "staged" means in > > the Git world. Oops, I meant to say "too terse, though". But it sounds like you got my meaning. > Perhaps there should be a message pointing people at documentation > explaining the index and staging terminology? Maybe. I wouldn't want this message to get too verbose. People see it a lot. There advice.statusHints message is already pretty verbose (though I turned it off myself years ago). > Offhand, this is something I was wondering about the other day - has > there ever been a discussion of what level of proficiency Git expects > of its users? There have been lots of discussions, but none that I can think of as definitive. I think the general strategy these days is to try to give hints via advise() for confusing situations, and to make it possible for expert users to turn those off. In general, I think using words from "git help glossary" is OK, but when we can use plainer language without loss of precision, that seems like a good idea. That's just my personal opinion, though. -Peff
Re: [BUG] add_again() off-by-one error in custom format
On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote: > > The difference is that in the "before" case, we actually opened each > > directory and ran getdents(). But after gc, the directories are gone > > totally and open() fails. We also have to do a linear walk through the > > objects in each directory, since the contents are sorted. > > Do you mean "unsorted"? Er yeah, sorry, I meant to say "aren't". > > I'm not really sure how, though, short of caching the directory > > contents. That opens up questions of whether and when to invalidate the > > cache. If the cache were _just_ about short hashes, it might be OK to > > just assume that it remains valid through the length of the program (so > > worst case, a simultaneous write might mean that we generate a sha1 > > which just became ambiguous, but that's generally going to be racy > > anyway). > > > > The other downside of course is that we'd spend RAM on it. We could > > bound the size of the cache, I suppose. > > You mean like an in-memory pack index for loose objects? In order to > avoid the readdir call in sha1_name.c::find_short_object_filename()? > We'd only need to keep the hashes of found objects. An oid_array > would be quite compact. Yes, that's what I was thinking of. > Non-racy writes inside a process should not be ignored (write, read > later) -- e.g. checkout does something like that. Ideally, yes. Though thinking on this more, it's racy even today, because we rely on the in-memory packed_git list. We usually re-check the directory only when we look for an object and it's missing. So any new packs which have been added while the process runs will be missed when doing short-sha1 lookups (and actually, find_short_packed_object does not even seem to do the usual retry-on-miss). A normal process like "checkout" isn't writing new packs, but a simultaneous repack could be migrating its new objects to a pack behind its back. (It also _can_ write packs, but only for large blobs). Given that we are talking about finding "unique" abbreviations here, and that those abbreviations can become invalidated immediately anyway as more objects are added (even by the same process), I'm not sure we need to strive for absolute accuracy. > Can we trust object directory time stamps for cache invalidation? I think it would work on POSIX-ish systems, since loose object changes always involve new files, not modifications of existing ones. I don't know if there are systems that don't update directory mtimes even for newly added files. That would still be a stat() per request. I'd be curious how the timing compares to the opendir/readdir that happens now. -Peff
Re: [PATCH v3] strbuf: let strbuf_addftime handle %z and %Z itself
On Thu, Jun 15, 2017 at 02:29:53PM +0200, René Scharfe wrote: > There is no portable way to pass timezone information to strftime. Add > parameters for timezone offset and name to strbuf_addftime and let it > handle the timezone-related format specifiers %z and %Z internally. > > Callers can opt out for %Z by passing NULL as timezone name. %z is > always handled internally -- this helps on Windows, where strftime would > expand it to a timezone name (same as %Z), in violation of POSIX. > Modifiers are not handled, e.g. %Ez is still passed to strftime. > > Use an empty string as timezone name in show_date (the only current > caller) for now because we only have the timezone offset in non-local > mode. POSIX allows %Z to resolve to an empty string in case of missing > information. > > Helped-by: Ulrich Mueller > Helped-by: Jeff King > Signed-off-by: Rene Scharfe > --- > Changes from v3: > - Updated developer documentation in strbuf.h. > - Added short note to user documentation. This looks good to me overall. > diff --git a/t/t0006-date.sh b/t/t0006-date.sh > index 42d4ea61ef..71082008f0 100755 > --- a/t/t0006-date.sh > +++ b/t/t0006-date.sh > @@ -51,6 +51,12 @@ check_show iso-local "$TIME" '2016-06-15 14:13:20 +' > check_show raw-local "$TIME" '146600 +' > check_show unix-local "$TIME" '146600' > > +check_show 'format:%z' "$TIME" '+0200' > +check_show 'format-local:%z' "$TIME" '+' > +check_show 'format:%Z' "$TIME" '' > +check_show 'format:%%z' "$TIME" '%z' > +check_show 'format-local:%%z' "$TIME" '%z' These check that the zone output is correct, but I don't think we ever check that the value we feed to strftime is actually in the correct zone in the first place (i.e., that %H shows the correct time). I think that should go in a separate test from the %z/%Z handling, as there are some subtleties. So here are two patches on top of yours: more tests, and then the format-local handling of %Z. [1/2]: t0006: check --date=format zone offsets [2/2]: date: use localtime() for "-local" time formats date.c | 14 -- t/t0006-date.sh | 10 -- 2 files changed, 20 insertions(+), 4 deletions(-) -Peff
[PATCH 1/2] t0006: check --date=format zone offsets
We already test that "%z" and "%Z" show the right thing, but we don't actually check that the time we display is the correct one. Let's add two new tests: 1. Test that "format:" shows the time in the author's timezone, just like the other time formats. 2. Test that "format-local:" shows time in the local timezone. We don't want to use our normal UTC for this, because its offset is zero (so the result would be "correct" even if the code forgot to apply the offset or applied it in the wrong direction). We'll use the EST5 zone, which is already used elsewhere in the script (and so is assumed to be available everywhere). Signed-off-by: Jeff King --- t/t0006-date.sh | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 71082008f..9f81bec7a 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -31,9 +31,11 @@ check_show () { format=$1 time=$2 expect=$3 - test_expect_success $4 "show date ($format:$time)" ' + prereqs=$4 + zone=$5 + test_expect_success $prereqs "show date ($format:$time)" ' echo "$time -> $expect" >expect && - test-date show:$format "$time" >actual && + TZ=${zone:-$TZ} test-date show:"$format" "$time" >actual && test_cmp expect actual ' } @@ -57,6 +59,9 @@ check_show 'format:%Z' "$TIME" '' check_show 'format:%%z' "$TIME" '%z' check_show 'format-local:%%z' "$TIME" '%z' +check_show 'format:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 16:13:20' +check_show 'format-local:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 09:13:20' '' EST5 + # arbitrary time absurdly far in the future FUTURE="5758122296 -0400" check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT -- 2.13.1.766.g6bea926c5
[PATCH 2/2] date: use localtime() for "-local" time formats
When we convert seconds-since-epochs timestamps into a broken-down "struct tm", we do so by adjusting the timestamp according to the known offset and then using gmtime() to break down the result. This means that the resulting struct "knows" that it's in GMT, even though the time it represents is adjusted for a different zone. The fields where it stores this data are not portably accessible, so we have no way to override them to tell them the real zone info. For the most part, this works. Our date-formatting routines don't pay attention to these inaccessible fields, and use the same tz info we provided for adjustment. The one exception is when we call strftime(), whose %Z format reveals this hidden timezone data. We solved that by always showing the empty string for %Z. This is allowed by POSIX, but not very helpful to the user. We can't make this work in the general case, as there's no portable function for setting an arbitrary timezone (and anyway, we don't have the zone name for the author zones, only their offsets). But for the special case of the "-local" formats, we can just skip the adjustment and use localtime() instead of gmtime(). This makes --date=format-local:%Z work correctly, showing the local timezone instead of an empty string. The new test checks the result for "UTC", our default test-lib value for $TZ. Using something like EST5 might be more interesting, but the actual zone string is system-dependent (for instance, on my system it expands to just EST). Hopefully "UTC" is vanilla enough that every system treats it the same. Signed-off-by: Jeff King --- I don't have a Windows system to test this on, but from the output Dscho provided earlier, I believe this should pass. date.c | 14 -- t/t0006-date.sh | 1 + 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/date.c b/date.c index 558057733..1fd6d6637 100644 --- a/date.c +++ b/date.c @@ -70,6 +70,12 @@ static struct tm *time_to_tm(timestamp_t time, int tz) return gmtime(&t); } +static struct tm *time_to_tm_local(timestamp_t time) +{ + time_t t = time; + return localtime(&t); +} + /* * What value of "tz" was in effect back then at "time" in the * local timezone? @@ -214,7 +220,10 @@ const char *show_date(timestamp_t time, int tz, const struct date_mode *mode) return timebuf.buf; } - tm = time_to_tm(time, tz); + if (mode->local) + tm = time_to_tm_local(time); + else + tm = time_to_tm(time, tz); if (!tm) { tm = time_to_tm(0, 0); tz = 0; @@ -246,7 +255,8 @@ const char *show_date(timestamp_t time, int tz, const struct date_mode *mode) month_names[tm->tm_mon], tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec, tz); else if (mode->type == DATE_STRFTIME) - strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, ""); + strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, + mode->local ? NULL : ""); else strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d", weekday_names[tm->tm_wday], diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 9f81bec7a..7ac9466d5 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -56,6 +56,7 @@ check_show unix-local "$TIME" '146600' check_show 'format:%z' "$TIME" '+0200' check_show 'format-local:%z' "$TIME" '+' check_show 'format:%Z' "$TIME" '' +check_show 'format-local:%Z' "$TIME" 'UTC' check_show 'format:%%z' "$TIME" '%z' check_show 'format-local:%%z' "$TIME" '%z' -- 2.13.1.766.g6bea926c5
Re: [PATCH] checkout: don't write merge results into the object database
On Thu, Jun 15, 2017 at 01:33:42PM +0200, René Scharfe wrote: > Merge results need to be written to the worktree, of course, but we > don't necessarily need object entries for them, especially if they > contain conflict markers. Use pretend_sha1_file() to create fake > blobs to pass to make_cache_entry() and checkout_entry() instead. Conceptually this makes sense, although I have a comment below. Out of curiosity, did this come up when looking into the cherry-pick segfault/error from a few hours ago? > @@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct checkout > *state) >* (it also writes the merge result to the object database even >* when it may contain conflicts). >*/ > - if (write_sha1_file(result_buf.ptr, result_buf.size, > - blob_type, oid.hash)) > + if (pretend_sha1_file(result_buf.ptr, result_buf.size, > + OBJ_BLOB, oid.hash)) > die(_("Unable to add merge result for '%s'"), path); > free(result_buf.ptr); I wondered if pretend_sha1_file() makes a copy of the buffer, and indeed it does. So this is correct. But that raises an interesting question: how big are these objects, and is it a good idea to store them in RAM? Obviously they're in RAM already, but I'm not sure if it's one-at-a-time. We could be bumping the peak RAM used if there's a large number of these entries. Touching the on-disk odb does feel hacky, but it also means they behave like other objects. If it is a concern, I think it could be solved by "unpretending" after our call to checkout_entry completes. That would need a new call in sha1_file.c, but it should be easy to write. -Peff
Re: [PATCH] sub-process: fix comment about api-sub-process.txt
On 6/14/2017 2:26 PM, Jonathan Nieder wrote: Christian Couder wrote: Subject: sub-process: fix comment about api-sub-process.txt nit: this one-line description doesn't describe what was wrong and is being fixed. I think something like sub-process: correct path to API docs in comment Looks good to me. Thanks for finding/fixing this. would be easier to understand in "git log" output. Signed-off-by: Christian Couder --- sub-process.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) With or without such a tweak, Reviewed-by: Jonathan Nieder diff --git a/sub-process.h b/sub-process.h index 7d451e1cde..d9a45cd359 100644 --- a/sub-process.h +++ b/sub-process.h @@ -7,7 +7,7 @@ /* * Generic implementation of background process infrastructure. - * See Documentation/technical/api-background-process.txt. + * See: Documentation/technical/api-sub-process.txt */ /* data structures */
[PATCH 00/28] Create a reference backend for packed refs
This patch series continues the saga of picking apart the code for handling packed references from the code for handling loose references, all in preparation for making big changes to how the packed-ref reading and writing works as described in [1]. As a reminder, the final goal is to read the "packed-refs" file using mmap, parsing it on the fly instead of storing it into an in-memory `ref_cache`, and to read and parse only the parts of the file that are actually needed, giving a big speedup for many operations in repositories that have lots of refs. In this episode, we create a `packed_ref_store` class, implementing part of the `ref_store` API, that represents the packed references within a repository. The `files_ref_store` now contains an instance of `packed_ref_store` and delegates to it for the operations that have to touch the packed refs. After this patch series, `packed_ref_store` supports: * Iteration * `peel_ref` * `pack_refs` (they're already packed, so it's a NOOP) * `read_raw_ref` A future patch series will add support for: * Reference transactions (`transaction_prepare`, `transaction_finish`, `transaction_abort`, `initial_transaction_commit`) * `delete_refs` Operations that `packed_ref_store` will probably never support: * `create_symref` * `rename_ref` (could be supported, but is probably not useful) * Reflog-related operations In addition, all of the packed-refs related code has been moved to a new module, "refs/packed-backend.{c,h}". This includes some functions that are still called by `files_ref_store` directly to update the packed refs. The patch series is long, but I think relatively straightforward. In particular, patches 2-14 are quite mechanical. Its main point is to separate concerns, but it does bring one end-user advantage: if there is a problem parsing the "packed-refs" file, we now report an error and die. The old code just ignored lines that it didn't understand. I've developed these patches on top of master plus the following patches, which are followups to mh/packed-refs-store-prep: * lock_packed_refs(): fix cache validity check * for_each_bisect_ref(): don't trim refnames The patches can also be obtained from my GitHub fork [2] as branch "packed-ref-store". Michael [1] http://public-inbox.org/git/cover.1490026594.git.mhag...@alum.mit.edu/ [2] https://github.com/mhagger/git Michael Haggerty (28): add_packed_ref(): teach function to overwrite existing refs packed_ref_store: new struct packed_ref_store: move `packed_refs_path` here packed_ref_store: move `packed_refs_lock` member here clear_packed_ref_cache(): take a `packed_ref_store *` parameter validate_packed_ref_cache(): take a `packed_ref_store *` parameter get_packed_ref_cache(): take a `packed_ref_store *` parameter get_packed_refs(): take a `packed_ref_store *` parameter add_packed_ref(): take a `packed_ref_store *` parameter lock_packed_refs(): take a `packed_ref_store *` parameter commit_packed_refs(): take a `packed_ref_store *` parameter rollback_packed_refs(): take a `packed_ref_store *` parameter get_packed_ref(): take a `packed_ref_store *` parameter repack_without_refs(): take a `packed_ref_store *` parameter packed_peel_ref(): new function, extracted from `files_peel_ref()` packed_ref_store: support iteration packed_read_raw_ref(): new function, replacing `resolve_packed_ref()` packed-backend: new module for handling packed references packed_ref_store: make class into a subclass of `ref_store` commit_packed_refs(): report errors rather than dying commit_packed_refs(): use a staging file separate from the lockfile packed_refs_lock(): function renamed from lock_packed_refs() packed_refs_lock(): report errors via a `struct strbuf *err` packed_refs_unlock(), packed_refs_is_locked(): new functions clear_packed_ref_cache(): don't protest if the lock is held commit_packed_refs(): remove call to `packed_refs_unlock()` repack_without_refs(): don't lock or unlock the packed refs read_packed_refs(): die if `packed-refs` contains bogus data Makefile | 1 + refs.c| 18 ++ refs/files-backend.c | 619 --- refs/packed-backend.c | 868 ++ refs/packed-backend.h | 25 ++ refs/refs-internal.h | 10 + 6 files changed, 981 insertions(+), 560 deletions(-) create mode 100644 refs/packed-backend.c create mode 100644 refs/packed-backend.h -- 2.11.0
[PATCH 01/28] add_packed_ref(): teach function to overwrite existing refs
Teach `add_packed_ref()` to overwrite an existing entry if one already exists for the specified `refname`. This means that we can call it from `files_pack_refs()`, thereby reducing the amount that the latter function needs to know about the internals of packed-reference handling. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 40 ++-- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index b040bb3b0a..87cecde231 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -413,15 +413,16 @@ static struct ref_dir *get_packed_refs(struct files_ref_store *refs) } /* - * Add a reference to the in-memory packed reference cache. This may - * only be called while the packed-refs file is locked (see - * lock_packed_refs()). To actually write the packed-refs file, call - * commit_packed_refs(). + * Add or overwrite a reference in the in-memory packed reference + * cache. This may only be called while the packed-refs file is locked + * (see lock_packed_refs()). To actually write the packed-refs file, + * call commit_packed_refs(). */ static void add_packed_ref(struct files_ref_store *refs, const char *refname, const struct object_id *oid) { - struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); + struct ref_dir *packed_refs; + struct ref_entry *packed_entry; if (!is_lock_file_locked(&refs->packed_refs_lock)) die("BUG: packed refs not locked"); @@ -429,8 +430,17 @@ static void add_packed_ref(struct files_ref_store *refs, if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) die("Reference has invalid format: '%s'", refname); - add_ref_entry(get_packed_ref_dir(packed_ref_cache), - create_ref_entry(refname, oid, REF_ISPACKED)); + packed_refs = get_packed_refs(refs); + packed_entry = find_ref_entry(packed_refs, refname); + if (packed_entry) { + /* Overwrite the existing entry: */ + oidcpy(&packed_entry->u.value.oid, oid); + packed_entry->flag = REF_ISPACKED; + oidclr(&packed_entry->u.value.peeled); + } else { + packed_entry = create_ref_entry(refname, oid, REF_ISPACKED); + add_ref_entry(packed_refs, packed_entry); + } } /* @@ -1526,12 +1536,10 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB, "pack_refs"); struct ref_iterator *iter; - struct ref_dir *packed_refs; int ok; struct ref_to_prune *refs_to_prune = NULL; lock_packed_refs(refs, LOCK_DIE_ON_ERROR); - packed_refs = get_packed_refs(refs); iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0); while ((ok = ref_iterator_advance(iter)) == ITER_OK) { @@ -1540,8 +1548,6 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) * in the packed ref cache. If the reference should be * pruned, also add it to refs_to_prune. */ - struct ref_entry *packed_entry; - if (!should_pack_ref(iter->refname, iter->oid, iter->flags, flags)) continue; @@ -1552,17 +1558,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) * we don't copy the peeled status, because we want it * to be re-peeled. */ - packed_entry = find_ref_entry(packed_refs, iter->refname); - if (packed_entry) { - /* Overwrite existing packed entry with info from loose entry */ - packed_entry->flag = REF_ISPACKED; - oidcpy(&packed_entry->u.value.oid, iter->oid); - } else { - packed_entry = create_ref_entry(iter->refname, iter->oid, - REF_ISPACKED); - add_ref_entry(packed_refs, packed_entry); - } - oidclr(&packed_entry->u.value.peeled); + add_packed_ref(refs, iter->refname, iter->oid); /* Schedule the loose reference for pruning if requested. */ if ((flags & PACK_REFS_PRUNE)) { -- 2.11.0
[PATCH 03/28] packed_ref_store: move `packed_refs_path` here
Move `packed_refs_path` from `files_ref_store` to `packed_ref_store`, and rename it to `path` since its meaning is clear from its new context. Inline `files_packed_refs_path()`. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 2efb71cee9..c4b8e2f63b 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -54,6 +54,9 @@ struct packed_ref_cache { struct packed_ref_store { unsigned int store_flags; + /* The path of the "packed-refs" file: */ + char *path; + /* * A cache of the values read from the `packed-refs` file, if * it might still be current; otherwise, NULL. @@ -61,11 +64,13 @@ struct packed_ref_store { struct packed_ref_cache *cache; }; -static struct packed_ref_store *packed_ref_store_create(unsigned int store_flags) +static struct packed_ref_store *packed_ref_store_create( + const char *path, unsigned int store_flags) { struct packed_ref_store *refs = xcalloc(1, sizeof(*refs)); refs->store_flags = store_flags; + refs->path = xstrdup(path); return refs; } @@ -79,7 +84,6 @@ struct files_ref_store { char *gitdir; char *gitcommondir; - char *packed_refs_path; struct ref_cache *loose; @@ -154,8 +158,8 @@ static struct ref_store *files_ref_store_create(const char *gitdir, get_common_dir_noenv(&sb, gitdir); refs->gitcommondir = strbuf_detach(&sb, NULL); strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir); - refs->packed_refs_path = strbuf_detach(&sb, NULL); - refs->packed_ref_store = packed_ref_store_create(flags); + refs->packed_ref_store = packed_ref_store_create(sb.buf, flags); + strbuf_release(&sb); return ref_store; } @@ -343,11 +347,6 @@ static struct packed_ref_cache *read_packed_refs(const char *packed_refs_file) return packed_refs; } -static const char *files_packed_refs_path(struct files_ref_store *refs) -{ - return refs->packed_refs_path; -} - static void files_reflog_path(struct files_ref_store *refs, struct strbuf *sb, const char *refname) @@ -401,7 +400,7 @@ static void validate_packed_ref_cache(struct files_ref_store *refs) { if (refs->packed_ref_store->cache && !stat_validity_check(&refs->packed_ref_store->cache->validity, -files_packed_refs_path(refs))) +refs->packed_ref_store->path)) clear_packed_ref_cache(refs); } @@ -415,7 +414,7 @@ static void validate_packed_ref_cache(struct files_ref_store *refs) */ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *refs) { - const char *packed_refs_file = files_packed_refs_path(refs); + const char *packed_refs_file = refs->packed_ref_store->path; if (!is_lock_file_locked(&refs->packed_refs_lock)) validate_packed_ref_cache(refs); @@ -1352,7 +1351,7 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) } if (hold_lock_file_for_update_timeout( - &refs->packed_refs_lock, files_packed_refs_path(refs), + &refs->packed_refs_lock, refs->packed_ref_store->path, flags, timeout_value) < 0) return -1; @@ -1633,7 +1632,7 @@ static int repack_without_refs(struct files_ref_store *refs, return 0; /* no refname exists in packed refs */ if (lock_packed_refs(refs, 0)) { - unable_to_lock_message(files_packed_refs_path(refs), errno, err); + unable_to_lock_message(refs->packed_ref_store->path, errno, err); return -1; } packed = get_packed_refs(refs); -- 2.11.0
[PATCH 07/28] get_packed_ref_cache(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 26 -- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index f061506bf0..b2ef7b3bb9 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -404,24 +404,22 @@ static void validate_packed_ref_cache(struct packed_ref_store *refs) } /* - * Get the packed_ref_cache for the specified files_ref_store, + * Get the packed_ref_cache for the specified packed_ref_store, * creating and populating it if it hasn't been read before or if the * file has been changed (according to its `validity` field) since it * was last read. On the other hand, if we hold the lock, then assume * that the file hasn't been changed out from under us, so skip the * extra `stat()` call in `stat_validity_check()`. */ -static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *refs) +static struct packed_ref_cache *get_packed_ref_cache(struct packed_ref_store *refs) { - const char *packed_refs_file = refs->packed_ref_store->path; + if (!is_lock_file_locked(&refs->lock)) + validate_packed_ref_cache(refs); - if (!is_lock_file_locked(&refs->packed_ref_store->lock)) - validate_packed_ref_cache(refs->packed_ref_store); - - if (!refs->packed_ref_store->cache) - refs->packed_ref_store->cache = read_packed_refs(packed_refs_file); + if (!refs->cache) + refs->cache = read_packed_refs(refs->path); - return refs->packed_ref_store->cache; + return refs->cache; } static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache *packed_ref_cache) @@ -431,7 +429,7 @@ static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache *packed_ref_ca static struct ref_dir *get_packed_refs(struct files_ref_store *refs) { - return get_packed_ref_dir(get_packed_ref_cache(refs)); + return get_packed_ref_dir(get_packed_ref_cache(refs->packed_ref_store)); } /* @@ -1151,7 +1149,7 @@ static struct ref_iterator *files_ref_iterator_begin( loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), prefix, 1); - iter->packed_ref_cache = get_packed_ref_cache(refs); + iter->packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store); acquire_packed_ref_cache(iter->packed_ref_cache); packed_iter = cache_ref_iterator_begin(iter->packed_ref_cache->cache, prefix, 0); @@ -1365,7 +1363,7 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) */ validate_packed_ref_cache(refs->packed_ref_store); - packed_ref_cache = get_packed_ref_cache(refs); + packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store); /* Increment the reference count to prevent it from being freed: */ acquire_packed_ref_cache(packed_ref_cache); return 0; @@ -1380,7 +1378,7 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) static int commit_packed_refs(struct files_ref_store *refs) { struct packed_ref_cache *packed_ref_cache = - get_packed_ref_cache(refs); + get_packed_ref_cache(refs->packed_ref_store); int ok, error = 0; int save_errno = 0; FILE *out; @@ -1426,7 +1424,7 @@ static int commit_packed_refs(struct files_ref_store *refs) static void rollback_packed_refs(struct files_ref_store *refs) { struct packed_ref_cache *packed_ref_cache = - get_packed_ref_cache(refs); + get_packed_ref_cache(refs->packed_ref_store); files_assert_main_repository(refs, "rollback_packed_refs"); -- 2.11.0
[PATCH 09/28] add_packed_ref(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index bc5c0de84e..4943207098 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -438,19 +438,19 @@ static struct ref_dir *get_packed_refs(struct packed_ref_store *refs) * (see lock_packed_refs()). To actually write the packed-refs file, * call commit_packed_refs(). */ -static void add_packed_ref(struct files_ref_store *refs, +static void add_packed_ref(struct packed_ref_store *refs, const char *refname, const struct object_id *oid) { struct ref_dir *packed_refs; struct ref_entry *packed_entry; - if (!is_lock_file_locked(&refs->packed_ref_store->lock)) + if (!is_lock_file_locked(&refs->lock)) die("BUG: packed refs not locked"); if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) die("Reference has invalid format: '%s'", refname); - packed_refs = get_packed_refs(refs->packed_ref_store); + packed_refs = get_packed_refs(refs); packed_entry = find_ref_entry(packed_refs, refname); if (packed_entry) { /* Overwrite the existing entry: */ @@ -1579,7 +1579,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) * we don't copy the peeled status, because we want it * to be re-peeled. */ - add_packed_ref(refs, iter->refname, iter->oid); + add_packed_ref(refs->packed_ref_store, iter->refname, iter->oid); /* Schedule the loose reference for pruning if requested. */ if ((flags & PACK_REFS_PRUNE)) { @@ -3210,7 +3210,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, if ((update->flags & REF_HAVE_NEW) && !is_null_oid(&update->new_oid)) - add_packed_ref(refs, update->refname, + add_packed_ref(refs->packed_ref_store, update->refname, &update->new_oid); } -- 2.11.0
[PATCH 08/28] get_packed_refs(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index b2ef7b3bb9..bc5c0de84e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -427,9 +427,9 @@ static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache *packed_ref_ca return get_ref_dir(packed_ref_cache->cache->root); } -static struct ref_dir *get_packed_refs(struct files_ref_store *refs) +static struct ref_dir *get_packed_refs(struct packed_ref_store *refs) { - return get_packed_ref_dir(get_packed_ref_cache(refs->packed_ref_store)); + return get_packed_ref_dir(get_packed_ref_cache(refs)); } /* @@ -450,7 +450,7 @@ static void add_packed_ref(struct files_ref_store *refs, if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) die("Reference has invalid format: '%s'", refname); - packed_refs = get_packed_refs(refs); + packed_refs = get_packed_refs(refs->packed_ref_store); packed_entry = find_ref_entry(packed_refs, refname); if (packed_entry) { /* Overwrite the existing entry: */ @@ -592,7 +592,7 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs) static struct ref_entry *get_packed_ref(struct files_ref_store *refs, const char *refname) { - return find_ref_entry(get_packed_refs(refs), refname); + return find_ref_entry(get_packed_refs(refs->packed_ref_store), refname); } /* @@ -1633,7 +1633,7 @@ static int repack_without_refs(struct files_ref_store *refs, unable_to_lock_message(refs->packed_ref_store->path, errno, err); return -1; } - packed = get_packed_refs(refs); + packed = get_packed_refs(refs->packed_ref_store); /* Remove refnames from the cache */ for_each_string_list_item(refname, refnames) -- 2.11.0
[PATCH 04/28] packed_ref_store: move `packed_refs_lock` member here
Move the `packed_refs_lock` member from `files_ref_store` to `packed_ref_store`, and rename it to `lock` since it's now more obvious what it is locking. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index c4b8e2f63b..de8293493f 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -62,6 +62,12 @@ struct packed_ref_store { * it might still be current; otherwise, NULL. */ struct packed_ref_cache *cache; + + /* +* Lock used for the "packed-refs" file. Note that this (and +* thus the enclosing `packed_ref_store`) must not be freed. +*/ + struct lock_file lock; }; static struct packed_ref_store *packed_ref_store_create( @@ -87,12 +93,6 @@ struct files_ref_store { struct ref_cache *loose; - /* -* Lock used for the "packed-refs" file. Note that this (and -* thus the enclosing `files_ref_store`) must not be freed. -*/ - struct lock_file packed_refs_lock; - struct packed_ref_store *packed_ref_store; }; @@ -125,7 +125,7 @@ static void clear_packed_ref_cache(struct files_ref_store *refs) if (refs->packed_ref_store->cache) { struct packed_ref_cache *packed_refs = refs->packed_ref_store->cache; - if (is_lock_file_locked(&refs->packed_refs_lock)) + if (is_lock_file_locked(&refs->packed_ref_store->lock)) die("BUG: packed-ref cache cleared while locked"); refs->packed_ref_store->cache = NULL; release_packed_ref_cache(packed_refs); @@ -416,7 +416,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *ref { const char *packed_refs_file = refs->packed_ref_store->path; - if (!is_lock_file_locked(&refs->packed_refs_lock)) + if (!is_lock_file_locked(&refs->packed_ref_store->lock)) validate_packed_ref_cache(refs); if (!refs->packed_ref_store->cache) @@ -447,7 +447,7 @@ static void add_packed_ref(struct files_ref_store *refs, struct ref_dir *packed_refs; struct ref_entry *packed_entry; - if (!is_lock_file_locked(&refs->packed_refs_lock)) + if (!is_lock_file_locked(&refs->packed_ref_store->lock)) die("BUG: packed refs not locked"); if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) @@ -1351,7 +1351,8 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) } if (hold_lock_file_for_update_timeout( - &refs->packed_refs_lock, refs->packed_ref_store->path, + &refs->packed_ref_store->lock, + refs->packed_ref_store->path, flags, timeout_value) < 0) return -1; @@ -1388,10 +1389,10 @@ static int commit_packed_refs(struct files_ref_store *refs) files_assert_main_repository(refs, "commit_packed_refs"); - if (!is_lock_file_locked(&refs->packed_refs_lock)) + if (!is_lock_file_locked(&refs->packed_ref_store->lock)) die("BUG: packed-refs not locked"); - out = fdopen_lock_file(&refs->packed_refs_lock, "w"); + out = fdopen_lock_file(&refs->packed_ref_store->lock, "w"); if (!out) die_errno("unable to fdopen packed-refs descriptor"); @@ -1409,7 +1410,7 @@ static int commit_packed_refs(struct files_ref_store *refs) if (ok != ITER_DONE) die("error while iterating over references"); - if (commit_lock_file(&refs->packed_refs_lock)) { + if (commit_lock_file(&refs->packed_ref_store->lock)) { save_errno = errno; error = -1; } @@ -1430,9 +1431,9 @@ static void rollback_packed_refs(struct files_ref_store *refs) files_assert_main_repository(refs, "rollback_packed_refs"); - if (!is_lock_file_locked(&refs->packed_refs_lock)) + if (!is_lock_file_locked(&refs->packed_ref_store->lock)) die("BUG: packed-refs not locked"); - rollback_lock_file(&refs->packed_refs_lock); + rollback_lock_file(&refs->packed_ref_store->lock); release_packed_ref_cache(packed_ref_cache); clear_packed_ref_cache(refs); } -- 2.11.0
[PATCH 16/28] packed_ref_store: support iteration
Add the infrastructure to iterate over a `packed_ref_store`. It's a lot of boilerplate, but it's all part of a campaign to make `packed_ref_store` implement `ref_store`. In the future, this iterator will work much differently. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 107 ++- 1 file changed, 98 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 185d05e1d6..e57cdeba36 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1062,10 +1062,102 @@ static int files_peel_ref(struct ref_store *ref_store, return peel_object(base, sha1); } +struct packed_ref_iterator { + struct ref_iterator base; + + struct packed_ref_cache *cache; + struct ref_iterator *iter0; + unsigned int flags; +}; + +static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator) +{ + struct packed_ref_iterator *iter = + (struct packed_ref_iterator *)ref_iterator; + int ok; + + while ((ok = ref_iterator_advance(iter->iter0)) == ITER_OK) { + if (iter->flags & DO_FOR_EACH_PER_WORKTREE_ONLY && + ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE) + continue; + + if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) && + !ref_resolves_to_object(iter->iter0->refname, + iter->iter0->oid, + iter->iter0->flags)) + continue; + + iter->base.refname = iter->iter0->refname; + iter->base.oid = iter->iter0->oid; + iter->base.flags = iter->iter0->flags; + return ITER_OK; + } + + iter->iter0 = NULL; + if (ref_iterator_abort(ref_iterator) != ITER_DONE) + ok = ITER_ERROR; + + return ok; +} + +static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + struct packed_ref_iterator *iter = + (struct packed_ref_iterator *)ref_iterator; + + return ref_iterator_peel(iter->iter0, peeled); +} + +static int packed_ref_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct packed_ref_iterator *iter = + (struct packed_ref_iterator *)ref_iterator; + int ok = ITER_DONE; + + if (iter->iter0) + ok = ref_iterator_abort(iter->iter0); + + release_packed_ref_cache(iter->cache); + base_ref_iterator_free(ref_iterator); + return ok; +} + +static struct ref_iterator_vtable packed_ref_iterator_vtable = { + packed_ref_iterator_advance, + packed_ref_iterator_peel, + packed_ref_iterator_abort +}; + +static struct ref_iterator *packed_ref_iterator_begin( + struct packed_ref_store *refs, + const char *prefix, unsigned int flags) +{ + struct packed_ref_iterator *iter; + struct ref_iterator *ref_iterator; + + iter = xcalloc(1, sizeof(*iter)); + ref_iterator = &iter->base; + base_ref_iterator_init(ref_iterator, &packed_ref_iterator_vtable); + + /* +* Note that get_packed_ref_cache() internally checks whether +* the packed-ref cache is up to date with what is on disk, +* and re-reads it if not. +*/ + + iter->cache = get_packed_ref_cache(refs); + acquire_packed_ref_cache(iter->cache); + iter->iter0 = cache_ref_iterator_begin(iter->cache->cache, prefix, 0); + + iter->flags = flags; + + return ref_iterator; +} + struct files_ref_iterator { struct ref_iterator base; - struct packed_ref_cache *packed_ref_cache; struct ref_iterator *iter0; unsigned int flags; }; @@ -1118,7 +1210,6 @@ static int files_ref_iterator_abort(struct ref_iterator *ref_iterator) if (iter->iter0) ok = ref_iterator_abort(iter->iter0); - release_packed_ref_cache(iter->packed_ref_cache); base_ref_iterator_free(ref_iterator); return ok; } @@ -1160,18 +1251,16 @@ static struct ref_iterator *files_ref_iterator_begin( * (If they've already been read, that's OK; we only need to * guarantee that they're read before the packed refs, not * *how much* before.) After that, we call -* get_packed_ref_cache(), which internally checks whether the -* packed-ref cache is up to date with what is on disk, and -* re-reads it if not. +* packed_ref_iterator_begin(), which internally checks +* whether the packed-ref cache is up to date with what is on +* disk, and re-reads it if not. */ loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), prefix, 1); - iter->packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store); -
[PATCH 10/28] lock_packed_refs(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 31 ++- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 4943207098..0d8f39089f 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -80,6 +80,19 @@ static struct packed_ref_store *packed_ref_store_create( return refs; } +/* + * Die if refs is not the main ref store. caller is used in any + * necessary error messages. + */ +static void packed_assert_main_repository(struct packed_ref_store *refs, + const char *caller) +{ + if (refs->store_flags & REF_STORE_MAIN) + return; + + die("BUG: operation %s only allowed for main ref store", caller); +} + /* * Future: need to be in "struct repository" * when doing a full libification. @@ -1334,13 +1347,13 @@ static void write_packed_entry(FILE *fh, const char *refname, * hold_lock_file_for_update(). Return 0 on success. On errors, set * errno appropriately and return a nonzero value. */ -static int lock_packed_refs(struct files_ref_store *refs, int flags) +static int lock_packed_refs(struct packed_ref_store *refs, int flags) { static int timeout_configured = 0; static int timeout_value = 1000; struct packed_ref_cache *packed_ref_cache; - files_assert_main_repository(refs, "lock_packed_refs"); + packed_assert_main_repository(refs, "lock_packed_refs"); if (!timeout_configured) { git_config_get_int("core.packedrefstimeout", &timeout_value); @@ -1348,8 +1361,8 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) } if (hold_lock_file_for_update_timeout( - &refs->packed_ref_store->lock, - refs->packed_ref_store->path, + &refs->lock, + refs->path, flags, timeout_value) < 0) return -1; @@ -1361,9 +1374,9 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) * cache is still valid. We've just locked the file, but it * might have changed the moment *before* we locked it. */ - validate_packed_ref_cache(refs->packed_ref_store); + validate_packed_ref_cache(refs); - packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store); + packed_ref_cache = get_packed_ref_cache(refs); /* Increment the reference count to prevent it from being freed: */ acquire_packed_ref_cache(packed_ref_cache); return 0; @@ -1560,7 +1573,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) int ok; struct ref_to_prune *refs_to_prune = NULL; - lock_packed_refs(refs, LOCK_DIE_ON_ERROR); + lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR); iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0); while ((ok = ref_iterator_advance(iter)) == ITER_OK) { @@ -1629,7 +1642,7 @@ static int repack_without_refs(struct files_ref_store *refs, if (!needs_repacking) return 0; /* no refname exists in packed refs */ - if (lock_packed_refs(refs, 0)) { + if (lock_packed_refs(refs->packed_ref_store, 0)) { unable_to_lock_message(refs->packed_ref_store->path, errno, err); return -1; } @@ -3198,7 +3211,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, } } - if (lock_packed_refs(refs, 0)) { + if (lock_packed_refs(refs->packed_ref_store, 0)) { strbuf_addf(err, "unable to lock packed-refs file: %s", strerror(errno)); ret = TRANSACTION_GENERIC_ERROR; -- 2.11.0
[PATCH 15/28] packed_peel_ref(): new function, extracted from `files_peel_ref()`
This will later become a method of `packed_ref_store`. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index c206791b91..185d05e1d6 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1013,6 +1013,18 @@ static int lock_raw_ref(struct files_ref_store *refs, return ret; } +static int packed_peel_ref(struct packed_ref_store *refs, + const char *refname, unsigned char *sha1) +{ + struct ref_entry *r = get_packed_ref(refs, refname); + + if (!r || peel_entry(r, 0)) + return -1; + + hashcpy(sha1, r->u.value.peeled.hash); + return 0; +} + static int files_peel_ref(struct ref_store *ref_store, const char *refname, unsigned char *sha1) { @@ -1043,17 +1055,9 @@ static int files_peel_ref(struct ref_store *ref_store, * be expensive and (b) loose references anyway usually do not * have REF_KNOWS_PEELED. */ - if (flag & REF_ISPACKED) { - struct ref_entry *r = - get_packed_ref(refs->packed_ref_store, refname); - - if (r) { - if (peel_entry(r, 0)) - return -1; - hashcpy(sha1, r->u.value.peeled.hash); - return 0; - } - } + if (flag & REF_ISPACKED && + !packed_peel_ref(refs->packed_ref_store, refname, sha1)) + return 0; return peel_object(base, sha1); } -- 2.11.0
[PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile
We will want to be able to hold the lockfile for `packed-refs` even after we have activated the new values. So use a separate tempfile, `packed-refs.new`, as a place to stage the new contents of the `packed-refs` file. For now this is all done within `commit_packed_refs()`, but that will change shortly. Signed-off-by: Michael Haggerty --- refs/packed-backend.c | 40 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 5bee49d497..6619052e96 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -68,6 +68,13 @@ struct packed_ref_store { * thus the enclosing `packed_ref_store`) must not be freed. */ struct lock_file lock; + + /* +* Temporary file used when rewriting new contents to the +* "packed-refs" file. Note that this (and thus the enclosing +* `packed_ref_store`) must not be freed. +*/ + struct tempfile tempfile; }; struct ref_store *packed_ref_store_create(const char *path, @@ -522,10 +529,16 @@ int lock_packed_refs(struct ref_store *ref_store, int flags) timeout_configured = 1; } + /* +* Note that we close the lockfile immediately because we +* don't write new content to it, but rather to a separate +* tempfile. +*/ if (hold_lock_file_for_update_timeout( &refs->lock, refs->path, - flags, timeout_value) < 0) + flags, timeout_value) < 0 || + close_lock_file(&refs->lock)) return -1; /* @@ -567,13 +580,23 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) get_packed_ref_cache(refs); int ok; int ret = -1; + struct strbuf sb = STRBUF_INIT; FILE *out; struct ref_iterator *iter; if (!is_lock_file_locked(&refs->lock)) die("BUG: commit_packed_refs() called when unlocked"); - out = fdopen_lock_file(&refs->lock, "w"); + strbuf_addf(&sb, "%s.new", refs->path); + if (create_tempfile(&refs->tempfile, sb.buf) < 0) { + strbuf_addf(err, "unable to create file %s: %s", + sb.buf, strerror(errno)); + strbuf_release(&sb); + goto out; + } + strbuf_release(&sb); + + out = fdopen_tempfile(&refs->tempfile, "w"); if (!out) { strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s", strerror(errno)); @@ -582,7 +605,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) { strbuf_addf(err, "error writing to %s: %s", - get_lock_file_path(&refs->lock), strerror(errno)); + get_tempfile_path(&refs->tempfile), strerror(errno)); goto error; } @@ -594,7 +617,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) if (write_packed_entry(out, iter->refname, iter->oid->hash, peel_error ? NULL : peeled.hash)) { strbuf_addf(err, "error writing to %s: %s", - get_lock_file_path(&refs->lock), + get_tempfile_path(&refs->tempfile), strerror(errno)); ref_iterator_abort(iter); goto error; @@ -602,13 +625,13 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) } if (ok != ITER_DONE) { - strbuf_addf(err, "unable to write packed-refs file: " + strbuf_addf(err, "unable to rewrite packed-refs file: " "error iterating over old contents"); goto error; } - if (commit_lock_file(&refs->lock)) { - strbuf_addf(err, "error overwriting %s: %s", + if (rename_tempfile(&refs->tempfile, refs->path)) { + strbuf_addf(err, "error replacing %s: %s", refs->path, strerror(errno)); goto out; } @@ -617,9 +640,10 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) goto out; error: - rollback_lock_file(&refs->lock); + delete_tempfile(&refs->tempfile); out: + rollback_lock_file(&refs->lock); release_packed_ref_cache(packed_ref_cache); return ret; } -- 2.11.0
[PATCH 22/28] packed_refs_lock(): function renamed from lock_packed_refs()
Rename `lock_packed_refs()` to `packed_refs_lock()` for consistency with how other methods are named. Also, it's about to get some companions. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 4 ++-- refs/packed-backend.c | 10 +- refs/packed-backend.h | 2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 4323394a52..3fc2254e33 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1084,7 +1084,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) struct ref_to_prune *refs_to_prune = NULL; struct strbuf err = STRBUF_INIT; - lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR); + packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR); iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0); while ((ok = ref_iterator_advance(iter)) == ITER_OK) { @@ -2667,7 +2667,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, } } - if (lock_packed_refs(refs->packed_ref_store, 0)) { + if (packed_refs_lock(refs->packed_ref_store, 0)) { strbuf_addf(err, "unable to lock packed-refs file: %s", strerror(errno)); ret = TRANSACTION_GENERIC_ERROR; diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 6619052e96..eb9da04576 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -321,7 +321,7 @@ static struct ref_dir *get_packed_refs(struct packed_ref_store *refs) /* * Add or overwrite a reference in the in-memory packed reference * cache. This may only be called while the packed-refs file is locked - * (see lock_packed_refs()). To actually write the packed-refs file, + * (see packed_refs_lock()). To actually write the packed-refs file, * call commit_packed_refs(). */ void add_packed_ref(struct ref_store *ref_store, @@ -515,11 +515,11 @@ static int write_packed_entry(FILE *fh, const char *refname, return 0; } -int lock_packed_refs(struct ref_store *ref_store, int flags) +int packed_refs_lock(struct ref_store *ref_store, int flags) { struct packed_ref_store *refs = packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN, - "lock_packed_refs"); + "packed_refs_lock"); static int timeout_configured = 0; static int timeout_value = 1000; struct packed_ref_cache *packed_ref_cache; @@ -567,7 +567,7 @@ static const char PACKED_REFS_HEADER[] = /* * Write the current version of the packed refs cache from memory to * disk. The packed-refs file must already be locked for writing (see - * lock_packed_refs()). Return zero on success. On errors, rollback + * packed_refs_lock()). Return zero on success. On errors, rollback * the lockfile, write an error message to `err`, and return a nonzero * value. */ @@ -698,7 +698,7 @@ int repack_without_refs(struct ref_store *ref_store, if (!needs_repacking) return 0; /* no refname exists in packed refs */ - if (lock_packed_refs(&refs->base, 0)) { + if (packed_refs_lock(&refs->base, 0)) { unable_to_lock_message(refs->path, errno, err); return -1; } diff --git a/refs/packed-backend.h b/refs/packed-backend.h index 3d4057b65b..dbc00d3396 100644 --- a/refs/packed-backend.h +++ b/refs/packed-backend.h @@ -9,7 +9,7 @@ struct ref_store *packed_ref_store_create(const char *path, * hold_lock_file_for_update(). Return 0 on success. On errors, set * errno appropriately and return a nonzero value. */ -int lock_packed_refs(struct ref_store *ref_store, int flags); +int packed_refs_lock(struct ref_store *ref_store, int flags); void add_packed_ref(struct ref_store *ref_store, const char *refname, const struct object_id *oid); -- 2.11.0
[PATCH 20/28] commit_packed_refs(): report errors rather than dying
Report errors via a `struct strbuf *err` rather than by calling `die()`. To enable this goal, change `write_packed_entry()` to report errors via a return value and `errno` rather than dying. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 10 +++--- refs/packed-backend.c | 85 +-- refs/packed-backend.h | 2 +- 3 files changed, 61 insertions(+), 36 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 039d0343cb..4323394a52 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1082,6 +1082,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) struct ref_iterator *iter; int ok; struct ref_to_prune *refs_to_prune = NULL; + struct strbuf err = STRBUF_INIT; lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR); @@ -1116,10 +1117,11 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) if (ok != ITER_DONE) die("error while iterating over references"); - if (commit_packed_refs(refs->packed_ref_store)) - die_errno("unable to overwrite old ref-pack file"); + if (commit_packed_refs(refs->packed_ref_store, &err)) + die("unable to overwrite old ref-pack file: %s", err.buf); prune_refs(refs, refs_to_prune); + strbuf_release(&err); return 0; } @@ -2681,9 +2683,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, &update->new_oid); } - if (commit_packed_refs(refs->packed_ref_store)) { - strbuf_addf(err, "unable to commit packed-refs file: %s", - strerror(errno)); + if (commit_packed_refs(refs->packed_ref_store, err)) { ret = TRANSACTION_GENERIC_ERROR; goto cleanup; } diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 5dbe4e4e59..5bee49d497 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -493,15 +493,19 @@ static struct ref_iterator *packed_ref_iterator_begin( /* * Write an entry to the packed-refs file for the specified refname. - * If peeled is non-NULL, write it as the entry's peeled value. + * If peeled is non-NULL, write it as the entry's peeled value. On + * error, return a nonzero value and leave errno set at the value left + * by the failing call to `fprintf()`. */ -static void write_packed_entry(FILE *fh, const char *refname, - const unsigned char *sha1, - const unsigned char *peeled) +static int write_packed_entry(FILE *fh, const char *refname, + const unsigned char *sha1, + const unsigned char *peeled) { - fprintf_or_die(fh, "%s %s\n", sha1_to_hex(sha1), refname); - if (peeled) - fprintf_or_die(fh, "^%s\n", sha1_to_hex(peeled)); + if (fprintf(fh, "%s %s\n", sha1_to_hex(sha1), refname) < 0 || + (peeled && fprintf(fh, "^%s\n", sha1_to_hex(peeled)) < 0)) + return -1; + + return 0; } int lock_packed_refs(struct ref_store *ref_store, int flags) @@ -550,49 +554,74 @@ static const char PACKED_REFS_HEADER[] = /* * Write the current version of the packed refs cache from memory to * disk. The packed-refs file must already be locked for writing (see - * lock_packed_refs()). Return zero on success. On errors, set errno - * and return a nonzero value. + * lock_packed_refs()). Return zero on success. On errors, rollback + * the lockfile, write an error message to `err`, and return a nonzero + * value. */ -int commit_packed_refs(struct ref_store *ref_store) +int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) { struct packed_ref_store *refs = packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN, "commit_packed_refs"); struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); - int ok, error = 0; - int save_errno = 0; + int ok; + int ret = -1; FILE *out; struct ref_iterator *iter; if (!is_lock_file_locked(&refs->lock)) - die("BUG: packed-refs not locked"); + die("BUG: commit_packed_refs() called when unlocked"); out = fdopen_lock_file(&refs->lock, "w"); - if (!out) - die_errno("unable to fdopen packed-refs descriptor"); + if (!out) { + strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s", + strerror(errno)); + goto error; + } - fprintf_or_die(out, "%s", PACKED_REFS_HEADER); + if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) { + strbuf_addf(err, "error writing to %s: %s", + get_lock_file_path(&refs->lock
[PATCH 11/28] commit_packed_refs(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 0d8f39089f..5d159620f0 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1388,21 +1388,21 @@ static int lock_packed_refs(struct packed_ref_store *refs, int flags) * lock_packed_refs()). Return zero on success. On errors, set errno * and return a nonzero value */ -static int commit_packed_refs(struct files_ref_store *refs) +static int commit_packed_refs(struct packed_ref_store *refs) { struct packed_ref_cache *packed_ref_cache = - get_packed_ref_cache(refs->packed_ref_store); + get_packed_ref_cache(refs); int ok, error = 0; int save_errno = 0; FILE *out; struct ref_iterator *iter; - files_assert_main_repository(refs, "commit_packed_refs"); + packed_assert_main_repository(refs, "commit_packed_refs"); - if (!is_lock_file_locked(&refs->packed_ref_store->lock)) + if (!is_lock_file_locked(&refs->lock)) die("BUG: packed-refs not locked"); - out = fdopen_lock_file(&refs->packed_ref_store->lock, "w"); + out = fdopen_lock_file(&refs->lock, "w"); if (!out) die_errno("unable to fdopen packed-refs descriptor"); @@ -1420,7 +1420,7 @@ static int commit_packed_refs(struct files_ref_store *refs) if (ok != ITER_DONE) die("error while iterating over references"); - if (commit_lock_file(&refs->packed_ref_store->lock)) { + if (commit_lock_file(&refs->lock)) { save_errno = errno; error = -1; } @@ -1606,7 +1606,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) if (ok != ITER_DONE) die("error while iterating over references"); - if (commit_packed_refs(refs)) + if (commit_packed_refs(refs->packed_ref_store)) die_errno("unable to overwrite old ref-pack file"); prune_refs(refs, refs_to_prune); @@ -1662,7 +1662,7 @@ static int repack_without_refs(struct files_ref_store *refs, } /* Write what remains */ - ret = commit_packed_refs(refs); + ret = commit_packed_refs(refs->packed_ref_store); if (ret) strbuf_addf(err, "unable to overwrite old ref-pack file: %s", strerror(errno)); @@ -3227,7 +3227,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, &update->new_oid); } - if (commit_packed_refs(refs)) { + if (commit_packed_refs(refs->packed_ref_store)) { strbuf_addf(err, "unable to commit packed-refs file: %s", strerror(errno)); ret = TRANSACTION_GENERIC_ERROR; -- 2.11.0
[PATCH 14/28] repack_without_refs(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 2b9d93d3b6..c206791b91 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1621,19 +1621,19 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) * * The refs in 'refnames' needn't be sorted. `err` must not be NULL. */ -static int repack_without_refs(struct files_ref_store *refs, +static int repack_without_refs(struct packed_ref_store *refs, struct string_list *refnames, struct strbuf *err) { struct ref_dir *packed; struct string_list_item *refname; int ret, needs_repacking = 0, removed = 0; - files_assert_main_repository(refs, "repack_without_refs"); + packed_assert_main_repository(refs, "repack_without_refs"); assert(err); /* Look for a packed ref */ for_each_string_list_item(refname, refnames) { - if (get_packed_ref(refs->packed_ref_store, refname->string)) { + if (get_packed_ref(refs, refname->string)) { needs_repacking = 1; break; } @@ -1643,11 +1643,11 @@ static int repack_without_refs(struct files_ref_store *refs, if (!needs_repacking) return 0; /* no refname exists in packed refs */ - if (lock_packed_refs(refs->packed_ref_store, 0)) { - unable_to_lock_message(refs->packed_ref_store->path, errno, err); + if (lock_packed_refs(refs, 0)) { + unable_to_lock_message(refs->path, errno, err); return -1; } - packed = get_packed_refs(refs->packed_ref_store); + packed = get_packed_refs(refs); /* Remove refnames from the cache */ for_each_string_list_item(refname, refnames) @@ -1658,12 +1658,12 @@ static int repack_without_refs(struct files_ref_store *refs, * All packed entries disappeared while we were * acquiring the lock. */ - rollback_packed_refs(refs->packed_ref_store); + rollback_packed_refs(refs); return 0; } /* Write what remains */ - ret = commit_packed_refs(refs->packed_ref_store); + ret = commit_packed_refs(refs); if (ret) strbuf_addf(err, "unable to overwrite old ref-pack file: %s", strerror(errno)); @@ -1681,7 +1681,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, if (!refnames->nr) return 0; - result = repack_without_refs(refs, refnames, &err); + result = repack_without_refs(refs->packed_ref_store, refnames, &err); if (result) { /* * If we failed to rewrite the packed-refs file, then @@ -3101,7 +3101,7 @@ static int files_transaction_finish(struct ref_store *ref_store, } } - if (repack_without_refs(refs, &refs_to_delete, err)) { + if (repack_without_refs(refs->packed_ref_store, &refs_to_delete, err)) { ret = TRANSACTION_GENERIC_ERROR; goto cleanup; } -- 2.11.0
[PATCH 13/28] get_packed_ref(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index a08d3fbadf..2b9d93d3b6 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -602,10 +602,10 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs) * Return the ref_entry for the given refname from the packed * references. If it does not exist, return NULL. */ -static struct ref_entry *get_packed_ref(struct files_ref_store *refs, +static struct ref_entry *get_packed_ref(struct packed_ref_store *refs, const char *refname) { - return find_ref_entry(get_packed_refs(refs->packed_ref_store), refname); + return find_ref_entry(get_packed_refs(refs), refname); } /* @@ -621,7 +621,7 @@ static int resolve_packed_ref(struct files_ref_store *refs, * The loose reference file does not exist; check for a packed * reference. */ - entry = get_packed_ref(refs, refname); + entry = get_packed_ref(refs->packed_ref_store, refname); if (entry) { hashcpy(sha1, entry->u.value.oid.hash); *flags |= REF_ISPACKED; @@ -1044,7 +1044,9 @@ static int files_peel_ref(struct ref_store *ref_store, * have REF_KNOWS_PEELED. */ if (flag & REF_ISPACKED) { - struct ref_entry *r = get_packed_ref(refs, refname); + struct ref_entry *r = + get_packed_ref(refs->packed_ref_store, refname); + if (r) { if (peel_entry(r, 0)) return -1; @@ -1631,7 +1633,7 @@ static int repack_without_refs(struct files_ref_store *refs, /* Look for a packed ref */ for_each_string_list_item(refname, refnames) { - if (get_packed_ref(refs, refname->string)) { + if (get_packed_ref(refs->packed_ref_store, refname->string)) { needs_repacking = 1; break; } -- 2.11.0
[PATCH 19/28] packed_ref_store: make class into a subclass of `ref_store`
Add the infrastructure to make `packed_ref_store` implement `ref_store`, at least formally (few of the methods are actually implemented yet). Change the functions in its interface to take `ref_store *` arguments. Change `files_ref_store` to store a pointer to `ref_store *` and to call functions via the virtual `ref_store` interface where possible. This also means that a few `packed_ref_store` functions can become static. This is a work in progress. Some more `ref_store` methods will soon be implemented (e.g., those having to do with reference transactions). But some of them will never be implemented (e.g., those having to do with symrefs or reflogs). Signed-off-by: Michael Haggerty --- refs/files-backend.c | 16 ++-- refs/packed-backend.c | 231 +- refs/packed-backend.h | 23 ++--- refs/refs-internal.h | 1 + 4 files changed, 226 insertions(+), 45 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 83ea773728..039d0343cb 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -28,7 +28,7 @@ struct files_ref_store { struct ref_cache *loose; - struct packed_ref_store *packed_ref_store; + struct ref_store *packed_ref_store; }; static void clear_loose_ref_cache(struct files_ref_store *refs) @@ -311,8 +311,8 @@ static int files_read_raw_ref(struct ref_store *ref_store, if (lstat(path, &st) < 0) { if (errno != ENOENT) goto out; - if (packed_read_raw_ref(refs->packed_ref_store, refname, - sha1, referent, type)) { + if (refs_read_raw_ref(refs->packed_ref_store, refname, + sha1, referent, type)) { errno = ENOENT; goto out; } @@ -351,8 +351,8 @@ static int files_read_raw_ref(struct ref_store *ref_store, * ref is supposed to be, there could still be a * packed ref: */ - if (packed_read_raw_ref(refs->packed_ref_store, refname, - sha1, referent, type)) { + if (refs_read_raw_ref(refs->packed_ref_store, refname, + sha1, referent, type)) { errno = EISDIR; goto out; } @@ -683,7 +683,7 @@ static int files_peel_ref(struct ref_store *ref_store, * have REF_KNOWS_PEELED. */ if (flag & REF_ISPACKED && - !packed_peel_ref(refs->packed_ref_store, refname, sha1)) + !refs_peel_ref(refs->packed_ref_store, refname, sha1)) return 0; return peel_object(base, sha1); @@ -793,8 +793,8 @@ static struct ref_iterator *files_ref_iterator_begin( loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), prefix, 1); - packed_iter = packed_ref_iterator_begin(refs->packed_ref_store, - prefix, 0); + packed_iter = refs_ref_iterator_begin(refs->packed_ref_store, + prefix, 0, 0); iter->iter0 = overlay_ref_iterator_begin(loose_iter, packed_iter); iter->flags = flags; diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 6fa988c953..5dbe4e4e59 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -50,6 +50,8 @@ static int release_packed_ref_cache(struct packed_ref_cache *packed_refs) * `ref_store`. */ struct packed_ref_store { + struct ref_store base; + unsigned int store_flags; /* The path of the "packed-refs" file: */ @@ -68,14 +70,17 @@ struct packed_ref_store { struct lock_file lock; }; -struct packed_ref_store *packed_ref_store_create( - const char *path, unsigned int store_flags) +struct ref_store *packed_ref_store_create(const char *path, + unsigned int store_flags) { struct packed_ref_store *refs = xcalloc(1, sizeof(*refs)); + struct ref_store *ref_store = (struct ref_store *)refs; + base_ref_store_init(ref_store, &refs_be_packed); refs->store_flags = store_flags; + refs->path = xstrdup(path); - return refs; + return ref_store; } /* @@ -91,6 +96,31 @@ static void packed_assert_main_repository(struct packed_ref_store *refs, die("BUG: operation %s only allowed for main ref store", caller); } +/* + * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is + * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't + * support at least the flags specified in `required_flags`. `caller` + * is used in any necessary error messages. + */ +static struct packed_ref_store *packed_downcast(struct ref_store *ref_store, + unsi
[PATCH 24/28] packed_refs_unlock(), packed_refs_is_locked(): new functions
Add two new public functions, `packed_refs_unlock()` and `packed_refs_is_locked()`, with which callers can manage and query the `packed-refs` lock externally. Call `packed_refs_unlock()` from `commit_packed_refs()` and `rollback_packed_refs()`. Signed-off-by: Michael Haggerty --- refs/packed-backend.c | 31 +-- refs/packed-backend.h | 3 +++ 2 files changed, 28 insertions(+), 6 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 24451343b8..ff6326ddb8 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -563,6 +563,29 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err) return 0; } +void packed_refs_unlock(struct ref_store *ref_store) +{ + struct packed_ref_store *refs = packed_downcast( + ref_store, + REF_STORE_READ | REF_STORE_WRITE, + "packed_refs_unlock"); + + if (!is_lock_file_locked(&refs->lock)) + die("BUG: packed_refs_unlock() called when not locked"); + rollback_lock_file(&refs->lock); + release_packed_ref_cache(refs->cache); +} + +int packed_refs_is_locked(struct ref_store *ref_store) +{ + struct packed_ref_store *refs = packed_downcast( + ref_store, + REF_STORE_READ | REF_STORE_WRITE, + "packed_refs_is_locked"); + + return is_lock_file_locked(&refs->lock); +} + /* * The packed-refs header line that we write out. Perhaps other * traits will be added later. The trailing space is required. @@ -649,8 +672,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) delete_tempfile(&refs->tempfile); out: - rollback_lock_file(&refs->lock); - release_packed_ref_cache(packed_ref_cache); + packed_refs_unlock(ref_store); return ret; } @@ -661,14 +683,11 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) */ static void rollback_packed_refs(struct packed_ref_store *refs) { - struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); - packed_assert_main_repository(refs, "rollback_packed_refs"); if (!is_lock_file_locked(&refs->lock)) die("BUG: packed-refs not locked"); - rollback_lock_file(&refs->lock); - release_packed_ref_cache(packed_ref_cache); + packed_refs_unlock(&refs->base); clear_packed_ref_cache(refs); } diff --git a/refs/packed-backend.h b/refs/packed-backend.h index 210e3f35ce..03b7c1de95 100644 --- a/refs/packed-backend.h +++ b/refs/packed-backend.h @@ -11,6 +11,9 @@ struct ref_store *packed_ref_store_create(const char *path, */ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err); +void packed_refs_unlock(struct ref_store *ref_store); +int packed_refs_is_locked(struct ref_store *ref_store); + void add_packed_ref(struct ref_store *ref_store, const char *refname, const struct object_id *oid); -- 2.11.0
[PATCH 17/28] packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
Add a new function, `packed_read_raw_ref()`, which is nearly a `read_raw_ref_fn`. Use it in place of `resolve_packed_ref()`. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 36 +--- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index e57cdeba36..ac4764f6f7 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -608,27 +608,23 @@ static struct ref_entry *get_packed_ref(struct packed_ref_store *refs, return find_ref_entry(get_packed_refs(refs), refname); } -/* - * A loose ref file doesn't exist; check for a packed ref. - */ -static int resolve_packed_ref(struct files_ref_store *refs, - const char *refname, - unsigned char *sha1, unsigned int *flags) +static int packed_read_raw_ref(struct packed_ref_store *refs, + const char *refname, unsigned char *sha1, + struct strbuf *referent, unsigned int *type) { struct ref_entry *entry; - /* -* The loose reference file does not exist; check for a packed -* reference. -*/ - entry = get_packed_ref(refs->packed_ref_store, refname); - if (entry) { - hashcpy(sha1, entry->u.value.oid.hash); - *flags |= REF_ISPACKED; - return 0; + *type = 0; + + entry = get_packed_ref(refs, refname); + if (!entry) { + errno = ENOENT; + return -1; } - /* refname is not a packed reference. */ - return -1; + + hashcpy(sha1, entry->u.value.oid.hash); + *type = REF_ISPACKED; + return 0; } static int files_read_raw_ref(struct ref_store *ref_store, @@ -674,7 +670,8 @@ static int files_read_raw_ref(struct ref_store *ref_store, if (lstat(path, &st) < 0) { if (errno != ENOENT) goto out; - if (resolve_packed_ref(refs, refname, sha1, type)) { + if (packed_read_raw_ref(refs->packed_ref_store, refname, + sha1, referent, type)) { errno = ENOENT; goto out; } @@ -713,7 +710,8 @@ static int files_read_raw_ref(struct ref_store *ref_store, * ref is supposed to be, there could still be a * packed ref: */ - if (resolve_packed_ref(refs, refname, sha1, type)) { + if (packed_read_raw_ref(refs->packed_ref_store, refname, + sha1, referent, type)) { errno = EISDIR; goto out; } -- 2.11.0
[PATCH 18/28] packed-backend: new module for handling packed references
Now that the interface between `files_ref_store` and `packed_ref_store` is relatively narrow, move the latter into a new module, "refs/packed-backend.h" and "refs/packed-backend.c". It still doesn't quite implement the `ref_store` interface, but it will soon. This commit moves code around and adjusts its visibility, but doesn't change anything. Signed-off-by: Michael Haggerty --- Makefile | 1 + refs.c| 18 ++ refs/files-backend.c | 640 +- refs/packed-backend.c | 624 refs/packed-backend.h | 33 +++ refs/refs-internal.h | 9 + 6 files changed, 686 insertions(+), 639 deletions(-) create mode 100644 refs/packed-backend.c create mode 100644 refs/packed-backend.h diff --git a/Makefile b/Makefile index 33b888730f..478a8eac5d 100644 --- a/Makefile +++ b/Makefile @@ -821,6 +821,7 @@ LIB_OBJS += reflog-walk.o LIB_OBJS += refs.o LIB_OBJS += refs/files-backend.o LIB_OBJS += refs/iterator.o +LIB_OBJS += refs/packed-backend.o LIB_OBJS += refs/ref-cache.o LIB_OBJS += ref-filter.o LIB_OBJS += remote.o diff --git a/refs.c b/refs.c index 32177969f0..5880c12372 100644 --- a/refs.c +++ b/refs.c @@ -173,6 +173,24 @@ int refname_is_safe(const char *refname) return 1; } +/* + * Return true if refname, which has the specified oid and flags, can + * be resolved to an object in the database. If the referred-to object + * does not exist, emit a warning and return false. + */ +int ref_resolves_to_object(const char *refname, + const struct object_id *oid, + unsigned int flags) +{ + if (flags & REF_ISBROKEN) + return 0; + if (!has_sha1_file(oid->hash)) { + error("%s does not point to a valid object!", refname); + return 0; + } + return 1; +} + char *refs_resolve_refdup(struct ref_store *refs, const char *refname, int resolve_flags, unsigned char *sha1, int *flags) diff --git a/refs/files-backend.c b/refs/files-backend.c index ac4764f6f7..83ea773728 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2,6 +2,7 @@ #include "../refs.h" #include "refs-internal.h" #include "ref-cache.h" +#include "packed-backend.h" #include "../iterator.h" #include "../dir-iterator.h" #include "../lockfile.h" @@ -14,85 +15,6 @@ struct ref_lock { struct object_id old_oid; }; -/* - * Return true if refname, which has the specified oid and flags, can - * be resolved to an object in the database. If the referred-to object - * does not exist, emit a warning and return false. - */ -static int ref_resolves_to_object(const char *refname, - const struct object_id *oid, - unsigned int flags) -{ - if (flags & REF_ISBROKEN) - return 0; - if (!has_sha1_file(oid->hash)) { - error("%s does not point to a valid object!", refname); - return 0; - } - return 1; -} - -struct packed_ref_cache { - struct ref_cache *cache; - - /* -* Count of references to the data structure in this instance, -* including the pointer from files_ref_store::packed if any. -* The data will not be freed as long as the reference count -* is nonzero. -*/ - unsigned int referrers; - - /* The metadata from when this packed-refs cache was read */ - struct stat_validity validity; -}; - -/* - * A container for `packed-refs`-related data. It is not (yet) a - * `ref_store`. - */ -struct packed_ref_store { - unsigned int store_flags; - - /* The path of the "packed-refs" file: */ - char *path; - - /* -* A cache of the values read from the `packed-refs` file, if -* it might still be current; otherwise, NULL. -*/ - struct packed_ref_cache *cache; - - /* -* Lock used for the "packed-refs" file. Note that this (and -* thus the enclosing `packed_ref_store`) must not be freed. -*/ - struct lock_file lock; -}; - -static struct packed_ref_store *packed_ref_store_create( - const char *path, unsigned int store_flags) -{ - struct packed_ref_store *refs = xcalloc(1, sizeof(*refs)); - - refs->store_flags = store_flags; - refs->path = xstrdup(path); - return refs; -} - -/* - * Die if refs is not the main ref store. caller is used in any - * necessary error messages. - */ -static void packed_assert_main_repository(struct packed_ref_store *refs, - const char *caller) -{ - if (refs->store_flags & REF_STORE_MAIN) - return; - - die("BUG: operation %s only allowed for main ref store", caller); -} - /* * Future: need to be in "struct repository" * when doing a full libification. @@ -109,42
[PATCH 26/28] commit_packed_refs(): remove call to `packed_refs_unlock()`
Instead, change the callers of `commit_packed_refs()` to call `packed_refs_unlock()`. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 2 ++ refs/packed-backend.c | 18 -- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 802ed9e2e9..09dad2806e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1119,6 +1119,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) if (commit_packed_refs(refs->packed_ref_store, &err)) die("unable to overwrite old ref-pack file: %s", err.buf); + packed_refs_unlock(refs->packed_ref_store); prune_refs(refs, refs_to_prune); strbuf_release(&err); @@ -2687,6 +2688,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, } cleanup: + packed_refs_unlock(refs->packed_ref_store); transaction->state = REF_TRANSACTION_CLOSED; string_list_clear(&affected_refnames, 0); return ret; diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 1732e3aad4..54b48d1f02 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -606,7 +606,6 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); int ok; - int ret = -1; struct strbuf sb = STRBUF_INIT; FILE *out; struct ref_iterator *iter; @@ -619,7 +618,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) strbuf_addf(err, "unable to create file %s: %s", sb.buf, strerror(errno)); strbuf_release(&sb); - goto out; + return -1; } strbuf_release(&sb); @@ -660,18 +659,14 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) if (rename_tempfile(&refs->tempfile, refs->path)) { strbuf_addf(err, "error replacing %s: %s", refs->path, strerror(errno)); - goto out; + return -1; } - ret = 0; - goto out; + return 0; error: delete_tempfile(&refs->tempfile); - -out: - packed_refs_unlock(ref_store); - return ret; + return -1; } /* @@ -705,6 +700,7 @@ int repack_without_refs(struct ref_store *ref_store, struct ref_dir *packed; struct string_list_item *refname; int needs_repacking = 0, removed = 0; + int ret; packed_assert_main_repository(refs, "repack_without_refs"); assert(err); @@ -740,7 +736,9 @@ int repack_without_refs(struct ref_store *ref_store, } /* Write what remains */ - return commit_packed_refs(&refs->base, err); + ret = commit_packed_refs(&refs->base, err); + packed_refs_unlock(ref_store); + return ret; } static int packed_init_db(struct ref_store *ref_store, struct strbuf *err) -- 2.11.0
[PATCH 28/28] read_packed_refs(): die if `packed-refs` contains bogus data
The old code ignored any lines that it didn't understand. This is dangerous. Instead, `die()` if the `packed-refs` file contains any lines that we don't know how to handle. Signed-off-by: Michael Haggerty --- refs/packed-backend.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 721afd066a..311fd014ce 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -253,9 +253,7 @@ static struct packed_ref_cache *read_packed_refs(const char *packed_refs_file) (peeled == PEELED_TAGS && starts_with(refname, "refs/tags/"))) last->flag |= REF_KNOWS_PEELED; add_ref_entry(dir, last); - continue; - } - if (last && + } else if (last && line.buf[0] == '^' && line.len == PEELED_LINE_LENGTH && line.buf[PEELED_LINE_LENGTH - 1] == '\n' && @@ -267,6 +265,8 @@ static struct packed_ref_cache *read_packed_refs(const char *packed_refs_file) * reference: */ last->flag |= REF_KNOWS_PEELED; + } else { + die("unexpected line in %s: %s", packed_refs_file, line.buf); } } -- 2.11.0
[PATCH 06/28] validate_packed_ref_cache(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 2b0db60b2b..f061506bf0 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -396,12 +396,11 @@ static void files_ref_path(struct files_ref_store *refs, * Check that the packed refs cache (if any) still reflects the * contents of the file. If not, clear the cache. */ -static void validate_packed_ref_cache(struct files_ref_store *refs) +static void validate_packed_ref_cache(struct packed_ref_store *refs) { - if (refs->packed_ref_store->cache && - !stat_validity_check(&refs->packed_ref_store->cache->validity, -refs->packed_ref_store->path)) - clear_packed_ref_cache(refs->packed_ref_store); + if (refs->cache && + !stat_validity_check(&refs->cache->validity, refs->path)) + clear_packed_ref_cache(refs); } /* @@ -417,7 +416,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *ref const char *packed_refs_file = refs->packed_ref_store->path; if (!is_lock_file_locked(&refs->packed_ref_store->lock)) - validate_packed_ref_cache(refs); + validate_packed_ref_cache(refs->packed_ref_store); if (!refs->packed_ref_store->cache) refs->packed_ref_store->cache = read_packed_refs(packed_refs_file); @@ -1364,7 +1363,7 @@ static int lock_packed_refs(struct files_ref_store *refs, int flags) * cache is still valid. We've just locked the file, but it * might have changed the moment *before* we locked it. */ - validate_packed_ref_cache(refs); + validate_packed_ref_cache(refs->packed_ref_store); packed_ref_cache = get_packed_ref_cache(refs); /* Increment the reference count to prevent it from being freed: */ -- 2.11.0
[PATCH 02/28] packed_ref_store: new struct
Start extracting the packed-refs-related data structures into a new class, `packed_ref_store`. It doesn't yet implement `ref_store`, but it will. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 42 +- 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 87cecde231..2efb71cee9 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -47,6 +47,28 @@ struct packed_ref_cache { struct stat_validity validity; }; +/* + * A container for `packed-refs`-related data. It is not (yet) a + * `ref_store`. + */ +struct packed_ref_store { + unsigned int store_flags; + + /* +* A cache of the values read from the `packed-refs` file, if +* it might still be current; otherwise, NULL. +*/ + struct packed_ref_cache *cache; +}; + +static struct packed_ref_store *packed_ref_store_create(unsigned int store_flags) +{ + struct packed_ref_store *refs = xcalloc(1, sizeof(*refs)); + + refs->store_flags = store_flags; + return refs; +} + /* * Future: need to be in "struct repository" * when doing a full libification. @@ -60,13 +82,14 @@ struct files_ref_store { char *packed_refs_path; struct ref_cache *loose; - struct packed_ref_cache *packed; /* * Lock used for the "packed-refs" file. Note that this (and * thus the enclosing `files_ref_store`) must not be freed. */ struct lock_file packed_refs_lock; + + struct packed_ref_store *packed_ref_store; }; /* @@ -95,12 +118,12 @@ static int release_packed_ref_cache(struct packed_ref_cache *packed_refs) static void clear_packed_ref_cache(struct files_ref_store *refs) { - if (refs->packed) { - struct packed_ref_cache *packed_refs = refs->packed; + if (refs->packed_ref_store->cache) { + struct packed_ref_cache *packed_refs = refs->packed_ref_store->cache; if (is_lock_file_locked(&refs->packed_refs_lock)) die("BUG: packed-ref cache cleared while locked"); - refs->packed = NULL; + refs->packed_ref_store->cache = NULL; release_packed_ref_cache(packed_refs); } } @@ -132,6 +155,7 @@ static struct ref_store *files_ref_store_create(const char *gitdir, refs->gitcommondir = strbuf_detach(&sb, NULL); strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir); refs->packed_refs_path = strbuf_detach(&sb, NULL); + refs->packed_ref_store = packed_ref_store_create(flags); return ref_store; } @@ -375,8 +399,8 @@ static void files_ref_path(struct files_ref_store *refs, */ static void validate_packed_ref_cache(struct files_ref_store *refs) { - if (refs->packed && - !stat_validity_check(&refs->packed->validity, + if (refs->packed_ref_store->cache && + !stat_validity_check(&refs->packed_ref_store->cache->validity, files_packed_refs_path(refs))) clear_packed_ref_cache(refs); } @@ -396,10 +420,10 @@ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *ref if (!is_lock_file_locked(&refs->packed_refs_lock)) validate_packed_ref_cache(refs); - if (!refs->packed) - refs->packed = read_packed_refs(packed_refs_file); + if (!refs->packed_ref_store->cache) + refs->packed_ref_store->cache = read_packed_refs(packed_refs_file); - return refs->packed; + return refs->packed_ref_store->cache; } static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache *packed_ref_cache) -- 2.11.0
[PATCH 27/28] repack_without_refs(): don't lock or unlock the packed refs
Change `repack_without_refs()` to expect the packed-refs lock to be held already, and not to release the lock before returning. Change the callers to deal with lock management. This change makes it possible for callers to hold the packed-refs lock for a longer span of time, a possibility that will eventually make it possible to fix some longstanding races. The only semantic change here is that `repack_without_refs()` used to forgot to release the lock in the `if (!removed)` exit path. That omission is now fixed. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 47 +++ refs/packed-backend.c | 32 2 files changed, 39 insertions(+), 40 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 09dad2806e..972dd5c0c9 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1137,24 +1137,16 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, if (!refnames->nr) return 0; - result = repack_without_refs(refs->packed_ref_store, refnames, &err); - if (result) { - /* -* If we failed to rewrite the packed-refs file, then -* it is unsafe to try to remove loose refs, because -* doing so might expose an obsolete packed value for -* a reference that might even point at an object that -* has been garbage collected. -*/ - if (refnames->nr == 1) - error(_("could not delete reference %s: %s"), - refnames->items[0].string, err.buf); - else - error(_("could not delete references: %s"), err.buf); + if (packed_refs_lock(refs->packed_ref_store, 0, &err)) + goto error; - goto out; + if (repack_without_refs(refs->packed_ref_store, refnames, &err)) { + packed_refs_unlock(refs->packed_ref_store); + goto error; } + packed_refs_unlock(refs->packed_ref_store); + for (i = 0; i < refnames->nr; i++) { const char *refname = refnames->items[i].string; @@ -1162,9 +1154,24 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, result |= error(_("could not remove reference %s"), refname); } -out: strbuf_release(&err); return result; + +error: + /* +* If we failed to rewrite the packed-refs file, then it is +* unsafe to try to remove loose refs, because doing so might +* expose an obsolete packed value for a reference that might +* even point at an object that has been garbage collected. +*/ + if (refnames->nr == 1) + error(_("could not delete reference %s: %s"), + refnames->items[0].string, err.buf); + else + error(_("could not delete references: %s"), err.buf); + + strbuf_release(&err); + return -1; } /* @@ -2557,11 +2564,19 @@ static int files_transaction_finish(struct ref_store *ref_store, } } + if (packed_refs_lock(refs->packed_ref_store, 0, err)) { + ret = TRANSACTION_GENERIC_ERROR; + goto cleanup; + } + if (repack_without_refs(refs->packed_ref_store, &refs_to_delete, err)) { ret = TRANSACTION_GENERIC_ERROR; + packed_refs_unlock(refs->packed_ref_store); goto cleanup; } + packed_refs_unlock(refs->packed_ref_store); + /* Delete the reflogs of any references that were deleted: */ for_each_string_list_item(ref_to_delete, &refs_to_delete) { strbuf_reset(&sb); diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 54b48d1f02..721afd066a 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -669,25 +669,12 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) return -1; } -/* - * Rollback the lockfile for the packed-refs file, and discard the - * in-memory packed reference cache. (The packed-refs file will be - * read anew if it is needed again after this function is called.) - */ -static void rollback_packed_refs(struct packed_ref_store *refs) -{ - packed_assert_main_repository(refs, "rollback_packed_refs"); - - if (!is_lock_file_locked(&refs->lock)) - die("BUG: packed-refs not locked"); - packed_refs_unlock(&refs->base); - clear_packed_ref_cache(refs); -} - /* * Rewrite the packed-refs file, omitting any refs listed in * 'refnames'. On error, leave packed-refs unchanged, write an error - * message to 'err', and return a nonzero value. + * message to 'err', and return a nonzero value. The packed refs lock + * must be held when calling this function; it will still be held when + * the function r
[PATCH 12/28] rollback_packed_refs(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 5d159620f0..a08d3fbadf 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1434,18 +1434,17 @@ static int commit_packed_refs(struct packed_ref_store *refs) * in-memory packed reference cache. (The packed-refs file will be * read anew if it is needed again after this function is called.) */ -static void rollback_packed_refs(struct files_ref_store *refs) +static void rollback_packed_refs(struct packed_ref_store *refs) { - struct packed_ref_cache *packed_ref_cache = - get_packed_ref_cache(refs->packed_ref_store); + struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); - files_assert_main_repository(refs, "rollback_packed_refs"); + packed_assert_main_repository(refs, "rollback_packed_refs"); - if (!is_lock_file_locked(&refs->packed_ref_store->lock)) + if (!is_lock_file_locked(&refs->lock)) die("BUG: packed-refs not locked"); - rollback_lock_file(&refs->packed_ref_store->lock); + rollback_lock_file(&refs->lock); release_packed_ref_cache(packed_ref_cache); - clear_packed_ref_cache(refs->packed_ref_store); + clear_packed_ref_cache(refs); } struct ref_to_prune { @@ -1657,7 +1656,7 @@ static int repack_without_refs(struct files_ref_store *refs, * All packed entries disappeared while we were * acquiring the lock. */ - rollback_packed_refs(refs); + rollback_packed_refs(refs->packed_ref_store); return 0; } -- 2.11.0
[PATCH 05/28] clear_packed_ref_cache(): take a `packed_ref_store *` parameter
It only cares about the packed-refs part of the reference store. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index de8293493f..2b0db60b2b 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -120,15 +120,15 @@ static int release_packed_ref_cache(struct packed_ref_cache *packed_refs) } } -static void clear_packed_ref_cache(struct files_ref_store *refs) +static void clear_packed_ref_cache(struct packed_ref_store *refs) { - if (refs->packed_ref_store->cache) { - struct packed_ref_cache *packed_refs = refs->packed_ref_store->cache; + if (refs->cache) { + struct packed_ref_cache *cache = refs->cache; - if (is_lock_file_locked(&refs->packed_ref_store->lock)) + if (is_lock_file_locked(&refs->lock)) die("BUG: packed-ref cache cleared while locked"); - refs->packed_ref_store->cache = NULL; - release_packed_ref_cache(packed_refs); + refs->cache = NULL; + release_packed_ref_cache(cache); } } @@ -401,7 +401,7 @@ static void validate_packed_ref_cache(struct files_ref_store *refs) if (refs->packed_ref_store->cache && !stat_validity_check(&refs->packed_ref_store->cache->validity, refs->packed_ref_store->path)) - clear_packed_ref_cache(refs); + clear_packed_ref_cache(refs->packed_ref_store); } /* @@ -1435,7 +1435,7 @@ static void rollback_packed_refs(struct files_ref_store *refs) die("BUG: packed-refs not locked"); rollback_lock_file(&refs->packed_ref_store->lock); release_packed_ref_cache(packed_ref_cache); - clear_packed_ref_cache(refs); + clear_packed_ref_cache(refs->packed_ref_store); } struct ref_to_prune { -- 2.11.0
[PATCH 23/28] packed_refs_lock(): report errors via a `struct strbuf *err`
That way the callers don't have to come up with error messages themselves. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 6 ++ refs/packed-backend.c | 17 +++-- refs/packed-backend.h | 6 +++--- 3 files changed, 16 insertions(+), 13 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 3fc2254e33..802ed9e2e9 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1084,7 +1084,7 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) struct ref_to_prune *refs_to_prune = NULL; struct strbuf err = STRBUF_INIT; - packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR); + packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err); iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0); while ((ok = ref_iterator_advance(iter)) == ITER_OK) { @@ -2667,9 +2667,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, } } - if (packed_refs_lock(refs->packed_ref_store, 0)) { - strbuf_addf(err, "unable to lock packed-refs file: %s", - strerror(errno)); + if (packed_refs_lock(refs->packed_ref_store, 0, err)) { ret = TRANSACTION_GENERIC_ERROR; goto cleanup; } diff --git a/refs/packed-backend.c b/refs/packed-backend.c index eb9da04576..24451343b8 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -515,7 +515,7 @@ static int write_packed_entry(FILE *fh, const char *refname, return 0; } -int packed_refs_lock(struct ref_store *ref_store, int flags) +int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err) { struct packed_ref_store *refs = packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN, @@ -537,9 +537,15 @@ int packed_refs_lock(struct ref_store *ref_store, int flags) if (hold_lock_file_for_update_timeout( &refs->lock, refs->path, - flags, timeout_value) < 0 || - close_lock_file(&refs->lock)) + flags, timeout_value) < 0) { + unable_to_lock_message(refs->path, errno, err); + return -1; + } + + if (close_lock_file(&refs->lock)) { + strbuf_addf(err, "unable to close %s: %s", refs->path, strerror(errno)); return -1; + } /* * Now that we hold the `packed-refs` lock, make sure that our @@ -698,10 +704,9 @@ int repack_without_refs(struct ref_store *ref_store, if (!needs_repacking) return 0; /* no refname exists in packed refs */ - if (packed_refs_lock(&refs->base, 0)) { - unable_to_lock_message(refs->path, errno, err); + if (packed_refs_lock(&refs->base, 0, err)) return -1; - } + packed = get_packed_refs(refs); /* Remove refnames from the cache */ diff --git a/refs/packed-backend.h b/refs/packed-backend.h index dbc00d3396..210e3f35ce 100644 --- a/refs/packed-backend.h +++ b/refs/packed-backend.h @@ -6,10 +6,10 @@ struct ref_store *packed_ref_store_create(const char *path, /* * Lock the packed-refs file for writing. Flags is passed to - * hold_lock_file_for_update(). Return 0 on success. On errors, set - * errno appropriately and return a nonzero value. + * hold_lock_file_for_update(). Return 0 on success. On errors, write + * an error message to `err` and return a nonzero value. */ -int packed_refs_lock(struct ref_store *ref_store, int flags); +int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err); void add_packed_ref(struct ref_store *ref_store, const char *refname, const struct object_id *oid); -- 2.11.0
[PATCH 25/28] clear_packed_ref_cache(): don't protest if the lock is held
The existing callers already check that the lock isn't held just before calling `clear_packed_ref_cache()`, and in the near future we want to be able to call this function when the lock is held. Signed-off-by: Michael Haggerty --- refs/packed-backend.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index ff6326ddb8..1732e3aad4 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -133,8 +133,6 @@ static void clear_packed_ref_cache(struct packed_ref_store *refs) if (refs->cache) { struct packed_ref_cache *cache = refs->cache; - if (is_lock_file_locked(&refs->lock)) - die("BUG: packed-ref cache cleared while locked"); refs->cache = NULL; release_packed_ref_cache(cache); } -- 2.11.0
Re: [PATCH] checkout: don't write merge results into the object database
Am 15.06.2017 um 15:57 schrieb Jeff King: > On Thu, Jun 15, 2017 at 01:33:42PM +0200, René Scharfe wrote: > >> Merge results need to be written to the worktree, of course, but we >> don't necessarily need object entries for them, especially if they >> contain conflict markers. Use pretend_sha1_file() to create fake >> blobs to pass to make_cache_entry() and checkout_entry() instead. > > Conceptually this makes sense, although I have a comment below. > > Out of curiosity, did this come up when looking into the cherry-pick > segfault/error from a few hours ago? No, it came up in the discussion of Dscho's memory leak plug series [1]. >> @@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct >> checkout *state) >> * (it also writes the merge result to the object database even >> * when it may contain conflicts). >> */ >> -if (write_sha1_file(result_buf.ptr, result_buf.size, >> -blob_type, oid.hash)) >> +if (pretend_sha1_file(result_buf.ptr, result_buf.size, >> + OBJ_BLOB, oid.hash)) >> die(_("Unable to add merge result for '%s'"), path); >> free(result_buf.ptr); > > I wondered if pretend_sha1_file() makes a copy of the buffer, and indeed > it does. So this is correct. > > But that raises an interesting question: how big are these objects, and > is it a good idea to store them in RAM? Obviously they're in RAM > already, but I'm not sure if it's one-at-a-time. We could be bumping the > peak RAM used if there's a large number of these entries. Touching the > on-disk odb does feel hacky, but it also means they behave like other > objects. > > If it is a concern, I think it could be solved by "unpretending" after > our call to checkout_entry completes. That would need a new call in > sha1_file.c, but it should be easy to write. Good point; we'd accumulate fake entries that we'll never need again. The patch should clean them up. Alternatively we could finally address the NEEDSWORK comment and provide a way to checkout a file without faking an index entry.. René [1] https://public-inbox.org/git/2704e145927c851c4163a68cfdfd5ada48fff21d.1493906085.git.johannes.schinde...@gmx.de/
Re: What's cooking in git.git (Jun 2017, #04; Tue, 13)
Jonathan Nieder writes: >> It is not known if a simple "yes/no" is sufficient in the longer >> term, and what should happen when --recurse-submodules option starts >> taking "recurse into them how?" parameter, though. > > Any pointers for where this has been discussed, if anywhere (e.g. was If this were discussed, then the answer to the question we may know by now ;-) I do not think anybody gave a serious thought to convince the public why a boolean is enough, hence this comment. >> * bw/config-h (2017-06-13) 4 commits >> - config: don't implicitly use gitdir >> - config: don't include config.h by default >> - config: remove git_config_iter >> - config: create config.h >> >> Code clean-up. > > Patches 1-3 are good to go IMHO. > > Patch 4 in pu is marked with my Reviewed-by. I think it's getting > there but not there yet. Did some script pull the tag from my reply > to the cover letter? No, nothing that elaborate. I go through each message in Gnus newsreader and feed the article to a shell command, e.g. "Meta/add-by -r jrnieder@ | git am -s3c". The UI remembers the last command I used when I choose to feed the next article to a shell command, and after running it to first three, I forgot to remove the 'add-by' bit from the command line for the fourth one.
Re: [PATCH 2/2] date: use localtime() for "-local" time formats
Am 15.06.2017 um 15:52 schrieb Jeff King: But for the special case of the "-local" formats, we can just skip the adjustment and use localtime() instead of gmtime(). This makes --date=format-local:%Z work correctly, showing the local timezone instead of an empty string. Documentation/rev-list-options.txt should be updated to mention that %Z is passed to strftime in the local case, no? The new test checks the result for "UTC", our default test-lib value for $TZ. Using something like EST5 might be more interesting, but the actual zone string is system-dependent (for instance, on my system it expands to just EST). Hopefully "UTC" is vanilla enough that every system treats it the same. Signed-off-by: Jeff King --- I don't have a Windows system to test this on, but from the output Dscho provided earlier, I believe this should pass. The first patch applies with some fuzz on master of Git for Windows, the second one applies cleanly. A "typedef unsigned long timestamp_t;" is required to compile it; such a fixup won't be needed for long, I guess. t0006 succeeds. René
git diff sometimes brings up buggy pager
When I do `git diff` sometimes I get this: ...skipping... ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ...skipping... ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ...skipping... ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ it goes on like this for about 10 times the length. Looks like this happens exclusively when I use git diff with a github remote that is at the same commit. I will update if I find any other case where this happens.
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
On Thu, Jun 15 2017, Jeff King jotted: > On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote: > >> On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote: >> > Footnote *1*: SHA-256, as all hash functions whose output is essentially >> > the entire internal state, are susceptible to a so-called "length >> > extension attack", where the hash of a secret+message can be used to >> > generate the hash of secret+message+piggyback without knowing the secret. >> > This is not the case for Git: only visible data are hashed. The type of >> > attacks Git has to worry about is very different from the length extension >> > attacks, and it is highly unlikely that that weakness of SHA-256 leads to, >> > say, a collision attack. >> >> What do the experts think or SHA512/256, which completely removes the >> concerns over length extension attack? (which I'd argue is better than >> sweeping them under the carpet) > > I don't think it's sweeping them under the carpet. Git does not use the > hash as a MAC, so length extension attacks aren't a thing (and even if > we later wanted to use the same algorithm as a MAC, the HMAC > construction is a well-studied technique for dealing with it). > > That said, SHA-512 is typically a little faster than SHA-256 on 64-bit > platforms. I don't know if that will change with the advent of hardware > instructions oriented towards SHA-256. Quoting my own cacbzzx7jra2niwt9wsgaxnzs+gws8htugzwm8nay1gs87o8...@mail.gmail.com sent ~2 weeks ago to the list: On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder wrote: [...] > 4. When choosing a hash function, people may argue about performance. >It would be useful for run some benchmarks for git (running >the test suite, t/perf tests, etc) using a variety of hash >functions as input to such a discussion. To the extent that such benchmarks matter, it seems prudent to heavily weigh them in favor of whatever seems to be likely to be the more common hash function going forward, since those are likely to get faster through future hardware acceleration. E.g. Intel announced Goldmont last year which according to one SHA-1 implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They only have acceleration for SHA-1 and SHA-256[2] 1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385 2. https://en.wikipedia.org/wiki/Goldmont Maybe someone else knows of better numbers / benchmarks, but such a reduction in CBP likely makes it faster than SHA-512.
[PATCH] diff-highlight: split code into module
The diff-so-fancy project is also written in perl, and most of its users pipe diffs through both diff-highlight and diff-so-fancy. It would be nice if this could be done in a single script. So let's pull most of diff-highlight's code into its own module which can be used by diff-so-fancy. In addition, we'll abstract a few basic items like reading from stdio so that a script using the module can do more processing before or after diff-highlight handles the lines. See the README update for more details. One small downside is that the diff-highlight script must now be built using the Makefile. There are ways around this, but it quickly gets into perl arcana. Let's go with the simple solution. As a bonus, our Makefile now respects the PERL_PATH variable if it is set. Signed-off-by: Jeff King --- Scott and I discussed this off-list, and this was the least-gross solution I came up with. The plan would be for diff-so-fancy to pull in this copy of diff-highlight from git.git and have a wrapper script similar to the diff-highlight.perl found here. contrib/diff-highlight/.gitignore | 2 ++ .../{diff-highlight => DiffHighlight.pm} | 40 +- contrib/diff-highlight/Makefile| 21 ++-- contrib/diff-highlight/README | 30 contrib/diff-highlight/diff-highlight.perl | 8 + 5 files changed, 82 insertions(+), 19 deletions(-) create mode 100644 contrib/diff-highlight/.gitignore rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%) mode change 100755 => 100644 create mode 100644 contrib/diff-highlight/diff-highlight.perl diff --git a/contrib/diff-highlight/.gitignore b/contrib/diff-highlight/.gitignore new file mode 100644 index 0..c07454824 --- /dev/null +++ b/contrib/diff-highlight/.gitignore @@ -0,0 +1,2 @@ +shebang.perl +diff-highlight diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/DiffHighlight.pm old mode 100755 new mode 100644 similarity index 91% rename from contrib/diff-highlight/diff-highlight rename to contrib/diff-highlight/DiffHighlight.pm index 81bd8040e..663992e53 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/DiffHighlight.pm @@ -1,4 +1,4 @@ -#!/usr/bin/perl +package DiffHighlight; use 5.008; use warnings FATAL => 'all'; @@ -29,13 +29,14 @@ my @removed; my @added; my $in_hunk; -# Some scripts may not realize that SIGPIPE is being ignored when launching the -# pager--for instance scripts written in Python. -$SIG{PIPE} = 'DEFAULT'; +our $line_cb = sub { print @_ }; +our $flush_cb = sub { local $| = 1 }; + +sub handle_line { + local $_ = shift; -while (<>) { if (!$in_hunk) { - print; + $line_cb->($_); $in_hunk = /^$GRAPH*$COLOR*\@\@ /; } elsif (/^$GRAPH*$COLOR*-/) { @@ -49,7 +50,7 @@ while (<>) { @removed = (); @added = (); - print; + $line_cb->($_); $in_hunk = /^$GRAPH*$COLOR*[\@ ]/; } @@ -62,15 +63,22 @@ while (<>) { # place to flush. Flushing on a blank line is a heuristic that # happens to match git-log output. if (!length) { - local $| = 1; + $flush_cb->(); } } -# Flush any queued hunk (this can happen when there is no trailing context in -# the final diff of the input). -show_hunk(\@removed, \@added); +sub flush { + # Flush any queued hunk (this can happen when there is no trailing + # context in the final diff of the input). + show_hunk(\@removed, \@added); +} -exit 0; +sub highlight_stdin { + while () { + handle_line($_); + } + flush(); +} # Ideally we would feed the default as a human-readable color to # git-config as the fallback value. But diff-highlight does @@ -88,7 +96,7 @@ sub show_hunk { # If one side is empty, then there is nothing to compare or highlight. if (!@$a || !@$b) { - print @$a, @$b; + $line_cb->(@$a, @$b); return; } @@ -97,17 +105,17 @@ sub show_hunk { # stupid, and only handle multi-line hunks that remove and add the same # number of lines. if (@$a != @$b) { - print @$a, @$b; + $line_cb->(@$a, @$b); return; } my @queue; for (my $i = 0; $i < @$a; $i++) { my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]); - print $rm; + $line_cb->($rm); push @queue, $add; } - print @queue; + $line_cb->(@queue); } sub highlight_pair { diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile index 901872452..fbf5c5824 100644 --- a/contrib/diff-highlight/Makefile +++ b/contrib/diff-highlight/Makefile @@ -1,5 +1,20 @@ -# nothing to build
Re: [PATCH v1] Configure Git contribution guidelines for github.com
Ævar Arnfjörð Bjarmason writes: > There are things we get out of this that would regress if > submitGit were changed this way: > > * Now when you submit a pull request you get a Travis build for > git/git, I don't get this if I push to any random branch in my > avar/git, and although I could probably set it up, it's extra work > etc. Thanks for pointing this out. I agree that this indeed is a downside. > * I like being able to "git fetch" patches I'm reviewing. Now I can > just set "+refs/pull/*:refs/remotes/origin/pull/*" in the refspec for > git/git, if it were pulling from target repos I'd need to "git remote > add" for each one, not a big deal, but less convenient. > > * There would be no single place to list submitGit requests, which > you can do now through the pull page, although I think this is largely > stale now because nothing auto-closes them if they're merged (by which > point they'll have different sha1s), but that could be improved with > some bot... I do not think these two are 'regressions', unless your definition of 'regression' is a "this thing I used to be able to do, now I no longer can", which is slightly different from mine, which is "this useful thing I used to be able to do, now I no longer can". It would be useful to be able to do the above two things, if the list of submitGit requests and what you can get from pull/* hierarchy (1) covered most of the changes proposed, allowing people to check only this place and nowhere else, and/or (2) covered the more interesting changes that are worth looking at than other proposed changes. I do not think either is the case. The submitGit mechanism being an easier way to propose a change to the mailing list, by definition, (1) will not hold true. And among the changes proposed to be made to Git, the "selection criteria" to be in the set to be discoverable by going to github.com/git/git.git and checking the submitGit pull requests is not that they are of higher quality or touch interesting topics. The only common trait these proposed changes share, compared to other proposed changes we see on the mailing list, are that they were originally made as pull requests and (2) will not hold true. Another thing that may regress that you did not mention is that we would lose a convenient way to _count_ proposed changes coming via submitGit (i.e. you can simply go to the pull-request page), so that the number can be compared with the number of proposed changes directly made on the mailing list, which would be a good way to gauge how submitGit service is helping our community. But even for that, you'd need to go to the list to find the denominator (i.e. total number of changes proposed), and by the time you do that, counting the numerator (i.e. those come via submitGit) by finding the telltale sign submitGit leaves in its output among these denominator messages should be trivial. So in all, I see the only downside is the loss of automated triggering of Travis. But I do agree with you that it is a rather significant downside. > There's probably ways to do this without git/git pull requests, but I > don't see what problem would be solved by moving this off git/git, > even if there's open requests submitGit is the only thing using these, > and any confusion about that pull UI would remain if it wasn't (AFAIK > there's no way to completely disable pull requests on github, but I > may be wrong). Hopefully the pull-request template update Lars proposes will keep the pull request useful, in that they create one and then have submitGit relay it to the official channel.
Re: [PATCH v2 1/2] git-compat-util: add a FREEZ() wrapper around free(ptr); ptr = NULL
Jeff King writes: > On Sat, Jun 10, 2017 at 03:21:43AM +, Eric Wong wrote: > >> > So make Jonathan's freez_impl a public API and rename it to >> > free_and_null(), perhaps? >> >> Perhaps... I think it needs to take "void *" to avoid warnings: >> >> static inline void free_and_null(void *ptrptr) >> { >> void **tmp = ptrptr; >> >> free(*tmp); >> *tmp = NULL; >> } > > That unfortunately makes it very easy to get it wrong in the callers. > Both: > > free_and_null(&p); > > and > > free_and_null(p); > > would be accepted by the compiler, but one of them causes undefined > behavior. > > Unfortunately using "void **" in the declaration doesn't work, because > C's implicit casting rules don't apply to pointer-to-pointer types. All true. I still think the macro FREEZ() is too confusing a name to live; perhaps we can take Ævar's patch with s/FREEZ/FREE_AND_NULL/ and be done with it? By spelling it in all caps, readers will know that there is something special going on in that macro, and Eric's "forcing the readers to type & in front to let them be aware that the ptr variable is being manipulated" may become less necessary.
Re: [PATCH v2 2/4] sha1_file: move delta base cache code up
Jonathan Tan writes: > In a subsequent patch, packed_object_info() will be modified to use the > delta base cache, so move the relevant code to before > packed_object_info(). > > Signed-off-by: Jonathan Tan > --- > sha1_file.c | 226 > +++- > 1 file changed, 116 insertions(+), 110 deletions(-) Hmph, is this meant to be just moving two whole functions? > diff --git a/sha1_file.c b/sha1_file.c > index a52b27541..a158907d1 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -2239,116 +2239,6 @@ static enum ... > ... > -int packed_object_info(struct packed_git *p, off_t obj_offset, > -struct object_info *oi) > -{ > -... > - if (oi->delta_base_sha1) { > - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { > - const unsigned char *base; > - > - base = get_delta_base_sha1(p, &w_curs, curpos, > -type, obj_offset); > - if (!base) { > - type = OBJ_BAD; > - goto out; > - } > - > - hashcpy(oi->delta_base_sha1, base); > - } else > - hashclr(oi->delta_base_sha1); > - } > - > -out: > - unuse_pack(&w_curs); > - return type; > -} > -... The above is what was removed, while ... > @@ -2486,6 +2376,122 @@ static void ... > ... > +int packed_object_info(struct packed_git *p, off_t obj_offset, > +struct object_info *oi) > +{ > +... > + if (oi->delta_base_sha1) { > + if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { > + const unsigned char *base; > + > + base = get_delta_base_sha1(p, &w_curs, curpos, > +type, obj_offset); > + if (!base) { > + type = OBJ_BAD; > + goto out; > + } > + > + hashcpy(oi->delta_base_sha1, base); > + } else > + hashclr(oi->delta_base_sha1); > + } > + > + oi->whence = OI_PACKED; > + oi->u.packed.offset = obj_offset; > + oi->u.packed.pack = p; > + oi->u.packed.is_delta = (type == OBJ_REF_DELTA || > + type == OBJ_OFS_DELTA); > + > +out: > + unuse_pack(&w_curs); > + return type; > +} ... we somehow gained code to update *oi that used to be (and still is) done by its sole caller, sha1_object_info_extended(). Perhaps this is a rebase-gotcha?
Re: [PATCH v2 1/2] git-compat-util: add a FREEZ() wrapper around free(ptr); ptr = NULL
On Thu, Jun 15, 2017 at 6:48 PM, Junio C Hamano wrote: > Jeff King writes: > >> On Sat, Jun 10, 2017 at 03:21:43AM +, Eric Wong wrote: >> >>> > So make Jonathan's freez_impl a public API and rename it to >>> > free_and_null(), perhaps? >>> >>> Perhaps... I think it needs to take "void *" to avoid warnings: >>> >>> static inline void free_and_null(void *ptrptr) >>> { >>> void **tmp = ptrptr; >>> >>> free(*tmp); >>> *tmp = NULL; >>> } >> >> That unfortunately makes it very easy to get it wrong in the callers. >> Both: >> >> free_and_null(&p); >> >> and >> >> free_and_null(p); >> >> would be accepted by the compiler, but one of them causes undefined >> behavior. >> >> Unfortunately using "void **" in the declaration doesn't work, because >> C's implicit casting rules don't apply to pointer-to-pointer types. > > All true. > > I still think the macro FREEZ() is too confusing a name to live; > perhaps we can take Ævar's patch with s/FREEZ/FREE_AND_NULL/ and be > done with it? By spelling it in all caps, readers will know that > there is something special going on in that macro, and Eric's > "forcing the readers to type & in front to let them be aware that > the ptr variable is being manipulated" may become less necessary. I'll change it to FREE_AND_NULL and submit my patch as-is, my reading of the rest of this thread is that making it a function instead of a macro would be interesting, but has its own caveats that are likely better considered as part of its own series, whereas this just changes existing code to its macro-expanded functional equivalent.
Re: [PATCH v1] Configure Git contribution guidelines for github.com
Am 15.06.2017 um 18:43 schrieb Junio C Hamano: > Another thing that may regress that you did not mention is that we > would lose a convenient way to _count_ proposed changes coming via > submitGit (i.e. you can simply go to the pull-request page), so that > the number can be compared with the number of proposed changes > directly made on the mailing list, which would be a good way to > gauge how submitGit service is helping our community. But even for > that, you'd need to go to the list to find the denominator > (i.e. total number of changes proposed), and by the time you do > that, counting the numerator (i.e. those come via submitGit) by > finding the telltale sign submitGit leaves in its output among these > denominator messages should be trivial. This numbers can be aquired quite easily if submitGit adds a defined trailer to the converted commit message like this: Signed-off-by: Foo Bar Submit-git-id: url-or-id-of-pr
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
On 06/15, Johannes Schindelin wrote: > Hi, > > I thought it better to revive this old thread rather than start a new > thread, so as to automatically reach everybody who chimed in originally. > > On Mon, 6 Mar 2017, Brandon Williams wrote: > > > On 03/06, brian m. carlson wrote: > > > > > On Sat, Mar 04, 2017 at 06:35:38PM -0800, Linus Torvalds wrote: > > > > > > > Btw, I do think the particular choice of hash should still be on the > > > > table. sha-256 may be the obvious first choice, but there are > > > > definitely a few reasons to consider alternatives, especially if > > > > it's a complete switch-over like this. > > > > > > > > One is large-file behavior - a parallel (or tree) mode could improve > > > > on that noticeably. BLAKE2 does have special support for that, for > > > > example. And SHA-256 does have known attacks compared to SHA-3-256 > > > > or BLAKE2 - whether that is due to age or due to more effort, I > > > > can't really judge. But if we're switching away from SHA1 due to > > > > known attacks, it does feel like we should be careful. > > > > > > I agree with Linus on this. SHA-256 is the slowest option, and it's > > > the one with the most advanced cryptanalysis. SHA-3-256 is faster on > > > 64-bit machines (which, as we've seen on the list, is the overwhelming > > > majority of machines using Git), and even BLAKE2b-256 is stronger. > > > > > > Doing this all over again in another couple years should also be a > > > non-goal. > > > > I agree that when we decide to move to a new algorithm that we should > > select one which we plan on using for as long as possible (much longer > > than a couple years). While writing the document we simply used > > "sha256" because it was more tangible and easier to reference. > > The SHA-1 transition *requires* a knob telling Git that the current > repository uses a hash function different from SHA-1. > > It would make *a whole of a lot of sense* to make that knob *not* Boolean, > but to specify *which* hash function is in use. 100% agree on this point. I believe the current plan is to have the hashing function used for a repository be a repository format extension which would be a value (most likely a string like 'sha1', 'sha256', 'black2', etc) stored in a repository's .git/config. This way, upon startup git will die or ignore a repository which uses a hashing function which it does not recognize or does not compiled to handle. I hope (and expect) that the end produce of this transition is a nice, clean hashing API and interface with sufficient abstractions such that if I wanted to switch to a different hashing function I would just need to implement the interface with the new hashing function and ensure that 'verify_repository_format' allows the new function. > > That way, it will be easier to switch another time when it becomes > necessary. > > And it will also make it easier for interested parties to use a different > hash function in their infrastructure if they want. > > And it lifts part of that burden that we have to consider *very carefully* > which function to pick. We still should be more careful than in 2005, when > Git was born, and when, incidentally, when the first attacks on SHA-1 > became known, of course. We were just lucky for almost 12 years. > > Now, with Dunning-Kruger in mind, I feel that my degree in mathematics > equips me with *just enough* competence to know just how little *even I* > know about cryptography. > > The smart thing to do, hence, was to get involved in this discussion and > act as Lt Tawney Madison between us Git developers and experts in > cryptography. > > It just so happens that I work at a company with access to excellent > cryptographers, and as we own the largest Git repository on the planet, we > have a vested interest in ensuring Git's continued success. > > After a couple of conversations with a couple of experts who I cannot > thank enough for their time and patience, let alone their knowledge about > this matter, it would appear that we may not have had a complete enough > picture yet to even start to make the decision on the hash function to > use. > -- Brandon Williams
Re: git diff sometimes brings up buggy pager
Any chance you can tell us what repo this happens on? + git version, OS, and what the triggering diff invocation is. On Thu, Jun 15, 2017 at 12:19 PM, Matthew Groth wrote: > When I do `git diff` sometimes I get this: > > > ...skipping... > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ...skipping... > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ...skipping... > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > ~ > > > it goes on like this for about 10 times the length. Looks like this > happens exclusively when I use git diff with a github remote that is at the > same commit. I will update if I find any other case where this happens. > > > >
Re: [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns
Jonathan Tan writes: > Looking at the 3 primary functions (sha1_object_info_extended, > read_object, has_sha1_file_with_flags), they independently implement > mechanisms such as object replacement, retrying the packed store after > failing to find the object in the packed store then the loose store, and > being able to mark a packed object as bad and then retrying the whole > process. Consolidating these mechanisms would be a great help to > maintainability. > > Therefore, consolidate all 3 functions by extending > sha1_object_info_extended() to support the functionality needed by all 3 > functions, and then modifying the other 2 to use > sha1_object_info_extended(). This is a rather "ugly" looking patch ;-) but I followed what has_sha1_file_with_flags() and read_object() do before and after this change, and I think this patch is a no-op wrt their behaviour (which is a good thing). But I have a very mixed feelings on one aspect of the resulting sha1_object_info_extended(). > int sha1_object_info_extended(const unsigned char *sha1, struct object_info > *oi, unsigned flags) > { > ... > if (!find_pack_entry(real, &e)) { > /* Most likely it's a loose object. */ > - if (!sha1_loose_object_info(real, oi, flags)) { > + if (oi && !sha1_loose_object_info(real, oi, flags)) { > oi->whence = OI_LOOSE; > return 0; > } > + if (!oi && has_loose_object(real)) > + return 0; This conversion is not incorrect per-se. We can see that has_sha1_file_with_flags() after this patch still calls has_loose_object(). But it bothers me that there is no hint to future developers to warn that a rewrite of the above like this is incorrect: if (!find_pack_entry(read, &e)) { /* Most likely it's a loose object. */ +struct object_info dummy_oi; +if (!oi) { +memset(&dummy_oi, 0, sizeof(dummy_oi); +oi = &dummy_oi; +} -if (oi && !sha1_loose_object_info(real, oi, flags)) { +if (!sha1_loose_object_info(real, oi, flags)) { oi->whence = OI_LOOSE; return 0; } -if (!oi && has_loose_object(real)) -return 0; It used to be very easy to see that has_sha1_file_with_flags() will call has_loose_object() when it does not find the object in a pack, which will result in the loose object file freshened. In the new code, it is very subtle to see that---it will happen when the caller passes oi == NULL, and has_sha1_file_with_flags() is such a caller, but it is unclear if there are other callers of this "consolidated" sha1_object_info_extended() that passes oi == NULL, and if they do also want to freshen the loose object file when they do so. > @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1) > > int has_sha1_file_with_flags(const unsigned char *sha1, int flags) > { > - struct pack_entry e; > + int f = OBJECT_INFO_SKIP_CACHED | > + ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0); > > if (!startup_info->have_repository) > return 0; > - if (find_pack_entry(sha1, &e)) > - return 1; > - if (has_loose_object(sha1)) > - return 1; > - if (flags & HAS_SHA1_QUICK) > - return 0; > - reprepare_packed_git(); > - return find_pack_entry(sha1, &e); > + return !sha1_object_info_extended(sha1, NULL, f); > } I would have preferred to see the new variable not to be called 'f', as that makes it unclear what it is (is that a callback function pointer?). In this case, uyou are forcing the flag bits passed down, so perhaps you can reuse the same variable? If you allocated a separate variable because has_sha1_file_with_flags() and sha1_object_info_extended() take flag bits from two separate vocabularies, that is a valid reasoning, but if that is the case, then I would have named 'f' to reflect that fact that this is different from parameter 'flag' that is defined in the has_sha1_file_with_flags() world, but a different thing that is defined in sha1_object_info_extended() world, e.g. "soie_flag" or something like that. Thanks.
Re: [PATCH v1] Configure Git contribution guidelines for github.com
Andreas Heiduk writes: > Am 15.06.2017 um 18:43 schrieb Junio C Hamano: >> Another thing that may regress that you did not mention is that we >> would lose a convenient way to _count_ proposed changes coming via >> submitGit (i.e. you can simply go to the pull-request page), so that >> the number can be compared with the number of proposed changes >> directly made on the mailing list, which would be a good way to >> gauge how submitGit service is helping our community. But even for >> that, you'd need to go to the list to find the denominator >> (i.e. total number of changes proposed), and by the time you do >> that, counting the numerator (i.e. those come via submitGit) by >> finding the telltale sign submitGit leaves in its output among these >> denominator messages should be trivial. > > This numbers can be aquired quite easily if submitGit adds a defined > trailer to the converted commit message like this: > > Signed-off-by: Foo Bar > Submit-git-id: url-or-id-of-pr I do not think you would want the noise _in_ the log message. The "telltale sign" I had in mind was these "signature" lines at the end of the message: -- https://github.com/git/git/pull/538
Re: [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns
On Thu, 15 Jun 2017 10:50:46 -0700 Junio C Hamano wrote: > Jonathan Tan writes: > > > Looking at the 3 primary functions (sha1_object_info_extended, > > read_object, has_sha1_file_with_flags), they independently implement > > mechanisms such as object replacement, retrying the packed store after > > failing to find the object in the packed store then the loose store, and > > being able to mark a packed object as bad and then retrying the whole > > process. Consolidating these mechanisms would be a great help to > > maintainability. > > > > Therefore, consolidate all 3 functions by extending > > sha1_object_info_extended() to support the functionality needed by all 3 > > functions, and then modifying the other 2 to use > > sha1_object_info_extended(). > > This is a rather "ugly" looking patch ;-) but I followed what > has_sha1_file_with_flags() and read_object() do before and after > this change, and I think this patch is a no-op wrt their behaviour > (which is a good thing). > > But I have a very mixed feelings on one aspect of the resulting > sha1_object_info_extended(). > > > int sha1_object_info_extended(const unsigned char *sha1, struct > > object_info *oi, unsigned flags) > > { > > ... > > if (!find_pack_entry(real, &e)) { > > /* Most likely it's a loose object. */ > > - if (!sha1_loose_object_info(real, oi, flags)) { > > + if (oi && !sha1_loose_object_info(real, oi, flags)) { > > oi->whence = OI_LOOSE; > > return 0; > > } > > + if (!oi && has_loose_object(real)) > > + return 0; > > This conversion is not incorrect per-se. > > We can see that has_sha1_file_with_flags() after this patch still > calls has_loose_object(). But it bothers me that there is no hint > to future developers to warn that a rewrite of the above like this > is incorrect: > > if (!find_pack_entry(read, &e)) { > /* Most likely it's a loose object. */ >+struct object_info dummy_oi; >+if (!oi) { >+memset(&dummy_oi, 0, sizeof(dummy_oi); >+oi = &dummy_oi; >+} >-if (oi && !sha1_loose_object_info(real, oi, flags)) { >+if (!sha1_loose_object_info(real, oi, flags)) { > oi->whence = OI_LOOSE; > return 0; > } >-if (!oi && has_loose_object(real)) >-return 0; > > It used to be very easy to see that has_sha1_file_with_flags() will > call has_loose_object() when it does not find the object in a pack, > which will result in the loose object file freshened. In the new > code, it is very subtle to see that---it will happen when the caller > passes oi == NULL, and has_sha1_file_with_flags() is such a caller, > but it is unclear if there are other callers of this "consolidated" > sha1_object_info_extended() that passes oi == NULL, and if they do > also want to freshen the loose object file when they do so. Good point - sorry, I didn't pay much attention to the freshening behavior. After some thought, I now think it might be better to avoid modifying has_sha1_file_with_flags(). As it is, sha1_object_info_extended() already needs special handling (special flags and handling the possibility of "oi" being NULL) to handle the functionality required by has_sha1_file_with_flags(); adding yet another thing to handle (freshen or not) would make it much too complicated. This means that subsequent patches that modify the handling of storage-agnostic objects would still need to modify 2 functions, but at least that is fewer than the current 3. I'll reroll with these changes so that you (and others) can see what it looks like. > > @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1) > > > > int has_sha1_file_with_flags(const unsigned char *sha1, int flags) > > { > > - struct pack_entry e; > > + int f = OBJECT_INFO_SKIP_CACHED | > > + ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0); > > > > if (!startup_info->have_repository) > > return 0; > > - if (find_pack_entry(sha1, &e)) > > - return 1; > > - if (has_loose_object(sha1)) > > - return 1; > > - if (flags & HAS_SHA1_QUICK) > > - return 0; > > - reprepare_packed_git(); > > - return find_pack_entry(sha1, &e); > > + return !sha1_object_info_extended(sha1, NULL, f); > > } > > I would have preferred to see the new variable not to be called 'f', > as that makes it unclear what it is (is that a callback function > pointer?). In this case, uyou are forcing the flag bits passed > down, so perhaps you can reuse the same variable? > > If you allocated a separate variable because > has_sha1_file_with_flags() and sha1_object_info_extended() take flag > bits from two separate vocabularies, that is a valid reasoning, but > if that is the cas
Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support
Jonathan Tan writes: > diff --git a/sha1_file.c b/sha1_file.c > index 98086e21e..75fe2174d 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -27,6 +27,9 @@ > #include "list.h" > #include "mergesort.h" > #include "quote.h" > +#include "iterator.h" > +#include "dir-iterator.h" > +#include "sha1-lookup.h" > > #define SZ_FMT PRIuMAX > static inline uintmax_t sz_fmt(size_t s) { return s; } > @@ -1624,6 +1627,72 @@ static const struct packed_git > *has_packed_and_bad(const unsigned char *sha1) > return NULL; > } > > +struct missing_blob_manifest { > + struct missing_blob_manifest *next; > + const char *data; > +}; > +struct missing_blob_manifest *missing_blobs; > +int missing_blobs_initialized; I do not think you meant to make these non-static. The type of the former is not even visible to the outside world, and the latter is something that could be made into static to prepare_missing_blobs() function (unless and until you start allowing the missing-blobs manifest to be re-initialized). Your ensure_configured() below seems to do the "static" right, on the other hand ;-). Do we expect that we will have only a handful of these missing blob manifests? Each manifest seems to be efficiently looked-up with a binary search, but it makes me wonder if it is a good idea to consolidate these manifests into a single list of object names to eliminate the outer loop in has_missing_blob(). Unlike pack .idx files that must stay one-to-one with .pack files, it appears to me that there is no reason why we need to keep multiple ones separate for extended period of time (e.g. whenever we learn that we receieved an incomplete pack from the other side with a list of newly missing blobs, we could incorporate that into existing missing blob list). > +int has_missing_blob(const unsigned char *sha1, unsigned long *size) > +{ This function that answers "is it expected to be missing?" is confusingly named. Is it missing, or does it exist? > @@ -2981,11 +3050,55 @@ static int sha1_loose_object_info(const unsigned char > *sha1, > return (status < 0) ? status : 0; > } > > +static char *missing_blob_command; > +static int sha1_file_config(const char *conf_key, const char *value, void > *cb) > +{ > + if (!strcmp(conf_key, "core.missingblobcommand")) { > + missing_blob_command = xstrdup(value); > + } > + return 0; > +} > + > +static int configured; > +static void ensure_configured(void) > +{ > + if (configured) > + return; Do not be selfish and pretend that this is the _only_ kind of configuration that needs to be done inside sha1_file.c. Call the function ensure__is_configured() and rename the run-once guard to match. The run-once guard can be made static to the "ensure" function, and if you do so, then its name can stay to be "configured", as at that point it is clear what it is guarding. > diff --git a/t/t3907-missing-blob.sh b/t/t3907-missing-blob.sh > new file mode 100755 > index 0..e0ce0942d > --- /dev/null > +++ b/t/t3907-missing-blob.sh > @@ -0,0 +1,69 @@ > +#!/bin/sh > + > +test_description='core.missingblobcommand option' > + > +. ./test-lib.sh > + > +pack() { Style: "pack () {" > + perl -e '$/ = undef; $input = <>; print pack("H*", $input)' high-nybble first to match ntohll() done in has_missing_blob()? OK. > +} > + > +test_expect_success 'sha1_object_info_extended and read_sha1_file (through > git cat-file -p)' ' > + rm -rf server client && > + > + git init server && > + test_commit -C server 1 && > + test_config -C server uploadpack.allowanysha1inwant 1 && > + HASH=$(git hash-object server/1.t) && > + > + git init client && > + test_config -C client core.missingblobcommand \ > + "git -C \"$(pwd)/server\" pack-objects --stdout | git > unpack-objects" && > + > + # does not work if missing blob is not registered > + test_must_fail git -C client cat-file -p "$HASH" && > + > + mkdir -p client/.git/objects/missing && > + printf "%016x%s%016x" 1 "$HASH" "$(wc -c + pack >client/.git/objects/missing/x && > + > + # works when missing blob is registered > + git -C client cat-file -p "$HASH" > +' OK, by passing printf '%016x', implementations of "$(wc -c)" that gives extra whitespace around its output can still work correctly. Good.
Re: [BUG] add_again() off-by-one error in custom format
René Scharfe writes: > Am 13.06.2017 um 23:20 schrieb Junio C Hamano: > >> I think the real question is how likely people use more than one >> occurrence of the same thing in their custom format, and how deeply >> they care that --format='%h %h' costs more than --format='%h'. The >> cost won't of course be double (because the main traversal costs >> without any output), but it would be rather unreasonable to expect >> that --format='%h %h %h %h %h' to cost the same as --format='%h'; >> after all, Git is doing more for them ;-) > > The answer to the first half is obviously "very likely" -- otherwise > this bug wouldn't have been found, right? :) Not really. There was only one (this one) after all these years. The question we are asking is not "very rarely this is used and we can afford to leave it broken?" It is "very rarely this is used and we can afford not to optimize for that rare use case?". > Regarding the question of how bad a 50% slowdown for a second %h > would be: No idea. If ran interactively it may not even be noticeable > because the user can read the first few lines in less while the rest > is prepared in the background. We don't have a perf test for formats > with duplicate short hashes, so we don't promise anything, right? :) OK. > -- >8 -- > Subject: [PATCH] pretty: recalculate duplicate short hashes > > b9c6232138 (--format=pretty: avoid calculating expensive expansions > twice) optimized adding short hashes multiple times by using the > fact that the output strbuf was only ever simply appended to and > copying the added string from the previous run. That prerequisite > is no longer given; we now have modfiers like %< and %+ that can > cause the cache to lose track of the correct offsets. Remove it. > > Reported-by: Michael Giuffrida > Signed-off-by: Rene Scharfe > --- > I'm sending this out in the hope that there might be a simple way > to fix it after all, like Gábor's patch does for %+. %< and %> > seem to be the only other problematic modifiers for now -- I'm > actually surprised that %w seems to be OK. Thanks, this looks like a sensible first step. Will queue.
[ANNOUNCE] Git for Windows 2.13.1(2)
Dear Git users, It is my pleasure to announce that Git for Windows 2.13.1(2) is available from: https://git-for-windows.github.io/ Changes since Git for Windows v2.13.1 (June 13th 2017) Bug Fixes * git commit and git status no longer randomly throw segmentation faults. Filename | SHA-256 | --- Git-2.13.1.2-64-bit.exe | cd11e57bd25c4d8fde0a7568d19bf3fc6418dd23080901414309b144e2bf0b32 Git-2.13.1.2-32-bit.exe | 5eb854b666a77a2efc0119fc144cbba1e01a716c542f4259af1dbd4323d68fe9 PortableGit-2.13.1.2-64-bit.7z.exe | 2c98f6cab688d585d68896c8954e4849c70b33a34f8b5b6009d2ba56ddd95c43 PortableGit-2.13.1.2-32-bit.7z.exe | 7eeccb6aa3aa294a05538a913f465b9ddeb36160126caf709b378bb78630216b MinGit-2.13.1.2-64-bit.zip | 9d3d572f275ebf69ea14bb4abfda64af78c738d2db8a54ee1f9f9db7cdfadf74 MinGit-2.13.1.2-32-bit.zip | 4b643c986a8c2455cddd2338a3c892acf111d3833384e866410785f9ea073a1a Git-2.13.1.2-64-bit.tar.bz2 | 6fc4fa4903ff974f3960c4422269beeb3f02176029b69db0d6090986b21a9206 Git-2.13.1.2-32-bit.tar.bz2 | 9476b762c4eb007d82627e34b7b0fde6ddfae8c78f3b1d29518c68dd65f4a4e2 Ciao, Johannes
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
Hi Dscho, Johannes Schindelin wrote: > From what I read, pretty much everybody who participated in the discussion > was aware that the essential question is: performance vs security. I don't completely agree with this framing. The essential question is: how to get the right security properties without abysmal performance. > It turns out that we can have essentially both. > > SHA-256 is most likely the best-studied hash function we currently know [... etc ...] Thanks for a thoughtful restart to the discussion. This is much more concrete than your previous objections about process, and that is very helpful. In the interest of transparency: here are my current questions for cryptographers to whom I have forwarded this thread. Several of these questions involve predictions or opinions, so in my ideal world we'd want multiple, well reasoned answers to them. Please feel free to forward them to appropriate people or add more. 1. Now it sounds like SHA-512/256 is the safest choice (see also Mike Hommey's response to Dscho's message). Please poke holes in my understanding. 2. Would you be willing to weigh in publicly on the mailing list? I think that would be the most straightforward way to move this forward (and it would give you a chance to ask relevant questions, etc). Feel free to contact me privately if you have any questions about how this particular mailing list works. 3. On the speed side, Dscho states "SHA-256 will be faster than BLAKE (and even than BLAKE2) once the Intel and AMD CPUs with hardware support for SHA-256 become common." Do you agree? 4. On the security side, Dscho states "to compete in the SHA-3 contest, BLAKE added complexity so that it would be roughly on par with its competitors. To allow for faster execution in software, this complexity was *removed* from BLAKE to create BLAKE2, making it weaker than SHA-256." Putting aside the historical questions, do you agree with this "weaker than" claim? 5. On the security side, Dscho states, "The type of attacks Git has to worry about is very different from the length extension attacks, and it is highly unlikely that that weakness of SHA-256 leads to, say, a collision attack", and Jeff King states, "Git does not use the hash as a MAC, so length extension attacks aren't a thing (and even if we later wanted to use the same algorithm as a MAC, the HMAC construction is a well-studied technique for dealing with it)." Is this correct in spirit? Is SHA-256 equally strong to SHA-512/256 for Git's purposes, or are the increased bits of internal state (or other differences) relevant? How would you compare the two functions' properties? 6. On the speed side, Jeff King states "That said, SHA-512 is typically a little faster than SHA-256 on 64-bit platforms. I don't know if that will change with the advent of hardware instructions oriented towards SHA-256." Thoughts? 7. If the answer to (2) is "no", do I have permission to quote or paraphrase your replies that were given here? Thanks, sincerely, Jonathan
Re: [PATCH] diff-highlight: split code into module
Jeff King writes: > The diff-so-fancy project is also written in perl, and most > of its users pipe diffs through both diff-highlight and > diff-so-fancy. It would be nice if this could be done in a > single script. So let's pull most of diff-highlight's code > into its own module which can be used by diff-so-fancy. > > In addition, we'll abstract a few basic items like reading > from stdio so that a script using the module can do more > processing before or after diff-highlight handles the lines. > See the README update for more details. > > One small downside is that the diff-highlight script must > now be built using the Makefile. There are ways around this, > but it quickly gets into perl arcana. Let's go with the > simple solution. As a bonus, our Makefile now respects the > PERL_PATH variable if it is set. > > Signed-off-by: Jeff King > --- > Scott and I discussed this off-list, and this was the least-gross > solution I came up with. The plan would be for diff-so-fancy to pull in > this copy of diff-highlight from git.git and have a wrapper script > similar to the diff-highlight.perl found here. > > contrib/diff-highlight/.gitignore | 2 ++ > .../{diff-highlight => DiffHighlight.pm} | 40 > +- > contrib/diff-highlight/Makefile| 21 ++-- > contrib/diff-highlight/README | 30 > contrib/diff-highlight/diff-highlight.perl | 8 + > 5 files changed, 82 insertions(+), 19 deletions(-) > create mode 100644 contrib/diff-highlight/.gitignore > rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%) > mode change 100755 => 100644 > create mode 100644 contrib/diff-highlight/diff-highlight.perl Do you want +x bit for the last one? I could throw that bit in while queuing if you want. Thanks.
Re: [PATCH] diff-highlight: split code into module
Junio C Hamano writes: >> contrib/diff-highlight/.gitignore | 2 ++ >> .../{diff-highlight => DiffHighlight.pm} | 40 >> +- >> contrib/diff-highlight/Makefile| 21 ++-- >> contrib/diff-highlight/README | 30 >> contrib/diff-highlight/diff-highlight.perl | 8 + >> 5 files changed, 82 insertions(+), 19 deletions(-) >> create mode 100644 contrib/diff-highlight/.gitignore >> rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%) >> mode change 100755 => 100644 >> create mode 100644 contrib/diff-highlight/diff-highlight.perl > > Do you want +x bit for the last one? I could throw that bit in > while queuing if you want. Ah, I do not think you want it, as this is not something to be executed as-is (it is a source file, which diff-highlight that is executable is made out of).
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
Brandon Williams writes: >> It would make a whole of a lot of sense to make that knob not Boolean, >> but to specify which hash function is in use. > > 100% agree on this point. I believe the current plan is to have the > hashing function used for a repository be a repository format extension > which would be a value (most likely a string like 'sha1', 'sha256', > 'black2', etc) stored in a repository's .git/config. This way, upon > startup git will die or ignore a repository which uses a hashing > function which it does not recognize or does not compiled to handle. > > I hope (and expect) that the end produce of this transition is a nice, > clean hashing API and interface with sufficient abstractions such that > if I wanted to switch to a different hashing function I would just need > to implement the interface with the new hashing function and ensure that > 'verify_repository_format' allows the new function. Yup. I thought that part has already been agreed upon, but it is a good thing that somebody is writing it down (perhaps "again", if not "for the first time"). Thanks.
Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
Hi, On Thu, 15 Jun 2017, Ævar Arnfjörð Bjarmason wrote: > On Thu, Jun 15 2017, Jeff King jotted: > > > On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote: > > > >> On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote: > >> > >> > Footnote *1*: SHA-256, as all hash functions whose output is > >> > essentially the entire internal state, are susceptible to a > >> > so-called "length extension attack", where the hash of a > >> > secret+message can be used to generate the hash of > >> > secret+message+piggyback without knowing the secret. This is not > >> > the case for Git: only visible data are hashed. The type of attacks > >> > Git has to worry about is very different from the length extension > >> > attacks, and it is highly unlikely that that weakness of SHA-256 > >> > leads to, say, a collision attack. > >> > >> What do the experts think or SHA512/256, which completely removes the > >> concerns over length extension attack? (which I'd argue is better than > >> sweeping them under the carpet) > > > > I don't think it's sweeping them under the carpet. Git does not use the > > hash as a MAC, so length extension attacks aren't a thing (and even if > > we later wanted to use the same algorithm as a MAC, the HMAC > > construction is a well-studied technique for dealing with it). I really tried to drive that point home, as it had been made very clear to me that the length extension attack is something that Git need not concern itself. The length extension attack *only* comes into play when there are secrets that are hashed. In that case, one would not want others to be able to produce a valid hash *without* knowing the secrets. And SHA-256 allows to "reconstruct" the internal state (which is the hash value) in order to continue at any point, i.e. if the hash for secret+message is known, it is easy to calculate the hash for secret+message+addition, without knowing the secret at all. That is exactly *not* the case with Git. In Git, what we want to hash is known in its entirety. If the hash value were not identical to the internal state, it would be easy enough to reconstruct, because *there are no secrets*. So please understand that even the direction that the length extension attack takes is completely different than the direction any attack would have to take that weakens SHA-256 for Git's purposes. As far as Git's usage is concerned, SHA-256 has no known weaknesses. It is *really, really, really* important to understand this before going on to suggest another hash function such as SHA-512/256 (i.e. SHA-512 truncated to 256 bits), based only on that perceived weakness of SHA-256. > > That said, SHA-512 is typically a little faster than SHA-256 on 64-bit > > platforms. I don't know if that will change with the advent of > > hardware instructions oriented towards SHA-256. > > Quoting my own > cacbzzx7jra2niwt9wsgaxnzs+gws8htugzwm8nay1gs87o8...@mail.gmail.com sent > ~2 weeks ago to the list: > > On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder > wrote: > [...] > > 4. When choosing a hash function, people may argue about performance. > >It would be useful for run some benchmarks for git (running > >the test suite, t/perf tests, etc) using a variety of hash > >functions as input to such a discussion. > > To the extent that such benchmarks matter, it seems prudent to heavily > weigh them in favor of whatever seems to be likely to be the more > common hash function going forward, since those are likely to get > faster through future hardware acceleration. > > E.g. Intel announced Goldmont last year which according to one SHA-1 > implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They > only have acceleration for SHA-1 and SHA-256[2] > > 1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385 > > 2. https://en.wikipedia.org/wiki/Goldmont > > Maybe someone else knows of better numbers / benchmarks, but such a > reduction in CBP likely makes it faster than SHA-512. Very, very likely faster than SHA-512. I'd like to stress explicitly that the Intel SHA extensions do *not* cover SHA-512: https://en.wikipedia.org/wiki/Intel_SHA_extensions In other words, once those extensions become commonplace, SHA-256 will be faster than SHA-512, hands down. Ciao, Dscho
Re: [PATCH v4 6/6] Use the early config machinery to expand aliases
Johannes Schindelin writes: > +struct config_alias_data > +{ Style: "struct config_alias_data {" which I can tweak while applying. Other than that, everything looks good. Thanks.
Re: [PATCH v4 6/6] Use the early config machinery to expand aliases
Hi Junio, On Thu, 15 Jun 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > +struct config_alias_data > > +{ > > Style: "struct config_alias_data {" > > which I can tweak while applying. Please do. > Other than that, everything looks good. Thanks, Dscho
Re: [PATCH v3 0/6] config.h
Brandon Williams writes: > Changes in v3: > > * tweaked the discover_git_directory function's API based on Peff's feedback > * reordered the last three patches so that they flowed a bit better > * renamed 'git_config_with_options' > * rebased ontop of v4 of Dscho's alias series > > https://public-inbox.org/git/cover.1497440104.git.johannes.schinde...@gmx.de/ Applying this series was messier than necessary, I'd have to say, as this series would not apply cleanly on top of the result of applying Dscho's v4 on top of the same base as Dscho's v3 was applied (which is v2.13.0). It applied cleanly only when Dscho's v4 and then this series were applied on top of 02a2850a ("Sync with maint", 2017-06-13), which is a much newer commit than v2.13.0. Which in turn means these two fixes cannot be merged to 'maint' as you two collaboratively prepared. I've applied Dscho's v4 on top of v2.13.0 (just like his v3 was queued in my tree on the same base), and then tweaked this series to apply on top of that, so that they can go to 'maint' if we chose to. But it is possible that during this unnecessary patch shuffling, I may have made some mistake, so please eyeball the resulting 12 patches carefully when they are pushed out. Thanks. >
Re: [PATCH] diff-highlight: split code into module
On 06/15/2017 12:15 PM, Junio C Hamano wrote: > Do you want +x bit for the last one? I could throw that bit in > while queuing if you want. > > Thanks. > Ya probably best.
Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes
On 6/2/2017 6:26 PM, Jeff King wrote: On Fri, Jun 02, 2017 at 12:38:45PM -0700, Jonathan Tan wrote: ... We have a name-hash cache extension in the bitmap file, but it doesn't carry enough information to deduce the .git-ness of a file. I don't think it would be too hard to add a "flags" extension, and give a single bit to "this is a .git file". I do also wonder if the two features would need to be separated for a GVFS-style solution. If you're not just avoiding large blobs but trying to get a narrow clone, you don't want the .git files from the uninteresting parts of the tree. You want to get no blobs at all, and then fault them in as they become relevant due to user action. -Peff I agree with Peff here. I've been working on my partial/narrow/sparse clone/fetch ideas since my original RFC and have come to the conclusion that the server can do the size limiting efficiently, but we should leave the pathname filtering to the client. That is, let the client get the commits and trees and then locally apply pattern matching, whether that be a sparse-checkout definition or simple ".git*" matching and then make a later request for the "blobs of interest". Whether we "fault-in" the missing blobs or have a "fetch-blobs" command (like fetch-pack) to get them is open to debate. There are concerns about the size of the requested blob-id list in a fetch-blobs approach, but I think there are ways to say I need all of the blobs referenced by the directory /foo in commit (and optionally, that aren't present in directory /foo in commit or in the range ..). (handwave) Jeff
Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support
A reroll is coming soon, but there is an interesting discussion point here so I'll reply to this e-mail first. On Thu, 15 Jun 2017 11:34:45 -0700 Junio C Hamano wrote: > Jonathan Tan writes: > > > +struct missing_blob_manifest { > > + struct missing_blob_manifest *next; > > + const char *data; > > +}; > > +struct missing_blob_manifest *missing_blobs; > > +int missing_blobs_initialized; > > I do not think you meant to make these non-static. The type of the > former is not even visible to the outside world, and the latter is > something that could be made into static to prepare_missing_blobs() > function (unless and until you start allowing the missing-blobs > manifest to be re-initialized). Your ensure_configured() below > seems to do the "static" right, on the other hand ;-). Good catch - done. > Do we expect that we will have only a handful of these missing blob > manifests? Each manifest seems to be efficiently looked-up with a > binary search, but it makes me wonder if it is a good idea to > consolidate these manifests into a single list of object names to > eliminate the outer loop in has_missing_blob(). Unlike pack .idx > files that must stay one-to-one with .pack files, it appears to me > that there is no reason why we need to keep multiple ones separate > for extended period of time (e.g. whenever we learn that we receieved > an incomplete pack from the other side with a list of newly missing > blobs, we could incorporate that into existing missing blob list). There is indeed no reason why we need to keep multiple ones separate for an extended period of time - my thinking was to let fetch/clone be fast by not needing to scan through the entire existing manifest (in order to create the new one), letting GC take care of consolidating them (since it would have to check individual entries to delete those corresponding to objects that have entered the repo through other means). But this is at the expense of making the individual object lookups a bit slower. For now, I'll leave the possibility of multiple files open while I try to create a set of patches that can implement missing blob support from fetch to day-to-day usage. But I am not opposed to changing it to a single-file manifest. > > +int has_missing_blob(const unsigned char *sha1, unsigned long *size) > > +{ > > This function that answers "is it expected to be missing?" is > confusingly named. Is it missing, or does it exist? Renamed to in_missing_blob_manifest(). > > @@ -2981,11 +3050,55 @@ static int sha1_loose_object_info(const unsigned > > char *sha1, > > return (status < 0) ? status : 0; > > } > > > > +static char *missing_blob_command; > > +static int sha1_file_config(const char *conf_key, const char *value, void > > *cb) > > +{ > > + if (!strcmp(conf_key, "core.missingblobcommand")) { > > + missing_blob_command = xstrdup(value); > > + } > > + return 0; > > +} > > + > > +static int configured; > > +static void ensure_configured(void) > > +{ > > + if (configured) > > + return; > > Do not be selfish and pretend that this is the _only_ kind of > configuration that needs to be done inside sha1_file.c. Call the > function ensure__is_configured() and rename the run-once > guard to match. My thinking was that any additional configuration could be added to this function, but individual configuration for each feature is fine too. I have renamed things according to your suggestion. > The run-once guard can be made static to the "ensure" function, and > if you do so, then its name can stay to be "configured", as at that > point it is clear what it is guarding. Done. > > +pack() { > > Style: "pack () {" Done. > > > + perl -e '$/ = undef; $input = <>; print pack("H*", $input)' > > high-nybble first to match ntohll() done in has_missing_blob()? OK. Actually it's to match the printf behavior below that prints the high nybble first (like in English).
Re: [PATCH v3 0/6] config.h
On 06/15, Junio C Hamano wrote: > Brandon Williams writes: > > > Changes in v3: > > > > * tweaked the discover_git_directory function's API based on Peff's feedback > > * reordered the last three patches so that they flowed a bit better > > * renamed 'git_config_with_options' > > * rebased ontop of v4 of Dscho's alias series > > > > https://public-inbox.org/git/cover.1497440104.git.johannes.schinde...@gmx.de/ > > Applying this series was messier than necessary, I'd have to say, as > this series would not apply cleanly on top of the result of applying > Dscho's v4 on top of the same base as Dscho's v3 was applied (which > is v2.13.0). It applied cleanly only when Dscho's v4 and then this > series were applied on top of 02a2850a ("Sync with maint", > 2017-06-13), which is a much newer commit than v2.13.0. > > Which in turn means these two fixes cannot be merged to 'maint' as > you two collaboratively prepared. > > I've applied Dscho's v4 on top of v2.13.0 (just like his v3 was > queued in my tree on the same base), and then tweaked this series to > apply on top of that, so that they can go to 'maint' if we chose to. > > But it is possible that during this unnecessary patch shuffling, I > may have made some mistake, so please eyeball the resulting 12 > patches carefully when they are pushed out. Ugh, I'm terribly sorry. Completely my bad as I didn't consider what you would need to do on your end. When I built my patches on top of his I naively just applied his v4 to what ever the current origin/master was at that point in time. I'll be sure to be more careful with this next time. -- Brandon Williams
[PATCH v3 1/4] sha1_file: teach packed_object_info about typename
In commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-06), "struct object_info" gained a "typename" field that could represent a type name from a loose object file, whether valid or invalid, as opposed to the existing "typep" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above. Signed-off-by: Jonathan Tan --- sha1_file.c | 29 - 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 59a4ed2ed..a52b27541 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2277,9 +2277,18 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, *oi->disk_sizep = revidx[1].offset - obj_offset; } - if (oi->typep) { - *oi->typep = packed_to_object_type(p, obj_offset, type, &w_curs, curpos); - if (*oi->typep < 0) { + if (oi->typep || oi->typename) { + enum object_type ptot; + ptot = packed_to_object_type(p, obj_offset, type, &w_curs, +curpos); + if (oi->typep) + *oi->typep = ptot; + if (oi->typename) { + const char *tn = typename(ptot); + if (tn) + strbuf_addstr(oi->typename, tn); + } + if (ptot < 0) { type = OBJ_BAD; goto out; } @@ -2960,7 +2969,6 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, struct cached_object *co; struct pack_entry e; int rtype; - enum object_type real_type; const unsigned char *real = lookup_replace_object_extended(sha1, flags); co = find_cached_object(real); @@ -2992,18 +3000,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, return -1; } - /* -* packed_object_info() does not follow the delta chain to -* find out the real type, unless it is given oi->typep. -*/ - if (oi->typename && !oi->typep) - oi->typep = &real_type; - rtype = packed_object_info(e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real); - if (oi->typep == &real_type) - oi->typep = NULL; return sha1_object_info_extended(real, oi, 0); } else if (in_delta_base_cache(e.p, e.offset)) { oi->whence = OI_DBCACHED; @@ -3014,10 +3013,6 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA || rtype == OBJ_OFS_DELTA); } - if (oi->typename) - strbuf_addstr(oi->typename, typename(*oi->typep)); - if (oi->typep == &real_type) - oi->typep = NULL; return 0; } -- 2.13.1.518.g3df882009-goog
[PATCH v3 4/4] sha1_file, fsck: add missing blob support
Currently, Git does not support repos with very large numbers of blobs or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every blob referenced by a tree object is available somewhere in the repo storage. As a first step to reducing this problem, add rudimentary support for missing blobs by teaching sha1_file to invoke a hook whenever a blob is requested and unavailable but registered to be missing, and by updating fsck to tolerate such blobs. The hook is a shell command that can be configured through "git config"; this hook takes in a list of hashes and writes (if successful) the corresponding objects to the repo's local storage. This commit does not include support for generating such a repo; neither has any command (other than fsck) been modified to either tolerate missing blobs (without invoking the hook) or be more efficient in invoking the missing blob hook. Only a fallback is provided in the form of sha1_file invoking the missing blob hook when necessary. In order to determine the code changes in sha1_file.c necessary, I investigated the following: (1) functions in sha1_file that take in a hash, without the user regarding how the object is stored (loose or packed) (2) functions in sha1_file that operate on packed objects (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed) (1) is handled by the modification to sha1_object_info_extended() and has_sha1_file_with_flags(). For (2), I looked through the same functions as in (1) and also for_each_packed_object. The ones that are relevant are: - parse_pack_index - http - indirectly from http_get_info_packs - find_pack_entry_one - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack - find_sha1_pack - fast-import - appears to be an optimization to not store a file if it is already in a pack - http-walker - to search through a struct alt_base - http-push - to search through remote packs - has_sha1_pack - builtin/fsck - fixed in this commit - builtin/count-objects - informational purposes only (check if loose object is also packed) - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it) - revision - used to exclude packed objects if requested by user - diff - just for optimization - for_each_packed_object - reachable - only to find recent objects - builtin/fsck - fixed in this commit - builtin/cat-file - see below As described in the list above, builtin/fsck has been updated. I have left builtin/cat-file alone; this means that cat-file --batch-all-objects will only operate on objects physically in the repo. An alternative design that I considered but rejected: - Adding a hook whenever a packed blob is requested, not on any blob. That is, whenever we attempt to search the packfiles for a blob, if it is missing (from the packfiles and from the loose object storage), to invoke the hook (which must then store it as a packfile), open the packfile the hook generated, and report that the blob is found in that new packfile. This reduces the amount of analysis needed (in that we only need to look at how packed blobs are handled), but requires that the hook generate packfiles (or for sha1_file to pack whatever loose objects are generated), creating one packfile for each missing blob and potentially very many packfiles that must be linearly searched. This may be tolerable now for repos that only have a few missing blobs (for example, repos that only want to exclude large blobs), and might be tolerable in the future if we have batching support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of blobs. Signed-off-by: Jonathan Tan --- Documentation/config.txt | 10 +++ builtin/fsck.c | 7 ++ cache.h | 7 ++ sha1_file.c | 171 +++ t/t3907-missing-blob.sh | 69 +++ 5 files changed, 250 insertions(+), 14 deletions(-) create mode 100755 t/t3907-missing-blob.sh diff --git a/Documentation/config.txt b/Documentation/config.txt index dd4beec39..10da5fde1 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -390,6 +390,16 @@ The default is false, except linkgit:git-clone[1] or linkgit:git-init[1] will probe and set core.ignoreCase true if appropriate when the repository is created. +core.missingBlobCommand:: + If set, whenever a blob in the local repo is attempted to be + read but is missing, invoke this shel
[PATCH v3 3/4] sha1_file: consolidate storage-agnostic object fns
In sha1_file.c, there are a few functions that provide information on an object regardless of its storage (cached, loose, or packed). Looking through all non-static functions in sha1_file.c that take in an unsigned char * pointer, the relevant ones are: - sha1_object_info_extended - sha1_object_info (auto-fixed by sha1_object_info_extended) - read_sha1_file_extended (uses read_object) - read_object_with_reference (auto-fixed by read_sha1_file_extended) - has_sha1_file_with_flags - assert_sha1_type (auto-fixed by sha1_object_info) Looking at the 3 primary functions (sha1_object_info_extended, read_object, has_sha1_file_with_flags), they independently implement mechanisms such as object replacement, retrying the packed store after failing to find the object in the packed store then the loose store, and being able to mark a packed object as bad and then retrying the whole process. Consolidating these mechanisms would be a great help to maintainability. However, has_sha1_file_with_flags() does things that the other 2 don't (skipping cached storage, allowing a "quick" mode that skips retrying the packed storage after trying the loose storage, and refreshing any loose files found). Therefore, consolidate only the other 2 functions by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan --- cache.h | 1 + sha1_file.c | 84 ++--- 2 files changed, 43 insertions(+), 42 deletions(-) diff --git a/cache.h b/cache.h index 4d92aae0e..63a73af17 100644 --- a/cache.h +++ b/cache.h @@ -1835,6 +1835,7 @@ struct object_info { off_t *disk_sizep; unsigned char *delta_base_sha1; struct strbuf *typename; + void **contentp; /* Response */ enum { diff --git a/sha1_file.c b/sha1_file.c index a38319443..60b487c70 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2005,19 +2005,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, &oi, LOOKUP_REPLACE_OBJECT); } -static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1) -{ - int ret; - git_zstream stream; - char hdr[8192]; - - ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); - if (ret < Z_OK || (*type = parse_sha1_header(hdr, size)) < 0) - return NULL; - - return unpack_sha1_rest(&stream, hdr, *size, sha1); -} - unsigned long get_size_from_delta(struct packed_git *p, struct pack_window **w_curs, off_t curpos) @@ -2326,8 +2313,10 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset, if (!ent) return unpack_entry(p, base_offset, type, base_size); - *type = ent->type; - *base_size = ent->size; + if (type) + *type = ent->type; + if (base_size) + *base_size = ent->size; return xmemdupz(ent->data, ent->size); } @@ -2388,9 +2377,16 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, * We always get the representation type, but only convert it to * a "real" type later if the caller is interested. */ - type = unpack_object_header(p, &w_curs, &curpos, &size); + if (oi->contentp) { + *oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep, + &type); + if (!*oi->contentp) + type = OBJ_BAD; + } else { + type = unpack_object_header(p, &w_curs, &curpos, &size); + } - if (oi->sizep) { + if (!oi->contentp && oi->sizep) { if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { off_t tmp_pos = curpos; off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos, @@ -2679,8 +2675,10 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, free(external_base); } - *final_type = type; - *final_size = size; + if (final_type) + *final_type = type; + if (final_size) + *final_size = size; unuse_pack(&w_curs); @@ -2914,6 +2912,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, git_zstream stream; char hdr[32]; struct strbuf hdrbuf = STRBUF_INIT; + unsigned long size_scratch; if (oi->delta_base_sha1) hashclr(oi->delta_base_sha1); @@ -2926,7 +2925,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, * return value implicitly indicates whether the * object even exists. */ - if (!oi->typep && !oi->typename && !oi->sizep) { + i
[PATCH v3 0/4] Improvements to sha1_file
Thanks - this has been updated following Junio's comments. Patch 1 is unmodified from the previous version. Patch 2 has been modified to remove the extraneous code pointed out by Junio. I previously had an idea of populating those fields in packed_object_info(), but later changed my mind, but a rebase went wrong. Patches 3-4 have been updated as I have described in [1] and [2]. [1] https://public-inbox.org/git/20170615111447.1208e...@twelve2.svl.corp.google.com/ [2] https://public-inbox.org/git/20170615111447.1208e...@twelve2.svl.corp.google.com/ As before, I would like review on patches 1-3 to go into the tree. (Patch 4 is a work in progress, and is here just to demonstrate the effectiveness of the refactoring.) Jonathan Tan (4): sha1_file: teach packed_object_info about typename sha1_file: move delta base cache code up sha1_file: consolidate storage-agnostic object fns sha1_file, fsck: add missing blob support Documentation/config.txt | 10 + builtin/fsck.c | 7 + cache.h | 8 + sha1_file.c | 474 ++- t/t3907-missing-blob.sh | 69 +++ 5 files changed, 400 insertions(+), 168 deletions(-) create mode 100755 t/t3907-missing-blob.sh -- 2.13.1.518.g3df882009-goog
[PATCH v3 2/4] sha1_file: move delta base cache code up
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan --- sha1_file.c | 220 ++-- 1 file changed, 110 insertions(+), 110 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index a52b27541..a38319443 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2239,116 +2239,6 @@ static enum object_type packed_to_object_type(struct packed_git *p, goto out; } -int packed_object_info(struct packed_git *p, off_t obj_offset, - struct object_info *oi) -{ - struct pack_window *w_curs = NULL; - unsigned long size; - off_t curpos = obj_offset; - enum object_type type; - - /* -* We always get the representation type, but only convert it to -* a "real" type later if the caller is interested. -*/ - type = unpack_object_header(p, &w_curs, &curpos, &size); - - if (oi->sizep) { - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { - off_t tmp_pos = curpos; - off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos, - type, obj_offset); - if (!base_offset) { - type = OBJ_BAD; - goto out; - } - *oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos); - if (*oi->sizep == 0) { - type = OBJ_BAD; - goto out; - } - } else { - *oi->sizep = size; - } - } - - if (oi->disk_sizep) { - struct revindex_entry *revidx = find_pack_revindex(p, obj_offset); - *oi->disk_sizep = revidx[1].offset - obj_offset; - } - - if (oi->typep || oi->typename) { - enum object_type ptot; - ptot = packed_to_object_type(p, obj_offset, type, &w_curs, -curpos); - if (oi->typep) - *oi->typep = ptot; - if (oi->typename) { - const char *tn = typename(ptot); - if (tn) - strbuf_addstr(oi->typename, tn); - } - if (ptot < 0) { - type = OBJ_BAD; - goto out; - } - } - - if (oi->delta_base_sha1) { - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) { - const unsigned char *base; - - base = get_delta_base_sha1(p, &w_curs, curpos, - type, obj_offset); - if (!base) { - type = OBJ_BAD; - goto out; - } - - hashcpy(oi->delta_base_sha1, base); - } else - hashclr(oi->delta_base_sha1); - } - -out: - unuse_pack(&w_curs); - return type; -} - -static void *unpack_compressed_entry(struct packed_git *p, - struct pack_window **w_curs, - off_t curpos, - unsigned long size) -{ - int st; - git_zstream stream; - unsigned char *buffer, *in; - - buffer = xmallocz_gently(size); - if (!buffer) - return NULL; - memset(&stream, 0, sizeof(stream)); - stream.next_out = buffer; - stream.avail_out = size + 1; - - git_inflate_init(&stream); - do { - in = use_pack(p, w_curs, curpos, &stream.avail_in); - stream.next_in = in; - st = git_inflate(&stream, Z_FINISH); - if (!stream.avail_out) - break; /* the payload is larger than it should be */ - curpos += stream.next_in - in; - } while (st == Z_OK || st == Z_BUF_ERROR); - git_inflate_end(&stream); - if ((st != Z_STREAM_END) || stream.total_out != size) { - free(buffer); - return NULL; - } - - return buffer; -} - static struct hashmap delta_base_cache; static size_t delta_base_cached; @@ -2486,6 +2376,116 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset, hashmap_add(&delta_base_cache, ent); } +int packed_object_info(struct packed_git *p, off_t obj_offset, + struct object_info *oi) +{ + struct pack_window *w_curs = NULL; + unsigned long size; + off_t curpos = obj_offset; + enum object_type type; + + /* +* We always get the representation type, but only convert i
Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support
Jonathan Tan writes: > There is indeed no reason why we need to keep multiple ones separate for > an extended period of time - my thinking was to let fetch/clone be fast > by not needing to scan through the entire existing manifest (in order to > create the new one), letting GC take care of consolidating them ... Given that fetch/clone already incur network cost and the users expect to wait for them to finish, I wouldn't have made such a trade-off. >> > +int has_missing_blob(const unsigned char *sha1, unsigned long *size) >> > +{ >> >> This function that answers "is it expected to be missing?" is >> confusingly named. Is it missing, or does it exist? > > Renamed to in_missing_blob_manifest(). Either that, or "is_known_to_be_missing()", would be OK. Thanks.
Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes
On Thu, 15 Jun 2017 16:28:24 -0400 Jeff Hostetler wrote: > I agree with Peff here. I've been working on my partial/narrow/sparse > clone/fetch ideas since my original RFC and have come to the conclusion > that the server can do the size limiting efficiently, but we should > leave the pathname filtering to the client. That is, let the client > get the commits and trees and then locally apply pattern matching, > whether that be a sparse-checkout definition or simple ".git*" > matching and then make a later request for the "blobs of interest". This means that the client would need to inflate all the trees it received, but that might not be that bad. > Whether we "fault-in" the missing blobs or have a "fetch-blobs" > command (like fetch-pack) to get them is open to debate. > > There are concerns about the size of the requested blob-id list in a > fetch-blobs approach, but I think there are ways to say I need all > of the blobs referenced by the directory /foo in commit (and > optionally, that aren't present in directory /foo in commit > or in the range ..). (handwave) Unless the server no longer has the relevant commit (e.g. because a branch was rewound), but even then, even if we only sent blob hashes, the list would be probably much smaller than the downloaded pack anyway, so things might still be OK.
[PATCH v3 0/2] Add a FREE_AND_NULL() wrapper macro
On Thu, Jun 15 2017, Ævar Arnfjörð Bjarmason jotted: > I'll change it to FREE_AND_NULL and submit my patch as-is, my reading > of the rest of this thread is that making it a function instead of a > macro would be interesting, but has its own caveats that are likely > better considered as part of its own series, whereas this just changes > existing code to its macro-expanded functional equivalent. Here's v3 with that change. Nothing but the macro name (and comments, commit messages etc. referring to it) have changed. Ævar Arnfjörð Bjarmason (2): git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL *.[ch] refactoring: make use of the FREE_AND_NULL() macro alias.c | 6 ++ apply.c | 3 +-- attr.c | 6 ++ blame.c | 3 +-- branch.c | 3 +-- builtin/am.c | 18 +- builtin/clean.c | 6 ++ builtin/config.c | 6 ++ builtin/index-pack.c | 6 ++ builtin/pack-objects.c | 12 builtin/unpack-objects.c | 3 +-- builtin/worktree.c | 6 ++ commit-slab.h| 3 +-- commit.c | 3 +-- config.c | 3 +-- credential.c | 9 +++-- diff-lib.c | 3 +-- diff.c | 6 ++ diffcore-rename.c| 6 ++ dir.c| 9 +++-- fast-import.c| 6 ++ git-compat-util.h| 6 ++ gpg-interface.c | 15 +-- grep.c | 12 help.c | 3 +-- http-push.c | 24 http.c | 15 +-- imap-send.c | 3 +-- line-log.c | 6 ++ ll-merge.c | 3 +-- mailinfo.c | 3 +-- object.c | 3 +-- pathspec.c | 3 +-- prio-queue.c | 3 +-- read-cache.c | 6 ++ ref-filter.c | 3 +-- refs/files-backend.c | 3 +-- refs/ref-cache.c | 3 +-- remote-testsvn.c | 3 +-- rerere.c | 3 +-- sequencer.c | 3 +-- sha1-array.c | 3 +-- sha1_file.c | 3 +-- split-index.c| 3 +-- transport-helper.c | 27 +-- transport.c | 3 +-- tree-diff.c | 6 ++ tree-walk.c | 3 +-- tree.c | 3 +-- 49 files changed, 103 insertions(+), 197 deletions(-) -- 2.13.1.508.gb3defc5cc
[PATCH v3 1/2] git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Add a FREE_AND_NULL() wrapper marco for the common pattern of freeing a pointer and assigning NULL to it right afterwards. The implementation is similar to the (currently unused) XDL_PTRFREE macro in xdiff/xmacros.h added in commit 3443546f6e ("Use a *real* built-in diff generator", 2006-03-24). The only difference is that free() is called unconditionally, see [1]. See [2] for a suggested alternative which does this via a function instead of a macro. As covered in replies to that message, while it's a viable approach, it would introduce caveats which this approach doesn't have, so that potential change is left to a future follow-up change. This merely allows us to translate exactly what we're doing now to a less verbose & idiomatic form using a macro, while guaranteeing that we don't introduce any functional changes. 1. (http://public-inbox.org/git/alpine.DEB.2.20.1608301948310.129229@virtualbox/) 2. <20170610032143.GA7880@starla> (https://public-inbox.org/git/20170610032143.GA7880@starla/) Signed-off-by: Ævar Arnfjörð Bjarmason --- git-compat-util.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index 51ba4e6b3b..047172d173 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -808,6 +808,12 @@ extern char *xgetcwd(void); extern FILE *fopen_for_writing(const char *path); extern FILE *fopen_or_warn(const char *path, const char *mode); +/* + * FREE_AND_NULL(ptr) is like free(ptr) followed by ptr = NULL. Note + * that ptr is used twice, so don't pass e.g. ptr++. + */ +#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0) + #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc))) #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), (alloc))) -- 2.13.1.508.gb3defc5cc
[PATCH v3 2/2] *.[ch] refactoring: make use of the FREE_AND_NULL() macro
Replace occurrences of `free(ptr); ptr = NULL` with `FREE_AND_NULL(ptr)`. This introduces no functional changes, but reduces the line count and establishes this pattern as a common idiom with a wrapper macro. Signed-off-by: Ævar Arnfjörð Bjarmason --- alias.c | 6 ++ apply.c | 3 +-- attr.c | 6 ++ blame.c | 3 +-- branch.c | 3 +-- builtin/am.c | 18 +- builtin/clean.c | 6 ++ builtin/config.c | 6 ++ builtin/index-pack.c | 6 ++ builtin/pack-objects.c | 12 builtin/unpack-objects.c | 3 +-- builtin/worktree.c | 6 ++ commit-slab.h| 3 +-- commit.c | 3 +-- config.c | 3 +-- credential.c | 9 +++-- diff-lib.c | 3 +-- diff.c | 6 ++ diffcore-rename.c| 6 ++ dir.c| 9 +++-- fast-import.c| 6 ++ gpg-interface.c | 15 +-- grep.c | 12 help.c | 3 +-- http-push.c | 24 http.c | 15 +-- imap-send.c | 3 +-- line-log.c | 6 ++ ll-merge.c | 3 +-- mailinfo.c | 3 +-- object.c | 3 +-- pathspec.c | 3 +-- prio-queue.c | 3 +-- read-cache.c | 6 ++ ref-filter.c | 3 +-- refs/files-backend.c | 3 +-- refs/ref-cache.c | 3 +-- remote-testsvn.c | 3 +-- rerere.c | 3 +-- sequencer.c | 3 +-- sha1-array.c | 3 +-- sha1_file.c | 3 +-- split-index.c| 3 +-- transport-helper.c | 27 +-- transport.c | 3 +-- tree-diff.c | 6 ++ tree-walk.c | 3 +-- tree.c | 3 +-- 48 files changed, 97 insertions(+), 197 deletions(-) diff --git a/alias.c b/alias.c index 3b90397a99..911481855f 100644 --- a/alias.c +++ b/alias.c @@ -47,8 +47,7 @@ int split_cmdline(char *cmdline, const char ***argv) src++; c = cmdline[src]; if (!c) { - free(*argv); - *argv = NULL; + FREE_AND_NULL(*argv); return -SPLIT_CMDLINE_BAD_ENDING; } } @@ -60,8 +59,7 @@ int split_cmdline(char *cmdline, const char ***argv) cmdline[dst] = 0; if (quoted) { - free(*argv); - *argv = NULL; + FREE_AND_NULL(*argv); return -SPLIT_CMDLINE_UNCLOSED_QUOTE; } diff --git a/apply.c b/apply.c index 854faa6779..e78de0affa 100644 --- a/apply.c +++ b/apply.c @@ -3705,8 +3705,7 @@ static int check_preimage(struct apply_state *state, is_new: patch->is_new = 1; patch->is_delete = 0; - free(patch->old_name); - patch->old_name = NULL; + FREE_AND_NULL(patch->old_name); return 0; } diff --git a/attr.c b/attr.c index 821203e2a9..ebdcfb0b8a 100644 --- a/attr.c +++ b/attr.c @@ -638,13 +638,11 @@ void attr_check_reset(struct attr_check *check) void attr_check_clear(struct attr_check *check) { - free(check->items); - check->items = NULL; + FREE_AND_NULL(check->items); check->alloc = 0; check->nr = 0; - free(check->all_attrs); - check->all_attrs = NULL; + FREE_AND_NULL(check->all_attrs); check->all_attrs_nr = 0; drop_attr_stack(&check->stack); diff --git a/blame.c b/blame.c index 843c845cba..1183943960 100644 --- a/blame.c +++ b/blame.c @@ -314,8 +314,7 @@ static void fill_origin_blob(struct diff_options *opt, static void drop_origin_blob(struct blame_origin *o) { if (o->file.ptr) { - free(o->file.ptr); - o->file.ptr = NULL; + FREE_AND_NULL(o->file.ptr); } } diff --git a/branch.c b/branch.c index 985316eb76..2347cb8649 100644 --- a/branch.c +++ b/branch.c @@ -24,8 +24,7 @@ static int find_tracked_branch(struct remote *remote, void *priv) } else { free(tracking->spec.src); if (tracking->src) { - free(tracking->src); - tracking->src = NULL; + FREE_AND_NULL(tracking->src); } } tracking->spec.src = NULL; diff --git a/builtin/am.c b/builtin/am.c index 8881d73615..80368b6fe6 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -483,8 +48