Re: [PATCH v1] Configure Git contribution guidelines for github.com

2017-06-15 Thread Ævar Arnfjörð Bjarmason
On Mon, Jun 12, 2017 at 5:52 PM, Junio C Hamano  wrote:
> "Philip Oakley"  writes:
>
>> From: "Lars Schneider" 
>>> Many open source projects use github.com for their contribution process.
>>> Although we mirror the Git core repository to github.com [1] we do not
>>> use any other github.com service. This is unknown/unexpected to a
>>> number of (potential) contributors and consequently they create Pull
>>> Requests against our mirror with their contributions. These Pull
>>> Requests become stall [2]. This is frustrating to them as they think we
>>> ignore them and it is also unsatisfactory for us as we miss potential
>>> code improvements and/or new contributors.
>>>
>>> GitHub offers a way to notify Pull Request contributors about the
>>> contribution guidelines for a project [3]. Let's make use of this!
>>>
>>> [1] https://github.com/git/git
>>> [2] https://github.com/git/git/pulls
>>> [3]
>>> https://help.github.com/articles/creating-a-pull-request-template-for-your-repository/
>>>
>>> Signed-off-by: Lars Schneider 
>>> ---

[The latest patch in this thread looks good to me to, thanks Lars]

>> I see there are currently 84 open PRs (13 in the last 14 days), so it
>> is real.
>
> Not so insignificant fraction of these are done purely for the
> purpose of using submitgit, though.  In other words, if submitgit
> were improved not to require a pull request against [1] (instead, it
> could be pointed at a branch in a contributor's repository and do
> the fromatting), these numbers will go down.

There are things we get out of this that would regress if
submitGit were changed this way:

 * Now when you submit a pull request you get a Travis build for
git/git, I don't get this if I push to any random branch in my
avar/git, and although I could probably set it up, it's extra work
etc.

 * I like being able to "git fetch" patches I'm reviewing. Now I can
just set "+refs/pull/*:refs/remotes/origin/pull/*" in the refspec for
git/git, if it were pulling from target repos I'd need to "git remote
add" for each one, not a big deal, but less convenient.

 * There would be no single place to list submitGit requests, which
you can do now through the pull page, although I think this is largely
stale now because nothing auto-closes them if they're merged (by which
point they'll have different sha1s), but that could be improved with
some bot...

There's probably ways to do this without git/git pull requests, but I
don't see what problem would be solved by moving this off git/git,
even if there's open requests submitGit is the only thing using these,
and any confusion about that pull UI would remain if it wasn't (AFAIK
there's no way to completely disable pull requests on github, but I
may be wrong).


Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch

2017-06-15 Thread Kaartic Sivaraam
On Mon, 2017-06-12 at 17:37 -0400, Jeff King wrote:
> On Mon, Jun 12, 2017 at 02:31:25PM -0700, Junio C Hamano wrote:
> > > I think "staged for commit" still makes perfect sense even when
> > > we are
> > > just asking "what's the current status" and not "what would it
> > > look like
> > > if I were to commit".
> > > 
> > > And avoiding the word "index" is worth-while here, I think. I am
> > > not in
> > > general of the "let's hide the index" camp" but it is a technical
> > > term.
> > > If we can say the same thing in a way that is understood both by
> > > people
> > > who know what the index is and people who do not, that seems like
> > > a win.
> > 
> > I do not mind "Changes not staged yet:".  The point was not about
> > getting rid of "stage" but about not mentioning "commit", because
> > stepping back a bit, if the readers are prepared to accept these
> > messages in the mindset that they are guiding them toward their
> > next
> > commit, "I find 'Initial commit' confusing" would not have been an
> > issue in the first place.
> 
> I think the difference is that "Initial commit" is talking about a
> _specific_ commit. If we're not working on one, it doesn't make much
> sense. But "staged for commit" is not necessarily talking about a
> specific commit; we are talking about the purpose of staging
> something
> in general. You could equally well say "staged for committing"
> (though I
> think the shorter word sounds more natural).
> 
> Likewise with "to be committed".
> 
> > If we can get rid of 'yet' and 'already' from the above two, that
> > would be even better.  The point of the exercise is to be
> > understood
> > by those who do not think in terms of 'preparing for the next
> > commit',
> > so 'yet', 'already', 'to be committed' are all counter-productive
> > for that goal.  Those who accept the 'description of the current
> > state in the context of preparing for the next commit' are not the
> > ones we are trying to help with the suggested three changes.
> 
> I agree that is the goal. My point was that the existing messages are
> OK
> even if you aren't thinking of preparing for the next commit. Saying
> "this is in the index" and "this is staged" are synonyms. Saying
> "this
> is staged for commit" is likewise a synonym, unless there is some
> other
> reason we stage things.
> 
> -Peff
What about, "not making any assumptions" about what the user would
think when he views the output of `git status` ? Why not try some
general messages like, 

* Staged changes
* Unstaged changes

instead of the messages such as 

* Changes to be committed, Changes already in the index
* Changes not staged for commit, Changes not yet in the index

which seem to make assumptions about the user who view the output ?

A typical patch could be found below,

diff --git a/builtin/commit.c b/builtin/commit.c
index 1d805f5da..3ed8e40bc 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1364,6 +1364,7 @@ int cmd_status(int argc, const char **argv, const
char *prefix)
    usage_with_options(builtin_status_usage,
builtin_status_options);
 
    status_init_config(&s, git_status_config);
+   s.commit_template = 1;
    argc = parse_options(argc, argv, prefix,
     builtin_status_options,
     builtin_status_usage, 0);
diff --git a/wt-status.c b/wt-status.c
index 037548496..55a7bd757 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -196,9 +196,14 @@ static void
wt_longstatus_print_cached_header(struct wt_status *s)
 {
    const char *c = color(WT_STATUS_HEADER, s);
 
-   status_printf_ln(s, c, _("Changes to be committed:"));
+   if (s->commit_template)
+   status_printf_ln(s, c, _("Changes to be committed:"));
+   else
+   status_printf_ln(s, c, _("Staged changes:"));
+
    if (!s->hints)
    return;
+
    if (s->whence != FROM_COMMIT)
    ; /* NEEDSWORK: use "git reset --unresolve"??? */
    else if (!s->is_initial)
@@ -214,7 +219,11 @@ static void
wt_longstatus_print_dirty_header(struct wt_status *s,
 {
    const char *c = color(WT_STATUS_HEADER, s);
 
-   status_printf_ln(s, c, _("Changes not staged for commit:"));
+   if (s->commit_template)
+   status_printf_ln(s, c, _("Changes not staged for
commit:"));
+   else
+   status_printf_ln(s, c, _("Unstaged changes:"));
+
    if (!s->hints)
    return;
    if (!has_deleted)
@@ -1576,7 +1585,10 @@ static void wt_longstatus_print(struct wt_status
*s)
 
    if (s->is_initial) {
    status_printf_ln(s, color(WT_STATUS_HEADER, s), "%s",
"");
-   status_printf_ln(s, color(WT_STATUS_HEADER, s),
_("Initial commit"));
+   status_printf_ln(s, color(WT_STATUS_HEADER, s),
+    s->commit_template
+    ? _("Initial commit")
+    : _("No commits yet on the branch")

Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 01:49:20PM +0530, Kaartic Sivaraam wrote:

> What about, "not making any assumptions" about what the user would
> think when he views the output of `git status` ? Why not try some
> general messages like, 
> 
> * Staged changes
> * Unstaged changes
> 
> instead of the messages such as 
> 
> * Changes to be committed, Changes already in the index
> * Changes not staged for commit, Changes not yet in the index
> 
> which seem to make assumptions about the user who view the output ?

Saying just "staged changes" is definitely accurate. I don't know if
some users would find that too terse, too. The phrase "not staged for
commit" gives more information if you don't know what "staged" means in
the Git world.

-Peff


[PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself

2017-06-15 Thread René Scharfe
There is no portable way to pass timezone information to strftime.  Add
parameters for timezone offset and name to strbuf_addftime and let it
handle the timezone-related format specifiers %z and %Z internally.

Callers can opt out for %Z by passing NULL as timezone name.  %z is
always handled internally -- this helps on Windows, where strftime would
expand it to a timezone name (same as %Z), in violation of POSIX.
Modifiers are not handled, e.g. %Ez is still passed to strftime.

Use an empty string as timezone name in show_date (the only current
caller) for now because we only have the timezone offset in non-local
mode.  POSIX allows %Z to resolve to an empty string in case of missing
information.

Helped-by: Ulrich Mueller 
Helped-by: Jeff King 
Signed-off-by: Rene Scharfe 
---
Changes from v1:
- Always handle %z internally.
- Move tests from t6006 to t0006, as that's a more appropriate place.
- Changed tests to only use %%, %z and %Z to avoid incompatibilities.
- Tested on mingw (applies there with patch and some fuzz).

 date.c  |  2 +-
 strbuf.c| 41 +
 strbuf.h| 11 ---
 t/t0006-date.sh |  6 ++
 4 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/date.c b/date.c
index 63fa99685e..5580577334 100644
--- a/date.c
+++ b/date.c
@@ -246,7 +246,7 @@ const char *show_date(timestamp_t time, int tz, const 
struct date_mode *mode)
month_names[tm->tm_mon], tm->tm_year + 1900,
tm->tm_hour, tm->tm_min, tm->tm_sec, tz);
else if (mode->type == DATE_STRFTIME)
-   strbuf_addftime(&timebuf, mode->strftime_fmt, tm);
+   strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, "");
else
strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d",
weekday_names[tm->tm_wday],
diff --git a/strbuf.c b/strbuf.c
index 00457940cf..be3b9e37b1 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -785,14 +785,48 @@ char *xstrfmt(const char *fmt, ...)
return ret;
 }
 
-void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
+void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+int tz_offset, const char *tz_name)
 {
+   struct strbuf munged_fmt = STRBUF_INIT;
size_t hint = 128;
size_t len;
 
if (!*fmt)
return;
 
+   /*
+* There is no portable way to pass timezone information to
+* strftime, so we handle %z and %Z here.
+*/
+   for (;;) {
+   const char *percent = strchrnul(fmt, '%');
+   strbuf_add(&munged_fmt, fmt, percent - fmt);
+   if (!*percent)
+   break;
+   fmt = percent + 1;
+   switch (*fmt) {
+   case '%':
+   strbuf_addstr(&munged_fmt, "%%");
+   fmt++;
+   break;
+   case 'z':
+   strbuf_addf(&munged_fmt, "%+05d", tz_offset);
+   fmt++;
+   break;
+   case 'Z':
+   if (tz_name) {
+   strbuf_addstr(&munged_fmt, tz_name);
+   fmt++;
+   break;
+   }
+   /* FALLTHROUGH */
+   default:
+   strbuf_addch(&munged_fmt, '%');
+   }
+   }
+   fmt = munged_fmt.buf;
+
strbuf_grow(sb, hint);
len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
 
@@ -804,17 +838,16 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, 
const struct tm *tm)
 * output contains at least one character, and then drop the 
extra
 * character before returning.
 */
-   struct strbuf munged_fmt = STRBUF_INIT;
-   strbuf_addf(&munged_fmt, "%s ", fmt);
+   strbuf_addch(&munged_fmt, ' ');
while (!len) {
hint *= 2;
strbuf_grow(sb, hint);
len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
   munged_fmt.buf, tm);
}
-   strbuf_release(&munged_fmt);
len--; /* drop munged space */
}
+   strbuf_release(&munged_fmt);
strbuf_setlen(sb, sb->len + len);
 }
 
diff --git a/strbuf.h b/strbuf.h
index 80047b1bb7..39d5836abd 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -339,9 +339,14 @@ __attribute__((format (printf,2,0)))
 extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
 
 /**
- * Add the time specified by `tm`, as formatted by `strftime`.
- */
-extern void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct 
tm *tm);
+ * Add the time specified by `tm`, as formatte

Re: rs/strbuf-addftime-zZ, was Re: What's cooking in git.git (Jun 2017, #04; Tue, 13)

2017-06-15 Thread René Scharfe

Am 15.06.2017 um 07:42 schrieb Jeff King:

On Thu, Jun 15, 2017 at 01:03:29AM +0200, René Scharfe wrote:

But there's more.  strftime on Windows doesn't support common POSIX-
defined tokens like %F (%Y-%m-%d) and %T (%H:%M:%S). We could handle
them as well.  Do we want that?  At least we'd have to update the
added test that uses them..

Here's the full list of tokens in POSIX [1], but not supported by
Windows [2]: %C, %D, %F, %G, %R, %T, %V, %e, %g, %h, %n, %r, %t, %u
plus the modifiers %E and %O.


I don't have a real opinion on that. The point of adding strftime was
always to give the user access to whatever their system supports. In
particular "%c" which we cannot emulate ourselves.

If people want support for those other things on platforms that don't
have it, I have no real objection. But I also don't know that it's worth
spending time on if nobody is asking for it.


Agreed; let's make the tests more focused (i.e. not exercise %F and %T
needlessly).

René


Re:

2017-06-15 Thread Sai al
I would need your partnership in a transaction and details will
disclose to you once i receive your reply.

Thanks
Saif.


[BUG] git cherry-pick segfaults with local changes in working directory

2017-06-15 Thread Sebastian Dröge
Hi,

This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with
the packed checkout here:
  
https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz

$ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz
$ cd gst-plugins-good
$ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba
Segmentation fault (core dumped)


Without local changes (e.g. after "git stash") it works fine. Also note
that the commit that is cherry-picked does not contain any changes to
files that have local changes.


Backtrace below:

Thread 1 "git" received signal SIGSEGV, Segmentation fault.
add_index_entry_with_check (option=0, ce=0x0, istate=0x55993900 
) at read-cache.c:1012
1012read-cache.c: No such file or directory.
(gdb) bt
#0  add_index_entry_with_check (option=0, ce=0x0, istate=0x55993900 
) at read-cache.c:1012
#1  add_index_entry (istate=0x55993900 , ce=0x0, option=0) at 
read-cache.c:1061
#2  0x55646df6 in merge_content (o=o@entry=0x7fffd550, 
path=path@entry=0x55a461c0 
"docs/plugins/inspect/plugin-shout2send.xml", o_oid=, 
o_oid@entry=0x55a42adc, o_mode=o_mode@entry=33188, 
a_oid=a_oid@entry=0x55a42af4, 
a_mode=a_mode@entry=33188, b_oid=0x55a42b0c, b_mode=33188, 
rename_conflict_info=0x55a45fa0)
at merge-recursive.c:1727
#3  0x55647a1d in process_entry (entry=, path=, o=0x7fffd550)
at merge-recursive.c:1885
#4  merge_trees (o=o@entry=0x7fffd550, head=head@entry=0x559b7c98, 
merge=, 
merge@entry=0x559b7c70, common=, 
common@entry=0x559b7cc0, 
result=result@entry=0x7fffd530) at merge-recursive.c:1948
#5  0x5568a248 in do_recursive_merge (opts=0x7fffdcc0, 
msgbuf=0x7fffd510, 
head=0x7fffd620 "\246\256՝s8J\273ݵ\241y\303#j\277\211\266\\\n\377\177", 
next_label=, 
base_label=, next=, base=) at 
sequencer.c:389
#6  do_pick_commit (command=TODO_PICK, commit=commit@entry=0x559a7c60, 
opts=opts@entry=0x7fffdcc0)
at sequencer.c:757
#7  0x5568baa3 in single_pick (opts=0x7fffdcc0, 
cmit=0x559a7c60) at sequencer.c:1329
#8  sequencer_pick_revisions (opts=0x7fffdcc0) at sequencer.c:1378
#9  0x555d34cf in run_sequencer (argc=1, argc@entry=2, 
argv=argv@entry=0x7fffe0c0, 
opts=, opts@entry=0x7fffdcc0) at builtin/revert.c:178
#10 0x555d3928 in cmd_cherry_pick (argc=2, argv=0x7fffe0c0, 
prefix=)
at builtin/revert.c:203
#11 0x55566205 in run_builtin (argv=, argc=, p=)
at git.c:373
#12 handle_builtin (argc=2, argv=0x7fffe0c0) at git.c:572
#13 0x555665a2 in run_argv (argv=0x7fffde60, argcp=0x7fffde6c) 
at git.c:630
#14 cmd_main (argc=, argv=) at git.c:707
#15 0x555655d2 in main (argc=3, argv=0x7fffe0b8) at common-main.c:40

-- 
Sebastian Dröge, Centricular Ltd · http://www.centricular.com

signature.asc
Description: This is a digitally signed message part


Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Johannes Schindelin
Hi,

I thought it better to revive this old thread rather than start a new
thread, so as to automatically reach everybody who chimed in originally.

On Mon, 6 Mar 2017, Brandon Williams wrote:

> On 03/06, brian m. carlson wrote:
>
> > On Sat, Mar 04, 2017 at 06:35:38PM -0800, Linus Torvalds wrote:
> >
> > > Btw, I do think the particular choice of hash should still be on the
> > > table. sha-256 may be the obvious first choice, but there are
> > > definitely a few reasons to consider alternatives, especially if
> > > it's a complete switch-over like this.
> > > 
> > > One is large-file behavior - a parallel (or tree) mode could improve
> > > on that noticeably. BLAKE2 does have special support for that, for
> > > example. And SHA-256 does have known attacks compared to SHA-3-256
> > > or BLAKE2 - whether that is due to age or due to more effort, I
> > > can't really judge. But if we're switching away from SHA1 due to
> > > known attacks, it does feel like we should be careful.
> > 
> > I agree with Linus on this.  SHA-256 is the slowest option, and it's
> > the one with the most advanced cryptanalysis.  SHA-3-256 is faster on
> > 64-bit machines (which, as we've seen on the list, is the overwhelming
> > majority of machines using Git), and even BLAKE2b-256 is stronger.
> > 
> > Doing this all over again in another couple years should also be a
> > non-goal.
> 
> I agree that when we decide to move to a new algorithm that we should
> select one which we plan on using for as long as possible (much longer
> than a couple years).  While writing the document we simply used
> "sha256" because it was more tangible and easier to reference.

The SHA-1 transition *requires* a knob telling Git that the current
repository uses a hash function different from SHA-1.

It would make *a whole of a lot of sense* to make that knob *not* Boolean,
but to specify *which* hash function is in use.

That way, it will be easier to switch another time when it becomes
necessary.

And it will also make it easier for interested parties to use a different
hash function in their infrastructure if they want.

And it lifts part of that burden that we have to consider *very carefully*
which function to pick. We still should be more careful than in 2005, when
Git was born, and when, incidentally, when the first attacks on SHA-1
became known, of course. We were just lucky for almost 12 years.

Now, with Dunning-Kruger in mind, I feel that my degree in mathematics
equips me with *just enough* competence to know just how little *even I*
know about cryptography.

The smart thing to do, hence, was to get involved in this discussion and
act as Lt Tawney Madison between us Git developers and experts in
cryptography.

It just so happens that I work at a company with access to excellent
cryptographers, and as we own the largest Git repository on the planet, we
have a vested interest in ensuring Git's continued success.

After a couple of conversations with a couple of experts who I cannot
thank enough for their time and patience, let alone their knowledge about
this matter, it would appear that we may not have had a complete enough
picture yet to even start to make the decision on the hash function to
use.

>From what I read, pretty much everybody who participated in the discussion
was aware that the essential question is: performance vs security.

It turns out that we can have essentially both.

SHA-256 is most likely the best-studied hash function we currently know
about (*maybe* SHA3-256 has been studied slightly more, but only
slightly). All the experts in the field banged on it with multiple sticks
and other weapons. And so far, they only found one weakness that does not
even apply to Git's usage [*1*]. For cryptography experts, this is the
ultimate measure of security: if something has been attacked that
intensely, by that many experts, for that long, with that little effect,
it is the best we got at the time.

And since SHA-256 has become the standard, and more importantly: since
SHA-256 was explicitly designed to allow for relatively inexpensive
hardware acceleration, this is what we will soon have: hardware support in
the form of, say, special CPU instructions. (That is what I meant by: we
can have performance *and* security.)

This is a rather important point to stress, by the way: BLAKE's design is
apparently *not* friendly to CPU instruction implementations. Meaning that
SHA-256 will be faster than BLAKE (and even than BLAKE2) once the Intel
and AMD CPUs with hardware support for SHA-256 become common.

I also heard something really worrisome about BLAKE2 that makes me want to
stay away from it (in addition to the difficulty it poses for hardware
acceleration): to compete in the SHA-3 contest, BLAKE added complexity so
that it would be roughly on par with its competitors. To allow for faster
execution in software, this complexity was *removed* from BLAKE to create
BLAKE2, making it weaker than SHA-256.

Another important point to conside

Re: [BUG] git cherry-pick segfaults with local changes in working directory

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 12:11:50PM +0300, Sebastian Dröge wrote:

> This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with
> the packed checkout here:
>   
> https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz
> 
> $ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz
> $ cd gst-plugins-good
> $ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba
> Segmentation fault (core dumped)

Note that the tarball doesn't have all the necessary objects. Its
.git/objects/info/alternates points to another full clone of
git://anongit.freedesktop.org/gstreamer/gst-plugins-good.

The segfault was fixed in 55e9f0e5c (merge-recursive: handle NULL in
add_cacheinfo() correctly, 2016-11-26), which is in v2.11.1.

Curiously, after the fix we print:

error: addinfo_cache failed for path 
'docs/plugins/inspect/plugin-shout2send.xml'

but the cherry-pick succeeds anyway. I'm not sure if that's a bug or
not.

-Peff


Re: preserve untracked cache, was Re: What's cooking in git.git (Jun 2017, #01; Thu, 1)

2017-06-15 Thread Johannes Schindelin
Hi Junio,

On Sat, 3 Jun 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > On Fri, 2 Jun 2017, Junio C Hamano wrote:
> >
> >> Samuel Lijin  writes:
> >> 
> >> >> What is holding this topic up? Anything Ben or I can do to move this
> >> >> closer to `next` or even `master`?
> >> >
> >> > It's in `next` right now (3196d093d6).
> >> 
> >> Thanks for pinging and checking ;-)  
> >> 
> >> I think the topic was merged to 'next' on the 23rd of last month and
> >> graduated to 'master' in the past few days, together with other
> >> topics.
> >
> > Okay. I never saw any "Will merge to" message, so I got worried.
> 
> Well, I cannot quite help if you are not reading them ;-)

Well, I cannot quite help when I am expected to read long "What's cooking"
mails with out-of-context information as opposed to replies to the threads
discussing the actual patch series.

:-P

I understand that it would be hard on you if I expected you to untangle
your What's cooking discussions into the mail threads with the
corresponding patches, of course.

That just proves my point about mailing lists being an inadequate means to
perform code review...

> Issue #06 of May marked it to be merged to 'next':
> https://public-inbox.org/git/
> 
> Issue #07 of May marked it for 'master':
> https://public-inbox.org/git/
> 
> Issue #08 of May kept it (i.e. no issues discovered in the
> meantime):
> https://public-inbox.org/git/
> 
> Issue #01 of June reports it in 'master':
> https://public-inbox.org/git/

Thank you,
Dscho


[PATCH] docs/pretty-formats: stress that %- removes all preceding line-feeds

2017-06-15 Thread SZEDER Gábor
Signed-off-by: SZEDER Gábor 
---

A mere plural "line-feeds" was too subtle for me to grasp on first
(and second...) reading.

 Documentation/pretty-formats.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 38040e95b..a48d267e2 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -213,8 +213,8 @@ If you add a `+` (plus sign) after '%' of a placeholder, a 
line-feed
 is inserted immediately before the expansion if and only if the
 placeholder expands to a non-empty string.
 
-If you add a `-` (minus sign) after '%' of a placeholder, line-feeds that
-immediately precede the expansion are deleted if and only if the
+If you add a `-` (minus sign) after '%' of a placeholder, all consecutive
+line-feeds immediately preceding the expansion are deleted if and only if the
 placeholder expands to an empty string.
 
 If you add a ` ` (space) after '%' of a placeholder, a space
-- 
2.13.1.505.g7cc9fcafb



Re: [BUG] git cherry-pick segfaults with local changes in working directory

2017-06-15 Thread Sebastian Dröge
Hi Jeff,

Thanks for the fast reply!

On Thu, 2017-06-15 at 06:32 -0400, Jeff King wrote:
> On Thu, Jun 15, 2017 at 12:11:50PM +0300, Sebastian Dröge wrote:
> 
> > This is with git 2.11.0 (Debian 2.11.0-4) and can be reproduced with
> > the packed checkout here:
> >   
> > https://people.freedesktop.org/~slomo/git-cherry-pick-segfault_gst-plugins-good.tar.xz
> > 
> > $ tar xf git-cherry-pick-segfault_gst-plugins-good.tar.xz
> > $ cd gst-plugins-good
> > $ git cherry-pick 0421fb04470af90e8810e7e5e69955d3192896ba
> > Segmentation fault (core dumped)
> 
> Note that the tarball doesn't have all the necessary objects. Its
> .git/objects/info/alternates points to another full clone of
> git://anongit.freedesktop.org/gstreamer/gst-plugins-good.

Ah good to know, I thought this only happens if you clone with
--reference and not otherwise.

> The segfault was fixed in 55e9f0e5c (merge-recursive: handle NULL in
> add_cacheinfo() correctly, 2016-11-26), which is in v2.11.1.

I can confirm that this also fixes my specific problem. Thanks!

-- 
Sebastian Dröge, Centricular Ltd · http://www.centricular.com

signature.asc
Description: This is a digitally signed message part


Re: pushing for a new hash, was Re: [PATCH 2/3] rebase: Add tests for console output

2017-06-15 Thread Johannes Schindelin
Hi Ævar,

On Fri, 2 Jun 2017, Ævar Arnfjörð Bjarmason wrote:

> On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder  wrote:
> >
> > Johannes Schindelin wrote:
> >> On Thu, 1 Jun 2017, Stefan Beller wrote:
> >
> >>> We had a discussion off list how much of the test suite is in bad
> >>> shape, and "$ git grep ^index" points out a lot of places as well.
> >>
> >> Maybe we should call out a specific month (or even a longer period)
> >> during which we try to push toward that new hash function, and focus
> >> more on those tasks (and on critical bug fixes, if any) than anything
> >> else.
> >
> > Thanks for offering. ;-)
> >
> > Here's a rough list of some useful tasks, in no particular order:
> >
> > 1. bc/object-id: This patch series continues, eliminating assumptions
> >about the size of object ids by encapsulating them in a struct.
> >One straightforward way to find code that still needs to be
> >converted is to grep for "sha" --- often the conversion patches
> >change function and variable names to refer to oid_ where they used
> >to use sha1_, making the stragglers easier to spot.
> >
> > 2. Hard-coded object ids in tests: As Stefan hinted, many tests beyond
> >t00* make assumptions about the exact values of object ids.  That's
> >bad for maintainability for other reasons beyond the hash function
> >transition, too.
> >
> >It should be possible to suss them out by patching git's sha1
> >routine to use the ones-complement of sha1 (~sha1) instead and
> >seeing which tests fail.
> 
> I just hacked this up locally.

Would you mind turning that into a patch? I figure this could be a
compile-time switch and applied to Git's `master` so that it is easier to
find those assumptions, as well as verify fixes on multiple platforms.

> > 4. When choosing a hash function, people may argue about performance.
> >It would be useful for run some benchmarks for git (running
> >the test suite, t/perf tests, etc) using a variety of hash
> >functions as input to such a discussion.
> 
> To the extent that such benchmarks matter, it seems prudent to heavily
> weigh them in favor of whatever seems to be likely to be the more
> common hash function going forward, since those are likely to get
> faster through future hardware acceleration.

Exactly.

As I just mentioned elsewhere, the cryptographers I talked to expect
SHA-256 and SHA3-256 to see the most focus of all hash functions, by far.

> E.g. Intel announced Goldmont last year which according to one SHA-1
> implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They
> only have acceleration for SHA-1 and SHA-256[2]
> 
> 1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385
> 
> 2. https://en.wikipedia.org/wiki/Goldmont

Thanks for digging that up! Very valuable information.

Ciao,
Dscho

Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Mike Hommey
On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote:
> Footnote *1*: SHA-256, as all hash functions whose output is essentially
> the entire internal state, are susceptible to a so-called "length
> extension attack", where the hash of a secret+message can be used to
> generate the hash of secret+message+piggyback without knowing the secret.
> This is not the case for Git: only visible data are hashed. The type of
> attacks Git has to worry about is very different from the length extension
> attacks, and it is highly unlikely that that weakness of SHA-256 leads to,
> say, a collision attack.

What do the experts think or SHA512/256, which completely removes the
concerns over length extension attack? (which I'd argue is better than
sweeping them under the carpet)

Mike


Re: [PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself

2017-06-15 Thread Ulrich Mueller
> On Thu, 15 Jun 2017, René Scharfe wrote:

> Callers can opt out for %Z by passing NULL as timezone name.  %z is
> always handled internally -- this helps on Windows, where strftime would
> expand it to a timezone name (same as %Z), in violation of POSIX.
> Modifiers are not handled, e.g. %Ez is still passed to strftime.

POSIX would also allow other things, like a field width:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html

$ date '+%8z'
+200

(But I believe that's not very useful, and supporting it might require
duplicating much of strftime's code.)

> Changes from v1:
> - Always handle %z internally.

Minor nitpick: Shouldn't the comment in strbuf.h be updated to reflect
that change?

> + * Add the time specified by `tm`, as formatted by `strftime`.  `tz_offset`
> + * and `tz_name` are used to expand %z and %Z internally, unless `tz_name`
> + * is NULL.  `tz_offset` is in decimal hhmm format, e.g. -600 means six
> + * hours west of Greenwich.

Ulrich


[PATCH] checkout: don't write merge results into the object database

2017-06-15 Thread René Scharfe
Merge results need to be written to the worktree, of course, but we
don't necessarily need object entries for them, especially if they
contain conflict markers.  Use pretend_sha1_file() to create fake
blobs to pass to make_cache_entry() and checkout_entry() instead.

Signed-off-by: Rene Scharfe 
---
 builtin/checkout.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 1624eed7e7..f51c15af9f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -215,7 +215,7 @@ static int checkout_merged(int pos, const struct checkout 
*state)
 
/*
 * NEEDSWORK:
-* There is absolutely no reason to write this as a blob object
+* There is absolutely no reason to build a fake blob object
 * and create a phony cache entry.  This hack is primarily to get
 * to the write_entry() machinery that massages the contents to
 * work-tree format and writes out which only allows it for a
@@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct checkout 
*state)
 * (it also writes the merge result to the object database even
 * when it may contain conflicts).
 */
-   if (write_sha1_file(result_buf.ptr, result_buf.size,
-   blob_type, oid.hash))
+   if (pretend_sha1_file(result_buf.ptr, result_buf.size,
+ OBJ_BLOB, oid.hash))
die(_("Unable to add merge result for '%s'"), path);
free(result_buf.ptr);
ce = make_cache_entry(mode, oid.hash, path, 2, 0);
-- 
2.13.0



Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread René Scharfe

Am 15.06.2017 um 07:56 schrieb Jeff King:

One interesting thing is that the cost of finding short hashes very much
depends on your loose object setup. I timed:

   git log --format=%H >/dev/null

versus

   git log --format=%h >/dev/null

on git.git. It went from about 400ms to about 800ms. But then I noticed
I had a lot of loose object directories, and ran "git gc --prune=now".
Afterwards, my timings were more like 380ms and 460ms.

The difference is that in the "before" case, we actually opened each
directory and ran getdents(). But after gc, the directories are gone
totally and open() fails. We also have to do a linear walk through the
objects in each directory, since the contents are sorted.


Do you mean "unsorted"?


So I wonder if it is worth trying to optimize the short-sha1 computation
in the first place. Double-%h aside, that would make _everything_
faster, including --oneline.


Right.


I'm not really sure how, though, short of caching the directory
contents. That opens up questions of whether and when to invalidate the
cache. If the cache were _just_ about short hashes, it might be OK to
just assume that it remains valid through the length of the program (so
worst case, a simultaneous write might mean that we generate a sha1
which just became ambiguous, but that's generally going to be racy
anyway).

The other downside of course is that we'd spend RAM on it. We could
bound the size of the cache, I suppose.


You mean like an in-memory pack index for loose objects?  In order to
avoid the readdir call in sha1_name.c::find_short_object_filename()?
We'd only need to keep the hashes of found objects.  An oid_array
would be quite compact.

Non-racy writes inside a process should not be ignored (write, read
later) -- e.g. checkout does something like that.

Can we trust object directory time stamps for cache invalidation?

René


Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch

2017-06-15 Thread Samuel Lijin
On Thu, Jun 15, 2017 at 4:42 AM, Jeff King  wrote:
>
> On Thu, Jun 15, 2017 at 01:49:20PM +0530, Kaartic Sivaraam wrote:
>
> > What about, "not making any assumptions" about what the user would
> > think when he views the output of `git status` ? Why not try some
> > general messages like,
> >
> > * Staged changes
> > * Unstaged changes
> >
> > instead of the messages such as
> >
> > * Changes to be committed, Changes already in the index
> > * Changes not staged for commit, Changes not yet in the index
> >
> > which seem to make assumptions about the user who view the output ?
>
> Saying just "staged changes" is definitely accurate. I don't know if
> some users would find that too terse, too. The phrase "not staged for
> commit" gives more information if you don't know what "staged" means in
> the Git world.

Perhaps there should be a message pointing people at documentation
explaining the index and staging terminology?

Offhand, this is something I was wondering about the other day - has
there ever been a discussion of what level of proficiency Git expects
of its users?

> -Peff


[BUG] GITK don't show unstaged changes

2017-06-15 Thread Clébio C . Felix

Details:  https://github.com/git-for-windows/git/issues/1203  

Version with bug: 2.13.1
Normal: 2.13.0


CCFelix


Re: [PATCH v2] strbuf: let strbuf_addftime handle %z and %Z itself

2017-06-15 Thread René Scharfe

Am 15.06.2017 um 13:27 schrieb Ulrich Mueller:

On Thu, 15 Jun 2017, René Scharfe wrote:



Callers can opt out for %Z by passing NULL as timezone name.  %z is
always handled internally -- this helps on Windows, where strftime would
expand it to a timezone name (same as %Z), in violation of POSIX.
Modifiers are not handled, e.g. %Ez is still passed to strftime.


POSIX would also allow other things, like a field width:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html

$ date '+%8z'
+200

(But I believe that's not very useful, and supporting it might require
duplicating much of strftime's code.)


Windows doesn't support that (unsurprisingly), but it accepts %#z,
which does the same as %z.  Let's wait for someone to request support
for modifiers and just document the behavior for now.


Changes from v1:
- Always handle %z internally.


Minor nitpick: Shouldn't the comment in strbuf.h be updated to reflect
that change?


+ * Add the time specified by `tm`, as formatted by `strftime`.  `tz_offset`
+ * and `tz_name` are used to expand %z and %Z internally, unless `tz_name`
+ * is NULL.  `tz_offset` is in decimal hhmm format, e.g. -600 means six
+ * hours west of Greenwich.


Yes, it should.  Thanks for paying attention! :)

René


[PATCH v3] strbuf: let strbuf_addftime handle %z and %Z itself

2017-06-15 Thread René Scharfe
There is no portable way to pass timezone information to strftime.  Add
parameters for timezone offset and name to strbuf_addftime and let it
handle the timezone-related format specifiers %z and %Z internally.

Callers can opt out for %Z by passing NULL as timezone name.  %z is
always handled internally -- this helps on Windows, where strftime would
expand it to a timezone name (same as %Z), in violation of POSIX.
Modifiers are not handled, e.g. %Ez is still passed to strftime.

Use an empty string as timezone name in show_date (the only current
caller) for now because we only have the timezone offset in non-local
mode.  POSIX allows %Z to resolve to an empty string in case of missing
information.

Helped-by: Ulrich Mueller 
Helped-by: Jeff King 
Signed-off-by: Rene Scharfe 
---
Changes from v3:
- Updated developer documentation in strbuf.h.
- Added short note to user documentation.

 Documentation/rev-list-options.txt |  3 ++-
 date.c |  2 +-
 strbuf.c   | 41 ++
 strbuf.h   | 10 --
 t/t0006-date.sh|  6 ++
 5 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a46f70c2b1..34ae2553f1 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -768,7 +768,8 @@ timezone value.
 1970).  As with `--raw`, this is always in UTC and therefore `-local`
 has no effect.
 +
-`--date=format:...` feeds the format `...` to your system `strftime`.
+`--date=format:...` feeds the format `...` to your system `strftime`,
+except for %z and %Z, which are handled internally.
 Use `--date=format:%c` to show the date in your system locale's
 preferred format.  See the `strftime` manual for a complete list of
 format placeholders. When using `-local`, the correct syntax is
diff --git a/date.c b/date.c
index 63fa99685e..5580577334 100644
--- a/date.c
+++ b/date.c
@@ -246,7 +246,7 @@ const char *show_date(timestamp_t time, int tz, const 
struct date_mode *mode)
month_names[tm->tm_mon], tm->tm_year + 1900,
tm->tm_hour, tm->tm_min, tm->tm_sec, tz);
else if (mode->type == DATE_STRFTIME)
-   strbuf_addftime(&timebuf, mode->strftime_fmt, tm);
+   strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, "");
else
strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d",
weekday_names[tm->tm_wday],
diff --git a/strbuf.c b/strbuf.c
index 00457940cf..be3b9e37b1 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -785,14 +785,48 @@ char *xstrfmt(const char *fmt, ...)
return ret;
 }
 
-void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
+void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+int tz_offset, const char *tz_name)
 {
+   struct strbuf munged_fmt = STRBUF_INIT;
size_t hint = 128;
size_t len;
 
if (!*fmt)
return;
 
+   /*
+* There is no portable way to pass timezone information to
+* strftime, so we handle %z and %Z here.
+*/
+   for (;;) {
+   const char *percent = strchrnul(fmt, '%');
+   strbuf_add(&munged_fmt, fmt, percent - fmt);
+   if (!*percent)
+   break;
+   fmt = percent + 1;
+   switch (*fmt) {
+   case '%':
+   strbuf_addstr(&munged_fmt, "%%");
+   fmt++;
+   break;
+   case 'z':
+   strbuf_addf(&munged_fmt, "%+05d", tz_offset);
+   fmt++;
+   break;
+   case 'Z':
+   if (tz_name) {
+   strbuf_addstr(&munged_fmt, tz_name);
+   fmt++;
+   break;
+   }
+   /* FALLTHROUGH */
+   default:
+   strbuf_addch(&munged_fmt, '%');
+   }
+   }
+   fmt = munged_fmt.buf;
+
strbuf_grow(sb, hint);
len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
 
@@ -804,17 +838,16 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, 
const struct tm *tm)
 * output contains at least one character, and then drop the 
extra
 * character before returning.
 */
-   struct strbuf munged_fmt = STRBUF_INIT;
-   strbuf_addf(&munged_fmt, "%s ", fmt);
+   strbuf_addch(&munged_fmt, ' ');
while (!len) {
hint *= 2;
strbuf_grow(sb, hint);
len = strftime(sb->buf + sb->len, sb->alloc - sb->len,

Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote:

> On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote:
> > Footnote *1*: SHA-256, as all hash functions whose output is essentially
> > the entire internal state, are susceptible to a so-called "length
> > extension attack", where the hash of a secret+message can be used to
> > generate the hash of secret+message+piggyback without knowing the secret.
> > This is not the case for Git: only visible data are hashed. The type of
> > attacks Git has to worry about is very different from the length extension
> > attacks, and it is highly unlikely that that weakness of SHA-256 leads to,
> > say, a collision attack.
> 
> What do the experts think or SHA512/256, which completely removes the
> concerns over length extension attack? (which I'd argue is better than
> sweeping them under the carpet)

I don't think it's sweeping them under the carpet. Git does not use the
hash as a MAC, so length extension attacks aren't a thing (and even if
we later wanted to use the same algorithm as a MAC, the HMAC
construction is a well-studied technique for dealing with it).

That said, SHA-512 is typically a little faster than SHA-256 on 64-bit
platforms. I don't know if that will change with the advent of hardware
instructions oriented towards SHA-256.

-Peff


Re: [BUG] git cherry-pick segfaults with local changes in working directory

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 01:37:36PM +0300, Sebastian Dröge wrote:

> > Note that the tarball doesn't have all the necessary objects. Its
> > .git/objects/info/alternates points to another full clone of
> > git://anongit.freedesktop.org/gstreamer/gst-plugins-good.
> 
> Ah good to know, I thought this only happens if you clone with
> --reference and not otherwise.

If you do a local-filesystem clone of a repository with alternates, the
clone will have the same alternates. So I'm guessing you may have done
such a clone of your --reference repository as part of preparing the
tarball.

-Peff


Re: [PATCH] wt-status.c: Modified status message shown for a parent-less branch

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 07:43:17AM -0400, Samuel Lijin wrote:

> > Saying just "staged changes" is definitely accurate. I don't know if
> > some users would find that too terse, too. The phrase "not staged for
> > commit" gives more information if you don't know what "staged" means in
> > the Git world.

Oops, I meant to say "too terse, though". But it sounds like you got my
meaning.

> Perhaps there should be a message pointing people at documentation
> explaining the index and staging terminology?

Maybe. I wouldn't want this message to get too verbose. People see it a
lot. There advice.statusHints message is already pretty verbose (though
I turned it off myself years ago).

> Offhand, this is something I was wondering about the other day - has
> there ever been a discussion of what level of proficiency Git expects
> of its users?

There have been lots of discussions, but none that I can think of as
definitive.

I think the general strategy these days is to try to give hints via
advise() for confusing situations, and to make it possible for expert
users to turn those off.

In general, I think using words from "git help glossary" is OK, but when
we can use plainer language without loss of precision, that seems like a
good idea. That's just my personal opinion, though.

-Peff


Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote:

> > The difference is that in the "before" case, we actually opened each
> > directory and ran getdents(). But after gc, the directories are gone
> > totally and open() fails. We also have to do a linear walk through the
> > objects in each directory, since the contents are sorted.
> 
> Do you mean "unsorted"?

Er yeah, sorry, I meant to say "aren't".

> > I'm not really sure how, though, short of caching the directory
> > contents. That opens up questions of whether and when to invalidate the
> > cache. If the cache were _just_ about short hashes, it might be OK to
> > just assume that it remains valid through the length of the program (so
> > worst case, a simultaneous write might mean that we generate a sha1
> > which just became ambiguous, but that's generally going to be racy
> > anyway).
> > 
> > The other downside of course is that we'd spend RAM on it. We could
> > bound the size of the cache, I suppose.
> 
> You mean like an in-memory pack index for loose objects?  In order to
> avoid the readdir call in sha1_name.c::find_short_object_filename()?
> We'd only need to keep the hashes of found objects.  An oid_array
> would be quite compact.

Yes, that's what I was thinking of.

> Non-racy writes inside a process should not be ignored (write, read
> later) -- e.g. checkout does something like that.

Ideally, yes. Though thinking on this more, it's racy even today,
because we rely on the in-memory packed_git list. We usually re-check the
directory only when we look for an object and it's missing. So any new
packs which have been added while the process runs will be missed when
doing short-sha1 lookups (and actually, find_short_packed_object does
not even seem to do the usual retry-on-miss).

A normal process like "checkout" isn't writing new packs, but a
simultaneous repack could be migrating its new objects to a pack behind
its back. (It also _can_ write packs, but only for large blobs).

Given that we are talking about finding "unique" abbreviations here, and
that those abbreviations can become invalidated immediately anyway as
more objects are added (even by the same process), I'm not sure we need
to strive for absolute accuracy.

> Can we trust object directory time stamps for cache invalidation?

I think it would work on POSIX-ish systems, since loose object changes
always involve new files, not modifications of existing ones. I don't
know if there are systems that don't update directory mtimes even for
newly added files.

That would still be a stat() per request. I'd be curious how the timing
compares to the opendir/readdir that happens now.

-Peff


Re: [PATCH v3] strbuf: let strbuf_addftime handle %z and %Z itself

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 02:29:53PM +0200, René Scharfe wrote:

> There is no portable way to pass timezone information to strftime.  Add
> parameters for timezone offset and name to strbuf_addftime and let it
> handle the timezone-related format specifiers %z and %Z internally.
> 
> Callers can opt out for %Z by passing NULL as timezone name.  %z is
> always handled internally -- this helps on Windows, where strftime would
> expand it to a timezone name (same as %Z), in violation of POSIX.
> Modifiers are not handled, e.g. %Ez is still passed to strftime.
> 
> Use an empty string as timezone name in show_date (the only current
> caller) for now because we only have the timezone offset in non-local
> mode.  POSIX allows %Z to resolve to an empty string in case of missing
> information.
> 
> Helped-by: Ulrich Mueller 
> Helped-by: Jeff King 
> Signed-off-by: Rene Scharfe 
> ---
> Changes from v3:
> - Updated developer documentation in strbuf.h.
> - Added short note to user documentation.

This looks good to me overall.

> diff --git a/t/t0006-date.sh b/t/t0006-date.sh
> index 42d4ea61ef..71082008f0 100755
> --- a/t/t0006-date.sh
> +++ b/t/t0006-date.sh
> @@ -51,6 +51,12 @@ check_show iso-local "$TIME" '2016-06-15 14:13:20 +'
>  check_show raw-local "$TIME" '146600 +'
>  check_show unix-local "$TIME" '146600'
>  
> +check_show 'format:%z' "$TIME" '+0200'
> +check_show 'format-local:%z' "$TIME" '+'
> +check_show 'format:%Z' "$TIME" ''
> +check_show 'format:%%z' "$TIME" '%z'
> +check_show 'format-local:%%z' "$TIME" '%z'

These check that the zone output is correct, but I don't think we ever
check that the value we feed to strftime is actually in the correct zone
in the first place (i.e., that %H shows the correct time).

I think that should go in a separate test from the %z/%Z handling, as
there are some subtleties.

So here are two patches on top of yours: more tests, and then the
format-local handling of %Z.

  [1/2]: t0006: check --date=format zone offsets
  [2/2]: date: use localtime() for "-local" time formats

 date.c  | 14 --
 t/t0006-date.sh | 10 --
 2 files changed, 20 insertions(+), 4 deletions(-)

-Peff


[PATCH 1/2] t0006: check --date=format zone offsets

2017-06-15 Thread Jeff King
We already test that "%z" and "%Z" show the right thing, but
we don't actually check that the time we display is the
correct one. Let's add two new tests:

  1. Test that "format:" shows the time in the author's
 timezone, just like the other time formats.

  2. Test that "format-local:" shows time in the local
 timezone. We don't want to use our normal UTC for this,
 because its offset is zero (so the result would be
 "correct" even if the code forgot to apply the offset
 or applied it in the wrong direction).

 We'll use the EST5 zone, which is already used
 elsewhere in the script (and so is assumed to be
 available everywhere).

Signed-off-by: Jeff King 
---
 t/t0006-date.sh | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 71082008f..9f81bec7a 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -31,9 +31,11 @@ check_show () {
format=$1
time=$2
expect=$3
-   test_expect_success $4 "show date ($format:$time)" '
+   prereqs=$4
+   zone=$5
+   test_expect_success $prereqs "show date ($format:$time)" '
echo "$time -> $expect" >expect &&
-   test-date show:$format "$time" >actual &&
+   TZ=${zone:-$TZ} test-date show:"$format" "$time" >actual &&
test_cmp expect actual
'
 }
@@ -57,6 +59,9 @@ check_show 'format:%Z' "$TIME" ''
 check_show 'format:%%z' "$TIME" '%z'
 check_show 'format-local:%%z' "$TIME" '%z'
 
+check_show 'format:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 16:13:20'
+check_show 'format-local:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 09:13:20' '' 
EST5
+
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
 check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" 
TIME_IS_64BIT,TIME_T_IS_64BIT
-- 
2.13.1.766.g6bea926c5



[PATCH 2/2] date: use localtime() for "-local" time formats

2017-06-15 Thread Jeff King
When we convert seconds-since-epochs timestamps into a
broken-down "struct tm", we do so by adjusting the timestamp
according to the known offset and then using gmtime() to
break down the result. This means that the resulting struct
"knows" that it's in GMT, even though the time it represents
is adjusted for a different zone. The fields where it stores
this data are not portably accessible, so we have no way to
override them to tell them the real zone info.

For the most part, this works. Our date-formatting routines
don't pay attention to these inaccessible fields, and use
the same tz info we provided for adjustment. The one
exception is when we call strftime(), whose %Z format
reveals this hidden timezone data.

We solved that by always showing the empty string for %Z.
This is allowed by POSIX, but not very helpful to the user.
We can't make this work in the general case, as there's no
portable function for setting an arbitrary timezone (and
anyway, we don't have the zone name for the author zones,
only their offsets).

But for the special case of the "-local" formats, we can
just skip the adjustment and use localtime() instead of
gmtime(). This makes --date=format-local:%Z work correctly,
showing the local timezone instead of an empty string.

The new test checks the result for "UTC", our default
test-lib value for $TZ. Using something like EST5 might be
more interesting, but the actual zone string is
system-dependent (for instance, on my system it expands to
just EST). Hopefully "UTC" is vanilla enough that every
system treats it the same.

Signed-off-by: Jeff King 
---
I don't have a Windows system to test this on, but from the output Dscho
provided earlier, I believe this should pass.

 date.c  | 14 --
 t/t0006-date.sh |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/date.c b/date.c
index 558057733..1fd6d6637 100644
--- a/date.c
+++ b/date.c
@@ -70,6 +70,12 @@ static struct tm *time_to_tm(timestamp_t time, int tz)
return gmtime(&t);
 }
 
+static struct tm *time_to_tm_local(timestamp_t time)
+{
+   time_t t = time;
+   return localtime(&t);
+}
+
 /*
  * What value of "tz" was in effect back then at "time" in the
  * local timezone?
@@ -214,7 +220,10 @@ const char *show_date(timestamp_t time, int tz, const 
struct date_mode *mode)
return timebuf.buf;
}
 
-   tm = time_to_tm(time, tz);
+   if (mode->local)
+   tm = time_to_tm_local(time);
+   else
+   tm = time_to_tm(time, tz);
if (!tm) {
tm = time_to_tm(0, 0);
tz = 0;
@@ -246,7 +255,8 @@ const char *show_date(timestamp_t time, int tz, const 
struct date_mode *mode)
month_names[tm->tm_mon], tm->tm_year + 1900,
tm->tm_hour, tm->tm_min, tm->tm_sec, tz);
else if (mode->type == DATE_STRFTIME)
-   strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz, "");
+   strbuf_addftime(&timebuf, mode->strftime_fmt, tm, tz,
+   mode->local ? NULL : "");
else
strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d",
weekday_names[tm->tm_wday],
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 9f81bec7a..7ac9466d5 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -56,6 +56,7 @@ check_show unix-local "$TIME" '146600'
 check_show 'format:%z' "$TIME" '+0200'
 check_show 'format-local:%z' "$TIME" '+'
 check_show 'format:%Z' "$TIME" ''
+check_show 'format-local:%Z' "$TIME" 'UTC'
 check_show 'format:%%z' "$TIME" '%z'
 check_show 'format-local:%%z' "$TIME" '%z'
 
-- 
2.13.1.766.g6bea926c5


Re: [PATCH] checkout: don't write merge results into the object database

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 01:33:42PM +0200, René Scharfe wrote:

> Merge results need to be written to the worktree, of course, but we
> don't necessarily need object entries for them, especially if they
> contain conflict markers.  Use pretend_sha1_file() to create fake
> blobs to pass to make_cache_entry() and checkout_entry() instead.

Conceptually this makes sense, although I have a comment below.

Out of curiosity, did this come up when looking into the cherry-pick
segfault/error from a few hours ago?

> @@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct checkout 
> *state)
>* (it also writes the merge result to the object database even
>* when it may contain conflicts).
>*/
> - if (write_sha1_file(result_buf.ptr, result_buf.size,
> - blob_type, oid.hash))
> + if (pretend_sha1_file(result_buf.ptr, result_buf.size,
> +   OBJ_BLOB, oid.hash))
>   die(_("Unable to add merge result for '%s'"), path);
>   free(result_buf.ptr);

I wondered if pretend_sha1_file() makes a copy of the buffer, and indeed
it does. So this is correct.

But that raises an interesting question: how big are these objects, and
is it a good idea to store them in RAM? Obviously they're in RAM
already, but I'm not sure if it's one-at-a-time. We could be bumping the
peak RAM used if there's a large number of these entries. Touching the
on-disk odb does feel hacky, but it also means they behave like other
objects.

If it is a concern, I think it could be solved by "unpretending" after
our call to checkout_entry completes. That would need a new call in
sha1_file.c, but it should be easy to write.

-Peff


Re: [PATCH] sub-process: fix comment about api-sub-process.txt

2017-06-15 Thread Ben Peart



On 6/14/2017 2:26 PM, Jonathan Nieder wrote:

Christian Couder wrote:


Subject: sub-process: fix comment about api-sub-process.txt


nit: this one-line description doesn't describe what was wrong and is
being fixed.  I think something like

sub-process: correct path to API docs in comment


Looks good to me.  Thanks for finding/fixing this.



would be easier to understand in "git log" output.


Signed-off-by: Christian Couder 
---
  sub-process.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


With or without such a tweak,
Reviewed-by: Jonathan Nieder 


diff --git a/sub-process.h b/sub-process.h
index 7d451e1cde..d9a45cd359 100644
--- a/sub-process.h
+++ b/sub-process.h
@@ -7,7 +7,7 @@
  
  /*

   * Generic implementation of background process infrastructure.
- * See Documentation/technical/api-background-process.txt.
+ * See: Documentation/technical/api-sub-process.txt
   */
  
   /* data structures */


[PATCH 00/28] Create a reference backend for packed refs

2017-06-15 Thread Michael Haggerty
This patch series continues the saga of picking apart the code for
handling packed references from the code for handling loose
references, all in preparation for making big changes to how the
packed-ref reading and writing works as described in [1]. As a
reminder, the final goal is to read the "packed-refs" file using mmap,
parsing it on the fly instead of storing it into an in-memory
`ref_cache`, and to read and parse only the parts of the file that are
actually needed, giving a big speedup for many operations in
repositories that have lots of refs.

In this episode, we create a `packed_ref_store` class, implementing
part of the `ref_store` API, that represents the packed references
within a repository. The `files_ref_store` now contains an instance of
`packed_ref_store` and delegates to it for the operations that have to
touch the packed refs.

After this patch series, `packed_ref_store` supports:

* Iteration
* `peel_ref`
* `pack_refs` (they're already packed, so it's a NOOP)
* `read_raw_ref`

A future patch series will add support for:

* Reference transactions (`transaction_prepare`, `transaction_finish`,
  `transaction_abort`, `initial_transaction_commit`)
* `delete_refs`

Operations that `packed_ref_store` will probably never support:

* `create_symref`
* `rename_ref` (could be supported, but is probably not useful)
* Reflog-related operations

In addition, all of the packed-refs related code has been moved to a
new module, "refs/packed-backend.{c,h}". This includes some functions
that are still called by `files_ref_store` directly to update the
packed refs.

The patch series is long, but I think relatively straightforward. In
particular, patches 2-14 are quite mechanical. Its main point is to
separate concerns, but it does bring one end-user advantage: if there
is a problem parsing the "packed-refs" file, we now report an error
and die. The old code just ignored lines that it didn't understand.

I've developed these patches on top of master plus the following
patches, which are followups to mh/packed-refs-store-prep:

* lock_packed_refs(): fix cache validity check
* for_each_bisect_ref(): don't trim refnames

The patches can also be obtained from my GitHub fork [2] as branch
"packed-ref-store".

Michael

[1] http://public-inbox.org/git/cover.1490026594.git.mhag...@alum.mit.edu/
[2] https://github.com/mhagger/git

Michael Haggerty (28):
  add_packed_ref(): teach function to overwrite existing refs
  packed_ref_store: new struct
  packed_ref_store: move `packed_refs_path` here
  packed_ref_store: move `packed_refs_lock` member here
  clear_packed_ref_cache(): take a `packed_ref_store *` parameter
  validate_packed_ref_cache(): take a `packed_ref_store *` parameter
  get_packed_ref_cache(): take a `packed_ref_store *` parameter
  get_packed_refs(): take a `packed_ref_store *` parameter
  add_packed_ref(): take a `packed_ref_store *` parameter
  lock_packed_refs(): take a `packed_ref_store *` parameter
  commit_packed_refs(): take a `packed_ref_store *` parameter
  rollback_packed_refs(): take a `packed_ref_store *` parameter
  get_packed_ref(): take a `packed_ref_store *` parameter
  repack_without_refs(): take a `packed_ref_store *` parameter
  packed_peel_ref(): new function, extracted from `files_peel_ref()`
  packed_ref_store: support iteration
  packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
  packed-backend: new module for handling packed references
  packed_ref_store: make class into a subclass of `ref_store`
  commit_packed_refs(): report errors rather than dying
  commit_packed_refs(): use a staging file separate from the lockfile
  packed_refs_lock(): function renamed from lock_packed_refs()
  packed_refs_lock(): report errors via a `struct strbuf *err`
  packed_refs_unlock(), packed_refs_is_locked(): new functions
  clear_packed_ref_cache(): don't protest if the lock is held
  commit_packed_refs(): remove call to `packed_refs_unlock()`
  repack_without_refs(): don't lock or unlock the packed refs
  read_packed_refs(): die if `packed-refs` contains bogus data

 Makefile  |   1 +
 refs.c|  18 ++
 refs/files-backend.c  | 619 ---
 refs/packed-backend.c | 868 ++
 refs/packed-backend.h |  25 ++
 refs/refs-internal.h  |  10 +
 6 files changed, 981 insertions(+), 560 deletions(-)
 create mode 100644 refs/packed-backend.c
 create mode 100644 refs/packed-backend.h

-- 
2.11.0



[PATCH 01/28] add_packed_ref(): teach function to overwrite existing refs

2017-06-15 Thread Michael Haggerty
Teach `add_packed_ref()` to overwrite an existing entry if one already
exists for the specified `refname`. This means that we can call it
from `files_pack_refs()`, thereby reducing the amount that the latter
function needs to know about the internals of packed-reference
handling.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 40 ++--
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index b040bb3b0a..87cecde231 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -413,15 +413,16 @@ static struct ref_dir *get_packed_refs(struct 
files_ref_store *refs)
 }
 
 /*
- * Add a reference to the in-memory packed reference cache.  This may
- * only be called while the packed-refs file is locked (see
- * lock_packed_refs()).  To actually write the packed-refs file, call
- * commit_packed_refs().
+ * Add or overwrite a reference in the in-memory packed reference
+ * cache. This may only be called while the packed-refs file is locked
+ * (see lock_packed_refs()). To actually write the packed-refs file,
+ * call commit_packed_refs().
  */
 static void add_packed_ref(struct files_ref_store *refs,
   const char *refname, const struct object_id *oid)
 {
-   struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs);
+   struct ref_dir *packed_refs;
+   struct ref_entry *packed_entry;
 
if (!is_lock_file_locked(&refs->packed_refs_lock))
die("BUG: packed refs not locked");
@@ -429,8 +430,17 @@ static void add_packed_ref(struct files_ref_store *refs,
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
die("Reference has invalid format: '%s'", refname);
 
-   add_ref_entry(get_packed_ref_dir(packed_ref_cache),
- create_ref_entry(refname, oid, REF_ISPACKED));
+   packed_refs = get_packed_refs(refs);
+   packed_entry = find_ref_entry(packed_refs, refname);
+   if (packed_entry) {
+   /* Overwrite the existing entry: */
+   oidcpy(&packed_entry->u.value.oid, oid);
+   packed_entry->flag = REF_ISPACKED;
+   oidclr(&packed_entry->u.value.peeled);
+   } else {
+   packed_entry = create_ref_entry(refname, oid, REF_ISPACKED);
+   add_ref_entry(packed_refs, packed_entry);
+   }
 }
 
 /*
@@ -1526,12 +1536,10 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB,
   "pack_refs");
struct ref_iterator *iter;
-   struct ref_dir *packed_refs;
int ok;
struct ref_to_prune *refs_to_prune = NULL;
 
lock_packed_refs(refs, LOCK_DIE_ON_ERROR);
-   packed_refs = get_packed_refs(refs);
 
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
@@ -1540,8 +1548,6 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
 * in the packed ref cache. If the reference should be
 * pruned, also add it to refs_to_prune.
 */
-   struct ref_entry *packed_entry;
-
if (!should_pack_ref(iter->refname, iter->oid, iter->flags,
 flags))
continue;
@@ -1552,17 +1558,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
 * we don't copy the peeled status, because we want it
 * to be re-peeled.
 */
-   packed_entry = find_ref_entry(packed_refs, iter->refname);
-   if (packed_entry) {
-   /* Overwrite existing packed entry with info from loose 
entry */
-   packed_entry->flag = REF_ISPACKED;
-   oidcpy(&packed_entry->u.value.oid, iter->oid);
-   } else {
-   packed_entry = create_ref_entry(iter->refname, 
iter->oid,
-   REF_ISPACKED);
-   add_ref_entry(packed_refs, packed_entry);
-   }
-   oidclr(&packed_entry->u.value.peeled);
+   add_packed_ref(refs, iter->refname, iter->oid);
 
/* Schedule the loose reference for pruning if requested. */
if ((flags & PACK_REFS_PRUNE)) {
-- 
2.11.0



[PATCH 03/28] packed_ref_store: move `packed_refs_path` here

2017-06-15 Thread Michael Haggerty
Move `packed_refs_path` from `files_ref_store` to `packed_ref_store`,
and rename it to `path` since its meaning is clear from its new
context.

Inline `files_packed_refs_path()`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2efb71cee9..c4b8e2f63b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -54,6 +54,9 @@ struct packed_ref_cache {
 struct packed_ref_store {
unsigned int store_flags;
 
+   /* The path of the "packed-refs" file: */
+   char *path;
+
/*
 * A cache of the values read from the `packed-refs` file, if
 * it might still be current; otherwise, NULL.
@@ -61,11 +64,13 @@ struct packed_ref_store {
struct packed_ref_cache *cache;
 };
 
-static struct packed_ref_store *packed_ref_store_create(unsigned int 
store_flags)
+static struct packed_ref_store *packed_ref_store_create(
+   const char *path, unsigned int store_flags)
 {
struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
 
refs->store_flags = store_flags;
+   refs->path = xstrdup(path);
return refs;
 }
 
@@ -79,7 +84,6 @@ struct files_ref_store {
 
char *gitdir;
char *gitcommondir;
-   char *packed_refs_path;
 
struct ref_cache *loose;
 
@@ -154,8 +158,8 @@ static struct ref_store *files_ref_store_create(const char 
*gitdir,
get_common_dir_noenv(&sb, gitdir);
refs->gitcommondir = strbuf_detach(&sb, NULL);
strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir);
-   refs->packed_refs_path = strbuf_detach(&sb, NULL);
-   refs->packed_ref_store = packed_ref_store_create(flags);
+   refs->packed_ref_store = packed_ref_store_create(sb.buf, flags);
+   strbuf_release(&sb);
 
return ref_store;
 }
@@ -343,11 +347,6 @@ static struct packed_ref_cache *read_packed_refs(const 
char *packed_refs_file)
return packed_refs;
 }
 
-static const char *files_packed_refs_path(struct files_ref_store *refs)
-{
-   return refs->packed_refs_path;
-}
-
 static void files_reflog_path(struct files_ref_store *refs,
  struct strbuf *sb,
  const char *refname)
@@ -401,7 +400,7 @@ static void validate_packed_ref_cache(struct 
files_ref_store *refs)
 {
if (refs->packed_ref_store->cache &&
!stat_validity_check(&refs->packed_ref_store->cache->validity,
-files_packed_refs_path(refs)))
+refs->packed_ref_store->path))
clear_packed_ref_cache(refs);
 }
 
@@ -415,7 +414,7 @@ static void validate_packed_ref_cache(struct 
files_ref_store *refs)
  */
 static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store 
*refs)
 {
-   const char *packed_refs_file = files_packed_refs_path(refs);
+   const char *packed_refs_file = refs->packed_ref_store->path;
 
if (!is_lock_file_locked(&refs->packed_refs_lock))
validate_packed_ref_cache(refs);
@@ -1352,7 +1351,7 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
}
 
if (hold_lock_file_for_update_timeout(
-   &refs->packed_refs_lock, 
files_packed_refs_path(refs),
+   &refs->packed_refs_lock, 
refs->packed_ref_store->path,
flags, timeout_value) < 0)
return -1;
 
@@ -1633,7 +1632,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
return 0; /* no refname exists in packed refs */
 
if (lock_packed_refs(refs, 0)) {
-   unable_to_lock_message(files_packed_refs_path(refs), errno, 
err);
+   unable_to_lock_message(refs->packed_ref_store->path, errno, 
err);
return -1;
}
packed = get_packed_refs(refs);
-- 
2.11.0



[PATCH 07/28] get_packed_ref_cache(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f061506bf0..b2ef7b3bb9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -404,24 +404,22 @@ static void validate_packed_ref_cache(struct 
packed_ref_store *refs)
 }
 
 /*
- * Get the packed_ref_cache for the specified files_ref_store,
+ * Get the packed_ref_cache for the specified packed_ref_store,
  * creating and populating it if it hasn't been read before or if the
  * file has been changed (according to its `validity` field) since it
  * was last read. On the other hand, if we hold the lock, then assume
  * that the file hasn't been changed out from under us, so skip the
  * extra `stat()` call in `stat_validity_check()`.
  */
-static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store 
*refs)
+static struct packed_ref_cache *get_packed_ref_cache(struct packed_ref_store 
*refs)
 {
-   const char *packed_refs_file = refs->packed_ref_store->path;
+   if (!is_lock_file_locked(&refs->lock))
+   validate_packed_ref_cache(refs);
 
-   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
-   validate_packed_ref_cache(refs->packed_ref_store);
-
-   if (!refs->packed_ref_store->cache)
-   refs->packed_ref_store->cache = 
read_packed_refs(packed_refs_file);
+   if (!refs->cache)
+   refs->cache = read_packed_refs(refs->path);
 
-   return refs->packed_ref_store->cache;
+   return refs->cache;
 }
 
 static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache 
*packed_ref_cache)
@@ -431,7 +429,7 @@ static struct ref_dir *get_packed_ref_dir(struct 
packed_ref_cache *packed_ref_ca
 
 static struct ref_dir *get_packed_refs(struct files_ref_store *refs)
 {
-   return get_packed_ref_dir(get_packed_ref_cache(refs));
+   return get_packed_ref_dir(get_packed_ref_cache(refs->packed_ref_store));
 }
 
 /*
@@ -1151,7 +1149,7 @@ static struct ref_iterator *files_ref_iterator_begin(
loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
  prefix, 1);
 
-   iter->packed_ref_cache = get_packed_ref_cache(refs);
+   iter->packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store);
acquire_packed_ref_cache(iter->packed_ref_cache);
packed_iter = cache_ref_iterator_begin(iter->packed_ref_cache->cache,
   prefix, 0);
@@ -1365,7 +1363,7 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
 */
validate_packed_ref_cache(refs->packed_ref_store);
 
-   packed_ref_cache = get_packed_ref_cache(refs);
+   packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store);
/* Increment the reference count to prevent it from being freed: */
acquire_packed_ref_cache(packed_ref_cache);
return 0;
@@ -1380,7 +1378,7 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
 static int commit_packed_refs(struct files_ref_store *refs)
 {
struct packed_ref_cache *packed_ref_cache =
-   get_packed_ref_cache(refs);
+   get_packed_ref_cache(refs->packed_ref_store);
int ok, error = 0;
int save_errno = 0;
FILE *out;
@@ -1426,7 +1424,7 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
 static void rollback_packed_refs(struct files_ref_store *refs)
 {
struct packed_ref_cache *packed_ref_cache =
-   get_packed_ref_cache(refs);
+   get_packed_ref_cache(refs->packed_ref_store);
 
files_assert_main_repository(refs, "rollback_packed_refs");
 
-- 
2.11.0



[PATCH 09/28] add_packed_ref(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bc5c0de84e..4943207098 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -438,19 +438,19 @@ static struct ref_dir *get_packed_refs(struct 
packed_ref_store *refs)
  * (see lock_packed_refs()). To actually write the packed-refs file,
  * call commit_packed_refs().
  */
-static void add_packed_ref(struct files_ref_store *refs,
+static void add_packed_ref(struct packed_ref_store *refs,
   const char *refname, const struct object_id *oid)
 {
struct ref_dir *packed_refs;
struct ref_entry *packed_entry;
 
-   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
+   if (!is_lock_file_locked(&refs->lock))
die("BUG: packed refs not locked");
 
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
die("Reference has invalid format: '%s'", refname);
 
-   packed_refs = get_packed_refs(refs->packed_ref_store);
+   packed_refs = get_packed_refs(refs);
packed_entry = find_ref_entry(packed_refs, refname);
if (packed_entry) {
/* Overwrite the existing entry: */
@@ -1579,7 +1579,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
 * we don't copy the peeled status, because we want it
 * to be re-peeled.
 */
-   add_packed_ref(refs, iter->refname, iter->oid);
+   add_packed_ref(refs->packed_ref_store, iter->refname, 
iter->oid);
 
/* Schedule the loose reference for pruning if requested. */
if ((flags & PACK_REFS_PRUNE)) {
@@ -3210,7 +3210,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
 
if ((update->flags & REF_HAVE_NEW) &&
!is_null_oid(&update->new_oid))
-   add_packed_ref(refs, update->refname,
+   add_packed_ref(refs->packed_ref_store, update->refname,
   &update->new_oid);
}
 
-- 
2.11.0



[PATCH 08/28] get_packed_refs(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index b2ef7b3bb9..bc5c0de84e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -427,9 +427,9 @@ static struct ref_dir *get_packed_ref_dir(struct 
packed_ref_cache *packed_ref_ca
return get_ref_dir(packed_ref_cache->cache->root);
 }
 
-static struct ref_dir *get_packed_refs(struct files_ref_store *refs)
+static struct ref_dir *get_packed_refs(struct packed_ref_store *refs)
 {
-   return get_packed_ref_dir(get_packed_ref_cache(refs->packed_ref_store));
+   return get_packed_ref_dir(get_packed_ref_cache(refs));
 }
 
 /*
@@ -450,7 +450,7 @@ static void add_packed_ref(struct files_ref_store *refs,
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
die("Reference has invalid format: '%s'", refname);
 
-   packed_refs = get_packed_refs(refs);
+   packed_refs = get_packed_refs(refs->packed_ref_store);
packed_entry = find_ref_entry(packed_refs, refname);
if (packed_entry) {
/* Overwrite the existing entry: */
@@ -592,7 +592,7 @@ static struct ref_cache *get_loose_ref_cache(struct 
files_ref_store *refs)
 static struct ref_entry *get_packed_ref(struct files_ref_store *refs,
const char *refname)
 {
-   return find_ref_entry(get_packed_refs(refs), refname);
+   return find_ref_entry(get_packed_refs(refs->packed_ref_store), refname);
 }
 
 /*
@@ -1633,7 +1633,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
unable_to_lock_message(refs->packed_ref_store->path, errno, 
err);
return -1;
}
-   packed = get_packed_refs(refs);
+   packed = get_packed_refs(refs->packed_ref_store);
 
/* Remove refnames from the cache */
for_each_string_list_item(refname, refnames)
-- 
2.11.0



[PATCH 04/28] packed_ref_store: move `packed_refs_lock` member here

2017-06-15 Thread Michael Haggerty
Move the `packed_refs_lock` member from `files_ref_store` to
`packed_ref_store`, and rename it to `lock` since it's now more
obvious what it is locking.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index c4b8e2f63b..de8293493f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -62,6 +62,12 @@ struct packed_ref_store {
 * it might still be current; otherwise, NULL.
 */
struct packed_ref_cache *cache;
+
+   /*
+* Lock used for the "packed-refs" file. Note that this (and
+* thus the enclosing `packed_ref_store`) must not be freed.
+*/
+   struct lock_file lock;
 };
 
 static struct packed_ref_store *packed_ref_store_create(
@@ -87,12 +93,6 @@ struct files_ref_store {
 
struct ref_cache *loose;
 
-   /*
-* Lock used for the "packed-refs" file. Note that this (and
-* thus the enclosing `files_ref_store`) must not be freed.
-*/
-   struct lock_file packed_refs_lock;
-
struct packed_ref_store *packed_ref_store;
 };
 
@@ -125,7 +125,7 @@ static void clear_packed_ref_cache(struct files_ref_store 
*refs)
if (refs->packed_ref_store->cache) {
struct packed_ref_cache *packed_refs = 
refs->packed_ref_store->cache;
 
-   if (is_lock_file_locked(&refs->packed_refs_lock))
+   if (is_lock_file_locked(&refs->packed_ref_store->lock))
die("BUG: packed-ref cache cleared while locked");
refs->packed_ref_store->cache = NULL;
release_packed_ref_cache(packed_refs);
@@ -416,7 +416,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct 
files_ref_store *ref
 {
const char *packed_refs_file = refs->packed_ref_store->path;
 
-   if (!is_lock_file_locked(&refs->packed_refs_lock))
+   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
validate_packed_ref_cache(refs);
 
if (!refs->packed_ref_store->cache)
@@ -447,7 +447,7 @@ static void add_packed_ref(struct files_ref_store *refs,
struct ref_dir *packed_refs;
struct ref_entry *packed_entry;
 
-   if (!is_lock_file_locked(&refs->packed_refs_lock))
+   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
die("BUG: packed refs not locked");
 
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
@@ -1351,7 +1351,8 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
}
 
if (hold_lock_file_for_update_timeout(
-   &refs->packed_refs_lock, 
refs->packed_ref_store->path,
+   &refs->packed_ref_store->lock,
+   refs->packed_ref_store->path,
flags, timeout_value) < 0)
return -1;
 
@@ -1388,10 +1389,10 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
 
files_assert_main_repository(refs, "commit_packed_refs");
 
-   if (!is_lock_file_locked(&refs->packed_refs_lock))
+   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
die("BUG: packed-refs not locked");
 
-   out = fdopen_lock_file(&refs->packed_refs_lock, "w");
+   out = fdopen_lock_file(&refs->packed_ref_store->lock, "w");
if (!out)
die_errno("unable to fdopen packed-refs descriptor");
 
@@ -1409,7 +1410,7 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
if (ok != ITER_DONE)
die("error while iterating over references");
 
-   if (commit_lock_file(&refs->packed_refs_lock)) {
+   if (commit_lock_file(&refs->packed_ref_store->lock)) {
save_errno = errno;
error = -1;
}
@@ -1430,9 +1431,9 @@ static void rollback_packed_refs(struct files_ref_store 
*refs)
 
files_assert_main_repository(refs, "rollback_packed_refs");
 
-   if (!is_lock_file_locked(&refs->packed_refs_lock))
+   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
die("BUG: packed-refs not locked");
-   rollback_lock_file(&refs->packed_refs_lock);
+   rollback_lock_file(&refs->packed_ref_store->lock);
release_packed_ref_cache(packed_ref_cache);
clear_packed_ref_cache(refs);
 }
-- 
2.11.0



[PATCH 16/28] packed_ref_store: support iteration

2017-06-15 Thread Michael Haggerty
Add the infrastructure to iterate over a `packed_ref_store`. It's a
lot of boilerplate, but it's all part of a campaign to make
`packed_ref_store` implement `ref_store`. In the future, this iterator
will work much differently.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 107 ++-
 1 file changed, 98 insertions(+), 9 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 185d05e1d6..e57cdeba36 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1062,10 +1062,102 @@ static int files_peel_ref(struct ref_store *ref_store,
return peel_object(base, sha1);
 }
 
+struct packed_ref_iterator {
+   struct ref_iterator base;
+
+   struct packed_ref_cache *cache;
+   struct ref_iterator *iter0;
+   unsigned int flags;
+};
+
+static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+   struct packed_ref_iterator *iter =
+   (struct packed_ref_iterator *)ref_iterator;
+   int ok;
+
+   while ((ok = ref_iterator_advance(iter->iter0)) == ITER_OK) {
+   if (iter->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+   ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
+   continue;
+
+   if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+   !ref_resolves_to_object(iter->iter0->refname,
+   iter->iter0->oid,
+   iter->iter0->flags))
+   continue;
+
+   iter->base.refname = iter->iter0->refname;
+   iter->base.oid = iter->iter0->oid;
+   iter->base.flags = iter->iter0->flags;
+   return ITER_OK;
+   }
+
+   iter->iter0 = NULL;
+   if (ref_iterator_abort(ref_iterator) != ITER_DONE)
+   ok = ITER_ERROR;
+
+   return ok;
+}
+
+static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
+  struct object_id *peeled)
+{
+   struct packed_ref_iterator *iter =
+   (struct packed_ref_iterator *)ref_iterator;
+
+   return ref_iterator_peel(iter->iter0, peeled);
+}
+
+static int packed_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+   struct packed_ref_iterator *iter =
+   (struct packed_ref_iterator *)ref_iterator;
+   int ok = ITER_DONE;
+
+   if (iter->iter0)
+   ok = ref_iterator_abort(iter->iter0);
+
+   release_packed_ref_cache(iter->cache);
+   base_ref_iterator_free(ref_iterator);
+   return ok;
+}
+
+static struct ref_iterator_vtable packed_ref_iterator_vtable = {
+   packed_ref_iterator_advance,
+   packed_ref_iterator_peel,
+   packed_ref_iterator_abort
+};
+
+static struct ref_iterator *packed_ref_iterator_begin(
+   struct packed_ref_store *refs,
+   const char *prefix, unsigned int flags)
+{
+   struct packed_ref_iterator *iter;
+   struct ref_iterator *ref_iterator;
+
+   iter = xcalloc(1, sizeof(*iter));
+   ref_iterator = &iter->base;
+   base_ref_iterator_init(ref_iterator, &packed_ref_iterator_vtable);
+
+   /*
+* Note that get_packed_ref_cache() internally checks whether
+* the packed-ref cache is up to date with what is on disk,
+* and re-reads it if not.
+*/
+
+   iter->cache = get_packed_ref_cache(refs);
+   acquire_packed_ref_cache(iter->cache);
+   iter->iter0 = cache_ref_iterator_begin(iter->cache->cache, prefix, 0);
+
+   iter->flags = flags;
+
+   return ref_iterator;
+}
+
 struct files_ref_iterator {
struct ref_iterator base;
 
-   struct packed_ref_cache *packed_ref_cache;
struct ref_iterator *iter0;
unsigned int flags;
 };
@@ -1118,7 +1210,6 @@ static int files_ref_iterator_abort(struct ref_iterator 
*ref_iterator)
if (iter->iter0)
ok = ref_iterator_abort(iter->iter0);
 
-   release_packed_ref_cache(iter->packed_ref_cache);
base_ref_iterator_free(ref_iterator);
return ok;
 }
@@ -1160,18 +1251,16 @@ static struct ref_iterator *files_ref_iterator_begin(
 * (If they've already been read, that's OK; we only need to
 * guarantee that they're read before the packed refs, not
 * *how much* before.) After that, we call
-* get_packed_ref_cache(), which internally checks whether the
-* packed-ref cache is up to date with what is on disk, and
-* re-reads it if not.
+* packed_ref_iterator_begin(), which internally checks
+* whether the packed-ref cache is up to date with what is on
+* disk, and re-reads it if not.
 */
 
loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
  prefix, 1);
 
-   iter->packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store);
-   

[PATCH 10/28] lock_packed_refs(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4943207098..0d8f39089f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -80,6 +80,19 @@ static struct packed_ref_store *packed_ref_store_create(
return refs;
 }
 
+/*
+ * Die if refs is not the main ref store. caller is used in any
+ * necessary error messages.
+ */
+static void packed_assert_main_repository(struct packed_ref_store *refs,
+ const char *caller)
+{
+   if (refs->store_flags & REF_STORE_MAIN)
+   return;
+
+   die("BUG: operation %s only allowed for main ref store", caller);
+}
+
 /*
  * Future: need to be in "struct repository"
  * when doing a full libification.
@@ -1334,13 +1347,13 @@ static void write_packed_entry(FILE *fh, const char 
*refname,
  * hold_lock_file_for_update(). Return 0 on success. On errors, set
  * errno appropriately and return a nonzero value.
  */
-static int lock_packed_refs(struct files_ref_store *refs, int flags)
+static int lock_packed_refs(struct packed_ref_store *refs, int flags)
 {
static int timeout_configured = 0;
static int timeout_value = 1000;
struct packed_ref_cache *packed_ref_cache;
 
-   files_assert_main_repository(refs, "lock_packed_refs");
+   packed_assert_main_repository(refs, "lock_packed_refs");
 
if (!timeout_configured) {
git_config_get_int("core.packedrefstimeout", &timeout_value);
@@ -1348,8 +1361,8 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
}
 
if (hold_lock_file_for_update_timeout(
-   &refs->packed_ref_store->lock,
-   refs->packed_ref_store->path,
+   &refs->lock,
+   refs->path,
flags, timeout_value) < 0)
return -1;
 
@@ -1361,9 +1374,9 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
 * cache is still valid. We've just locked the file, but it
 * might have changed the moment *before* we locked it.
 */
-   validate_packed_ref_cache(refs->packed_ref_store);
+   validate_packed_ref_cache(refs);
 
-   packed_ref_cache = get_packed_ref_cache(refs->packed_ref_store);
+   packed_ref_cache = get_packed_ref_cache(refs);
/* Increment the reference count to prevent it from being freed: */
acquire_packed_ref_cache(packed_ref_cache);
return 0;
@@ -1560,7 +1573,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
int ok;
struct ref_to_prune *refs_to_prune = NULL;
 
-   lock_packed_refs(refs, LOCK_DIE_ON_ERROR);
+   lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR);
 
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
@@ -1629,7 +1642,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
if (!needs_repacking)
return 0; /* no refname exists in packed refs */
 
-   if (lock_packed_refs(refs, 0)) {
+   if (lock_packed_refs(refs->packed_ref_store, 0)) {
unable_to_lock_message(refs->packed_ref_store->path, errno, 
err);
return -1;
}
@@ -3198,7 +3211,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
}
}
 
-   if (lock_packed_refs(refs, 0)) {
+   if (lock_packed_refs(refs->packed_ref_store, 0)) {
strbuf_addf(err, "unable to lock packed-refs file: %s",
strerror(errno));
ret = TRANSACTION_GENERIC_ERROR;
-- 
2.11.0



[PATCH 15/28] packed_peel_ref(): new function, extracted from `files_peel_ref()`

2017-06-15 Thread Michael Haggerty
This will later become a method of `packed_ref_store`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index c206791b91..185d05e1d6 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1013,6 +1013,18 @@ static int lock_raw_ref(struct files_ref_store *refs,
return ret;
 }
 
+static int packed_peel_ref(struct packed_ref_store *refs,
+  const char *refname, unsigned char *sha1)
+{
+   struct ref_entry *r = get_packed_ref(refs, refname);
+
+   if (!r || peel_entry(r, 0))
+   return -1;
+
+   hashcpy(sha1, r->u.value.peeled.hash);
+   return 0;
+}
+
 static int files_peel_ref(struct ref_store *ref_store,
  const char *refname, unsigned char *sha1)
 {
@@ -1043,17 +1055,9 @@ static int files_peel_ref(struct ref_store *ref_store,
 * be expensive and (b) loose references anyway usually do not
 * have REF_KNOWS_PEELED.
 */
-   if (flag & REF_ISPACKED) {
-   struct ref_entry *r =
-   get_packed_ref(refs->packed_ref_store, refname);
-
-   if (r) {
-   if (peel_entry(r, 0))
-   return -1;
-   hashcpy(sha1, r->u.value.peeled.hash);
-   return 0;
-   }
-   }
+   if (flag & REF_ISPACKED &&
+   !packed_peel_ref(refs->packed_ref_store, refname, sha1))
+   return 0;
 
return peel_object(base, sha1);
 }
-- 
2.11.0



[PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-06-15 Thread Michael Haggerty
We will want to be able to hold the lockfile for `packed-refs` even
after we have activated the new values. So use a separate tempfile,
`packed-refs.new`, as a place to stage the new contents of the
`packed-refs` file. For now this is all done within
`commit_packed_refs()`, but that will change shortly.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 5bee49d497..6619052e96 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -68,6 +68,13 @@ struct packed_ref_store {
 * thus the enclosing `packed_ref_store`) must not be freed.
 */
struct lock_file lock;
+
+   /*
+* Temporary file used when rewriting new contents to the
+* "packed-refs" file. Note that this (and thus the enclosing
+* `packed_ref_store`) must not be freed.
+*/
+   struct tempfile tempfile;
 };
 
 struct ref_store *packed_ref_store_create(const char *path,
@@ -522,10 +529,16 @@ int lock_packed_refs(struct ref_store *ref_store, int 
flags)
timeout_configured = 1;
}
 
+   /*
+* Note that we close the lockfile immediately because we
+* don't write new content to it, but rather to a separate
+* tempfile.
+*/
if (hold_lock_file_for_update_timeout(
&refs->lock,
refs->path,
-   flags, timeout_value) < 0)
+   flags, timeout_value) < 0 ||
+   close_lock_file(&refs->lock))
return -1;
 
/*
@@ -567,13 +580,23 @@ int commit_packed_refs(struct ref_store *ref_store, 
struct strbuf *err)
get_packed_ref_cache(refs);
int ok;
int ret = -1;
+   struct strbuf sb = STRBUF_INIT;
FILE *out;
struct ref_iterator *iter;
 
if (!is_lock_file_locked(&refs->lock))
die("BUG: commit_packed_refs() called when unlocked");
 
-   out = fdopen_lock_file(&refs->lock, "w");
+   strbuf_addf(&sb, "%s.new", refs->path);
+   if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
+   strbuf_addf(err, "unable to create file %s: %s",
+   sb.buf, strerror(errno));
+   strbuf_release(&sb);
+   goto out;
+   }
+   strbuf_release(&sb);
+
+   out = fdopen_tempfile(&refs->tempfile, "w");
if (!out) {
strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
strerror(errno));
@@ -582,7 +605,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
 
if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) {
strbuf_addf(err, "error writing to %s: %s",
-   get_lock_file_path(&refs->lock), strerror(errno));
+   get_tempfile_path(&refs->tempfile), 
strerror(errno));
goto error;
}
 
@@ -594,7 +617,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
if (write_packed_entry(out, iter->refname, iter->oid->hash,
   peel_error ? NULL : peeled.hash)) {
strbuf_addf(err, "error writing to %s: %s",
-   get_lock_file_path(&refs->lock),
+   get_tempfile_path(&refs->tempfile),
strerror(errno));
ref_iterator_abort(iter);
goto error;
@@ -602,13 +625,13 @@ int commit_packed_refs(struct ref_store *ref_store, 
struct strbuf *err)
}
 
if (ok != ITER_DONE) {
-   strbuf_addf(err, "unable to write packed-refs file: "
+   strbuf_addf(err, "unable to rewrite packed-refs file: "
"error iterating over old contents");
goto error;
}
 
-   if (commit_lock_file(&refs->lock)) {
-   strbuf_addf(err, "error overwriting %s: %s",
+   if (rename_tempfile(&refs->tempfile, refs->path)) {
+   strbuf_addf(err, "error replacing %s: %s",
refs->path, strerror(errno));
goto out;
}
@@ -617,9 +640,10 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
goto out;
 
 error:
-   rollback_lock_file(&refs->lock);
+   delete_tempfile(&refs->tempfile);
 
 out:
+   rollback_lock_file(&refs->lock);
release_packed_ref_cache(packed_ref_cache);
return ret;
 }
-- 
2.11.0



[PATCH 22/28] packed_refs_lock(): function renamed from lock_packed_refs()

2017-06-15 Thread Michael Haggerty
Rename `lock_packed_refs()` to `packed_refs_lock()` for consistency
with how other methods are named. Also, it's about to get some
companions.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  |  4 ++--
 refs/packed-backend.c | 10 +-
 refs/packed-backend.h |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4323394a52..3fc2254e33 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1084,7 +1084,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
struct ref_to_prune *refs_to_prune = NULL;
struct strbuf err = STRBUF_INIT;
 
-   lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR);
+   packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR);
 
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
@@ -2667,7 +2667,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
}
}
 
-   if (lock_packed_refs(refs->packed_ref_store, 0)) {
+   if (packed_refs_lock(refs->packed_ref_store, 0)) {
strbuf_addf(err, "unable to lock packed-refs file: %s",
strerror(errno));
ret = TRANSACTION_GENERIC_ERROR;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6619052e96..eb9da04576 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -321,7 +321,7 @@ static struct ref_dir *get_packed_refs(struct 
packed_ref_store *refs)
 /*
  * Add or overwrite a reference in the in-memory packed reference
  * cache. This may only be called while the packed-refs file is locked
- * (see lock_packed_refs()). To actually write the packed-refs file,
+ * (see packed_refs_lock()). To actually write the packed-refs file,
  * call commit_packed_refs().
  */
 void add_packed_ref(struct ref_store *ref_store,
@@ -515,11 +515,11 @@ static int write_packed_entry(FILE *fh, const char 
*refname,
return 0;
 }
 
-int lock_packed_refs(struct ref_store *ref_store, int flags)
+int packed_refs_lock(struct ref_store *ref_store, int flags)
 {
struct packed_ref_store *refs =
packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
-   "lock_packed_refs");
+   "packed_refs_lock");
static int timeout_configured = 0;
static int timeout_value = 1000;
struct packed_ref_cache *packed_ref_cache;
@@ -567,7 +567,7 @@ static const char PACKED_REFS_HEADER[] =
 /*
  * Write the current version of the packed refs cache from memory to
  * disk. The packed-refs file must already be locked for writing (see
- * lock_packed_refs()). Return zero on success. On errors, rollback
+ * packed_refs_lock()). Return zero on success. On errors, rollback
  * the lockfile, write an error message to `err`, and return a nonzero
  * value.
  */
@@ -698,7 +698,7 @@ int repack_without_refs(struct ref_store *ref_store,
if (!needs_repacking)
return 0; /* no refname exists in packed refs */
 
-   if (lock_packed_refs(&refs->base, 0)) {
+   if (packed_refs_lock(&refs->base, 0)) {
unable_to_lock_message(refs->path, errno, err);
return -1;
}
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index 3d4057b65b..dbc00d3396 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -9,7 +9,7 @@ struct ref_store *packed_ref_store_create(const char *path,
  * hold_lock_file_for_update(). Return 0 on success. On errors, set
  * errno appropriately and return a nonzero value.
  */
-int lock_packed_refs(struct ref_store *ref_store, int flags);
+int packed_refs_lock(struct ref_store *ref_store, int flags);
 
 void add_packed_ref(struct ref_store *ref_store,
const char *refname, const struct object_id *oid);
-- 
2.11.0



[PATCH 20/28] commit_packed_refs(): report errors rather than dying

2017-06-15 Thread Michael Haggerty
Report errors via a `struct strbuf *err` rather than by calling
`die()`. To enable this goal, change `write_packed_entry()` to report
errors via a return value and `errno` rather than dying.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  | 10 +++---
 refs/packed-backend.c | 85 +--
 refs/packed-backend.h |  2 +-
 3 files changed, 61 insertions(+), 36 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 039d0343cb..4323394a52 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1082,6 +1082,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
struct ref_iterator *iter;
int ok;
struct ref_to_prune *refs_to_prune = NULL;
+   struct strbuf err = STRBUF_INIT;
 
lock_packed_refs(refs->packed_ref_store, LOCK_DIE_ON_ERROR);
 
@@ -1116,10 +1117,11 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
if (ok != ITER_DONE)
die("error while iterating over references");
 
-   if (commit_packed_refs(refs->packed_ref_store))
-   die_errno("unable to overwrite old ref-pack file");
+   if (commit_packed_refs(refs->packed_ref_store, &err))
+   die("unable to overwrite old ref-pack file: %s", err.buf);
 
prune_refs(refs, refs_to_prune);
+   strbuf_release(&err);
return 0;
 }
 
@@ -2681,9 +2683,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
   &update->new_oid);
}
 
-   if (commit_packed_refs(refs->packed_ref_store)) {
-   strbuf_addf(err, "unable to commit packed-refs file: %s",
-   strerror(errno));
+   if (commit_packed_refs(refs->packed_ref_store, err)) {
ret = TRANSACTION_GENERIC_ERROR;
goto cleanup;
}
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 5dbe4e4e59..5bee49d497 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -493,15 +493,19 @@ static struct ref_iterator *packed_ref_iterator_begin(
 
 /*
  * Write an entry to the packed-refs file for the specified refname.
- * If peeled is non-NULL, write it as the entry's peeled value.
+ * If peeled is non-NULL, write it as the entry's peeled value. On
+ * error, return a nonzero value and leave errno set at the value left
+ * by the failing call to `fprintf()`.
  */
-static void write_packed_entry(FILE *fh, const char *refname,
-  const unsigned char *sha1,
-  const unsigned char *peeled)
+static int write_packed_entry(FILE *fh, const char *refname,
+ const unsigned char *sha1,
+ const unsigned char *peeled)
 {
-   fprintf_or_die(fh, "%s %s\n", sha1_to_hex(sha1), refname);
-   if (peeled)
-   fprintf_or_die(fh, "^%s\n", sha1_to_hex(peeled));
+   if (fprintf(fh, "%s %s\n", sha1_to_hex(sha1), refname) < 0 ||
+   (peeled && fprintf(fh, "^%s\n", sha1_to_hex(peeled)) < 0))
+   return -1;
+
+   return 0;
 }
 
 int lock_packed_refs(struct ref_store *ref_store, int flags)
@@ -550,49 +554,74 @@ static const char PACKED_REFS_HEADER[] =
 /*
  * Write the current version of the packed refs cache from memory to
  * disk. The packed-refs file must already be locked for writing (see
- * lock_packed_refs()). Return zero on success. On errors, set errno
- * and return a nonzero value.
+ * lock_packed_refs()). Return zero on success. On errors, rollback
+ * the lockfile, write an error message to `err`, and return a nonzero
+ * value.
  */
-int commit_packed_refs(struct ref_store *ref_store)
+int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err)
 {
struct packed_ref_store *refs =
packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
"commit_packed_refs");
struct packed_ref_cache *packed_ref_cache =
get_packed_ref_cache(refs);
-   int ok, error = 0;
-   int save_errno = 0;
+   int ok;
+   int ret = -1;
FILE *out;
struct ref_iterator *iter;
 
if (!is_lock_file_locked(&refs->lock))
-   die("BUG: packed-refs not locked");
+   die("BUG: commit_packed_refs() called when unlocked");
 
out = fdopen_lock_file(&refs->lock, "w");
-   if (!out)
-   die_errno("unable to fdopen packed-refs descriptor");
+   if (!out) {
+   strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
+   strerror(errno));
+   goto error;
+   }
 
-   fprintf_or_die(out, "%s", PACKED_REFS_HEADER);
+   if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) {
+   strbuf_addf(err, "error writing to %s: %s",
+   get_lock_file_path(&refs->lock

[PATCH 11/28] commit_packed_refs(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0d8f39089f..5d159620f0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1388,21 +1388,21 @@ static int lock_packed_refs(struct packed_ref_store 
*refs, int flags)
  * lock_packed_refs()). Return zero on success. On errors, set errno
  * and return a nonzero value
  */
-static int commit_packed_refs(struct files_ref_store *refs)
+static int commit_packed_refs(struct packed_ref_store *refs)
 {
struct packed_ref_cache *packed_ref_cache =
-   get_packed_ref_cache(refs->packed_ref_store);
+   get_packed_ref_cache(refs);
int ok, error = 0;
int save_errno = 0;
FILE *out;
struct ref_iterator *iter;
 
-   files_assert_main_repository(refs, "commit_packed_refs");
+   packed_assert_main_repository(refs, "commit_packed_refs");
 
-   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
+   if (!is_lock_file_locked(&refs->lock))
die("BUG: packed-refs not locked");
 
-   out = fdopen_lock_file(&refs->packed_ref_store->lock, "w");
+   out = fdopen_lock_file(&refs->lock, "w");
if (!out)
die_errno("unable to fdopen packed-refs descriptor");
 
@@ -1420,7 +1420,7 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
if (ok != ITER_DONE)
die("error while iterating over references");
 
-   if (commit_lock_file(&refs->packed_ref_store->lock)) {
+   if (commit_lock_file(&refs->lock)) {
save_errno = errno;
error = -1;
}
@@ -1606,7 +1606,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
if (ok != ITER_DONE)
die("error while iterating over references");
 
-   if (commit_packed_refs(refs))
+   if (commit_packed_refs(refs->packed_ref_store))
die_errno("unable to overwrite old ref-pack file");
 
prune_refs(refs, refs_to_prune);
@@ -1662,7 +1662,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
}
 
/* Write what remains */
-   ret = commit_packed_refs(refs);
+   ret = commit_packed_refs(refs->packed_ref_store);
if (ret)
strbuf_addf(err, "unable to overwrite old ref-pack file: %s",
strerror(errno));
@@ -3227,7 +3227,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
   &update->new_oid);
}
 
-   if (commit_packed_refs(refs)) {
+   if (commit_packed_refs(refs->packed_ref_store)) {
strbuf_addf(err, "unable to commit packed-refs file: %s",
strerror(errno));
ret = TRANSACTION_GENERIC_ERROR;
-- 
2.11.0



[PATCH 14/28] repack_without_refs(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2b9d93d3b6..c206791b91 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1621,19 +1621,19 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
  *
  * The refs in 'refnames' needn't be sorted. `err` must not be NULL.
  */
-static int repack_without_refs(struct files_ref_store *refs,
+static int repack_without_refs(struct packed_ref_store *refs,
   struct string_list *refnames, struct strbuf *err)
 {
struct ref_dir *packed;
struct string_list_item *refname;
int ret, needs_repacking = 0, removed = 0;
 
-   files_assert_main_repository(refs, "repack_without_refs");
+   packed_assert_main_repository(refs, "repack_without_refs");
assert(err);
 
/* Look for a packed ref */
for_each_string_list_item(refname, refnames) {
-   if (get_packed_ref(refs->packed_ref_store, refname->string)) {
+   if (get_packed_ref(refs, refname->string)) {
needs_repacking = 1;
break;
}
@@ -1643,11 +1643,11 @@ static int repack_without_refs(struct files_ref_store 
*refs,
if (!needs_repacking)
return 0; /* no refname exists in packed refs */
 
-   if (lock_packed_refs(refs->packed_ref_store, 0)) {
-   unable_to_lock_message(refs->packed_ref_store->path, errno, 
err);
+   if (lock_packed_refs(refs, 0)) {
+   unable_to_lock_message(refs->path, errno, err);
return -1;
}
-   packed = get_packed_refs(refs->packed_ref_store);
+   packed = get_packed_refs(refs);
 
/* Remove refnames from the cache */
for_each_string_list_item(refname, refnames)
@@ -1658,12 +1658,12 @@ static int repack_without_refs(struct files_ref_store 
*refs,
 * All packed entries disappeared while we were
 * acquiring the lock.
 */
-   rollback_packed_refs(refs->packed_ref_store);
+   rollback_packed_refs(refs);
return 0;
}
 
/* Write what remains */
-   ret = commit_packed_refs(refs->packed_ref_store);
+   ret = commit_packed_refs(refs);
if (ret)
strbuf_addf(err, "unable to overwrite old ref-pack file: %s",
strerror(errno));
@@ -1681,7 +1681,7 @@ static int files_delete_refs(struct ref_store *ref_store, 
const char *msg,
if (!refnames->nr)
return 0;
 
-   result = repack_without_refs(refs, refnames, &err);
+   result = repack_without_refs(refs->packed_ref_store, refnames, &err);
if (result) {
/*
 * If we failed to rewrite the packed-refs file, then
@@ -3101,7 +3101,7 @@ static int files_transaction_finish(struct ref_store 
*ref_store,
}
}
 
-   if (repack_without_refs(refs, &refs_to_delete, err)) {
+   if (repack_without_refs(refs->packed_ref_store, &refs_to_delete, err)) {
ret = TRANSACTION_GENERIC_ERROR;
goto cleanup;
}
-- 
2.11.0



[PATCH 13/28] get_packed_ref(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a08d3fbadf..2b9d93d3b6 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -602,10 +602,10 @@ static struct ref_cache *get_loose_ref_cache(struct 
files_ref_store *refs)
  * Return the ref_entry for the given refname from the packed
  * references.  If it does not exist, return NULL.
  */
-static struct ref_entry *get_packed_ref(struct files_ref_store *refs,
+static struct ref_entry *get_packed_ref(struct packed_ref_store *refs,
const char *refname)
 {
-   return find_ref_entry(get_packed_refs(refs->packed_ref_store), refname);
+   return find_ref_entry(get_packed_refs(refs), refname);
 }
 
 /*
@@ -621,7 +621,7 @@ static int resolve_packed_ref(struct files_ref_store *refs,
 * The loose reference file does not exist; check for a packed
 * reference.
 */
-   entry = get_packed_ref(refs, refname);
+   entry = get_packed_ref(refs->packed_ref_store, refname);
if (entry) {
hashcpy(sha1, entry->u.value.oid.hash);
*flags |= REF_ISPACKED;
@@ -1044,7 +1044,9 @@ static int files_peel_ref(struct ref_store *ref_store,
 * have REF_KNOWS_PEELED.
 */
if (flag & REF_ISPACKED) {
-   struct ref_entry *r = get_packed_ref(refs, refname);
+   struct ref_entry *r =
+   get_packed_ref(refs->packed_ref_store, refname);
+
if (r) {
if (peel_entry(r, 0))
return -1;
@@ -1631,7 +1633,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
 
/* Look for a packed ref */
for_each_string_list_item(refname, refnames) {
-   if (get_packed_ref(refs, refname->string)) {
+   if (get_packed_ref(refs->packed_ref_store, refname->string)) {
needs_repacking = 1;
break;
}
-- 
2.11.0



[PATCH 19/28] packed_ref_store: make class into a subclass of `ref_store`

2017-06-15 Thread Michael Haggerty
Add the infrastructure to make `packed_ref_store` implement
`ref_store`, at least formally (few of the methods are actually
implemented yet). Change the functions in its interface to take
`ref_store *` arguments. Change `files_ref_store` to store a pointer
to `ref_store *` and to call functions via the virtual `ref_store`
interface where possible. This also means that a few
`packed_ref_store` functions can become static.

This is a work in progress. Some more `ref_store` methods will soon be
implemented (e.g., those having to do with reference transactions).
But some of them will never be implemented (e.g., those having to do
with symrefs or reflogs).

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  |  16 ++--
 refs/packed-backend.c | 231 +-
 refs/packed-backend.h |  23 ++---
 refs/refs-internal.h  |   1 +
 4 files changed, 226 insertions(+), 45 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 83ea773728..039d0343cb 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -28,7 +28,7 @@ struct files_ref_store {
 
struct ref_cache *loose;
 
-   struct packed_ref_store *packed_ref_store;
+   struct ref_store *packed_ref_store;
 };
 
 static void clear_loose_ref_cache(struct files_ref_store *refs)
@@ -311,8 +311,8 @@ static int files_read_raw_ref(struct ref_store *ref_store,
if (lstat(path, &st) < 0) {
if (errno != ENOENT)
goto out;
-   if (packed_read_raw_ref(refs->packed_ref_store, refname,
-   sha1, referent, type)) {
+   if (refs_read_raw_ref(refs->packed_ref_store, refname,
+ sha1, referent, type)) {
errno = ENOENT;
goto out;
}
@@ -351,8 +351,8 @@ static int files_read_raw_ref(struct ref_store *ref_store,
 * ref is supposed to be, there could still be a
 * packed ref:
 */
-   if (packed_read_raw_ref(refs->packed_ref_store, refname,
-   sha1, referent, type)) {
+   if (refs_read_raw_ref(refs->packed_ref_store, refname,
+ sha1, referent, type)) {
errno = EISDIR;
goto out;
}
@@ -683,7 +683,7 @@ static int files_peel_ref(struct ref_store *ref_store,
 * have REF_KNOWS_PEELED.
 */
if (flag & REF_ISPACKED &&
-   !packed_peel_ref(refs->packed_ref_store, refname, sha1))
+   !refs_peel_ref(refs->packed_ref_store, refname, sha1))
return 0;
 
return peel_object(base, sha1);
@@ -793,8 +793,8 @@ static struct ref_iterator *files_ref_iterator_begin(
loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
  prefix, 1);
 
-   packed_iter = packed_ref_iterator_begin(refs->packed_ref_store,
-   prefix, 0);
+   packed_iter = refs_ref_iterator_begin(refs->packed_ref_store,
+ prefix, 0, 0);
 
iter->iter0 = overlay_ref_iterator_begin(loose_iter, packed_iter);
iter->flags = flags;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 6fa988c953..5dbe4e4e59 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -50,6 +50,8 @@ static int release_packed_ref_cache(struct packed_ref_cache 
*packed_refs)
  * `ref_store`.
  */
 struct packed_ref_store {
+   struct ref_store base;
+
unsigned int store_flags;
 
/* The path of the "packed-refs" file: */
@@ -68,14 +70,17 @@ struct packed_ref_store {
struct lock_file lock;
 };
 
-struct packed_ref_store *packed_ref_store_create(
-   const char *path, unsigned int store_flags)
+struct ref_store *packed_ref_store_create(const char *path,
+ unsigned int store_flags)
 {
struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
+   struct ref_store *ref_store = (struct ref_store *)refs;
 
+   base_ref_store_init(ref_store, &refs_be_packed);
refs->store_flags = store_flags;
+
refs->path = xstrdup(path);
-   return refs;
+   return ref_store;
 }
 
 /*
@@ -91,6 +96,31 @@ static void packed_assert_main_repository(struct 
packed_ref_store *refs,
die("BUG: operation %s only allowed for main ref store", caller);
 }
 
+/*
+ * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is
+ * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't
+ * support at least the flags specified in `required_flags`. `caller`
+ * is used in any necessary error messages.
+ */
+static struct packed_ref_store *packed_downcast(struct ref_store *ref_store,
+   unsi

[PATCH 24/28] packed_refs_unlock(), packed_refs_is_locked(): new functions

2017-06-15 Thread Michael Haggerty
Add two new public functions, `packed_refs_unlock()` and
`packed_refs_is_locked()`, with which callers can manage and query the
`packed-refs` lock externally.

Call `packed_refs_unlock()` from `commit_packed_refs()` and
`rollback_packed_refs()`.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 31 +--
 refs/packed-backend.h |  3 +++
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 24451343b8..ff6326ddb8 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -563,6 +563,29 @@ int packed_refs_lock(struct ref_store *ref_store, int 
flags, struct strbuf *err)
return 0;
 }
 
+void packed_refs_unlock(struct ref_store *ref_store)
+{
+   struct packed_ref_store *refs = packed_downcast(
+   ref_store,
+   REF_STORE_READ | REF_STORE_WRITE,
+   "packed_refs_unlock");
+
+   if (!is_lock_file_locked(&refs->lock))
+   die("BUG: packed_refs_unlock() called when not locked");
+   rollback_lock_file(&refs->lock);
+   release_packed_ref_cache(refs->cache);
+}
+
+int packed_refs_is_locked(struct ref_store *ref_store)
+{
+   struct packed_ref_store *refs = packed_downcast(
+   ref_store,
+   REF_STORE_READ | REF_STORE_WRITE,
+   "packed_refs_is_locked");
+
+   return is_lock_file_locked(&refs->lock);
+}
+
 /*
  * The packed-refs header line that we write out.  Perhaps other
  * traits will be added later.  The trailing space is required.
@@ -649,8 +672,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
delete_tempfile(&refs->tempfile);
 
 out:
-   rollback_lock_file(&refs->lock);
-   release_packed_ref_cache(packed_ref_cache);
+   packed_refs_unlock(ref_store);
return ret;
 }
 
@@ -661,14 +683,11 @@ int commit_packed_refs(struct ref_store *ref_store, 
struct strbuf *err)
  */
 static void rollback_packed_refs(struct packed_ref_store *refs)
 {
-   struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs);
-
packed_assert_main_repository(refs, "rollback_packed_refs");
 
if (!is_lock_file_locked(&refs->lock))
die("BUG: packed-refs not locked");
-   rollback_lock_file(&refs->lock);
-   release_packed_ref_cache(packed_ref_cache);
+   packed_refs_unlock(&refs->base);
clear_packed_ref_cache(refs);
 }
 
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index 210e3f35ce..03b7c1de95 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -11,6 +11,9 @@ struct ref_store *packed_ref_store_create(const char *path,
  */
 int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf 
*err);
 
+void packed_refs_unlock(struct ref_store *ref_store);
+int packed_refs_is_locked(struct ref_store *ref_store);
+
 void add_packed_ref(struct ref_store *ref_store,
const char *refname, const struct object_id *oid);
 
-- 
2.11.0



[PATCH 17/28] packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`

2017-06-15 Thread Michael Haggerty
Add a new function, `packed_read_raw_ref()`, which is nearly a
`read_raw_ref_fn`. Use it in place of `resolve_packed_ref()`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 36 +---
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index e57cdeba36..ac4764f6f7 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -608,27 +608,23 @@ static struct ref_entry *get_packed_ref(struct 
packed_ref_store *refs,
return find_ref_entry(get_packed_refs(refs), refname);
 }
 
-/*
- * A loose ref file doesn't exist; check for a packed ref.
- */
-static int resolve_packed_ref(struct files_ref_store *refs,
- const char *refname,
- unsigned char *sha1, unsigned int *flags)
+static int packed_read_raw_ref(struct packed_ref_store *refs,
+  const char *refname, unsigned char *sha1,
+  struct strbuf *referent, unsigned int *type)
 {
struct ref_entry *entry;
 
-   /*
-* The loose reference file does not exist; check for a packed
-* reference.
-*/
-   entry = get_packed_ref(refs->packed_ref_store, refname);
-   if (entry) {
-   hashcpy(sha1, entry->u.value.oid.hash);
-   *flags |= REF_ISPACKED;
-   return 0;
+   *type = 0;
+
+   entry = get_packed_ref(refs, refname);
+   if (!entry) {
+   errno = ENOENT;
+   return -1;
}
-   /* refname is not a packed reference. */
-   return -1;
+
+   hashcpy(sha1, entry->u.value.oid.hash);
+   *type = REF_ISPACKED;
+   return 0;
 }
 
 static int files_read_raw_ref(struct ref_store *ref_store,
@@ -674,7 +670,8 @@ static int files_read_raw_ref(struct ref_store *ref_store,
if (lstat(path, &st) < 0) {
if (errno != ENOENT)
goto out;
-   if (resolve_packed_ref(refs, refname, sha1, type)) {
+   if (packed_read_raw_ref(refs->packed_ref_store, refname,
+   sha1, referent, type)) {
errno = ENOENT;
goto out;
}
@@ -713,7 +710,8 @@ static int files_read_raw_ref(struct ref_store *ref_store,
 * ref is supposed to be, there could still be a
 * packed ref:
 */
-   if (resolve_packed_ref(refs, refname, sha1, type)) {
+   if (packed_read_raw_ref(refs->packed_ref_store, refname,
+   sha1, referent, type)) {
errno = EISDIR;
goto out;
}
-- 
2.11.0



[PATCH 18/28] packed-backend: new module for handling packed references

2017-06-15 Thread Michael Haggerty
Now that the interface between `files_ref_store` and
`packed_ref_store` is relatively narrow, move the latter into a new
module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
doesn't quite implement the `ref_store` interface, but it will soon.

This commit moves code around and adjusts its visibility, but doesn't
change anything.

Signed-off-by: Michael Haggerty 
---
 Makefile  |   1 +
 refs.c|  18 ++
 refs/files-backend.c  | 640 +-
 refs/packed-backend.c | 624 
 refs/packed-backend.h |  33 +++
 refs/refs-internal.h  |   9 +
 6 files changed, 686 insertions(+), 639 deletions(-)
 create mode 100644 refs/packed-backend.c
 create mode 100644 refs/packed-backend.h

diff --git a/Makefile b/Makefile
index 33b888730f..478a8eac5d 100644
--- a/Makefile
+++ b/Makefile
@@ -821,6 +821,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/files-backend.o
 LIB_OBJS += refs/iterator.o
+LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
 LIB_OBJS += ref-filter.o
 LIB_OBJS += remote.o
diff --git a/refs.c b/refs.c
index 32177969f0..5880c12372 100644
--- a/refs.c
+++ b/refs.c
@@ -173,6 +173,24 @@ int refname_is_safe(const char *refname)
return 1;
 }
 
+/*
+ * Return true if refname, which has the specified oid and flags, can
+ * be resolved to an object in the database. If the referred-to object
+ * does not exist, emit a warning and return false.
+ */
+int ref_resolves_to_object(const char *refname,
+  const struct object_id *oid,
+  unsigned int flags)
+{
+   if (flags & REF_ISBROKEN)
+   return 0;
+   if (!has_sha1_file(oid->hash)) {
+   error("%s does not point to a valid object!", refname);
+   return 0;
+   }
+   return 1;
+}
+
 char *refs_resolve_refdup(struct ref_store *refs,
  const char *refname, int resolve_flags,
  unsigned char *sha1, int *flags)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ac4764f6f7..83ea773728 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2,6 +2,7 @@
 #include "../refs.h"
 #include "refs-internal.h"
 #include "ref-cache.h"
+#include "packed-backend.h"
 #include "../iterator.h"
 #include "../dir-iterator.h"
 #include "../lockfile.h"
@@ -14,85 +15,6 @@ struct ref_lock {
struct object_id old_oid;
 };
 
-/*
- * Return true if refname, which has the specified oid and flags, can
- * be resolved to an object in the database. If the referred-to object
- * does not exist, emit a warning and return false.
- */
-static int ref_resolves_to_object(const char *refname,
- const struct object_id *oid,
- unsigned int flags)
-{
-   if (flags & REF_ISBROKEN)
-   return 0;
-   if (!has_sha1_file(oid->hash)) {
-   error("%s does not point to a valid object!", refname);
-   return 0;
-   }
-   return 1;
-}
-
-struct packed_ref_cache {
-   struct ref_cache *cache;
-
-   /*
-* Count of references to the data structure in this instance,
-* including the pointer from files_ref_store::packed if any.
-* The data will not be freed as long as the reference count
-* is nonzero.
-*/
-   unsigned int referrers;
-
-   /* The metadata from when this packed-refs cache was read */
-   struct stat_validity validity;
-};
-
-/*
- * A container for `packed-refs`-related data. It is not (yet) a
- * `ref_store`.
- */
-struct packed_ref_store {
-   unsigned int store_flags;
-
-   /* The path of the "packed-refs" file: */
-   char *path;
-
-   /*
-* A cache of the values read from the `packed-refs` file, if
-* it might still be current; otherwise, NULL.
-*/
-   struct packed_ref_cache *cache;
-
-   /*
-* Lock used for the "packed-refs" file. Note that this (and
-* thus the enclosing `packed_ref_store`) must not be freed.
-*/
-   struct lock_file lock;
-};
-
-static struct packed_ref_store *packed_ref_store_create(
-   const char *path, unsigned int store_flags)
-{
-   struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
-
-   refs->store_flags = store_flags;
-   refs->path = xstrdup(path);
-   return refs;
-}
-
-/*
- * Die if refs is not the main ref store. caller is used in any
- * necessary error messages.
- */
-static void packed_assert_main_repository(struct packed_ref_store *refs,
- const char *caller)
-{
-   if (refs->store_flags & REF_STORE_MAIN)
-   return;
-
-   die("BUG: operation %s only allowed for main ref store", caller);
-}
-
 /*
  * Future: need to be in "struct repository"
  * when doing a full libification.
@@ -109,42

[PATCH 26/28] commit_packed_refs(): remove call to `packed_refs_unlock()`

2017-06-15 Thread Michael Haggerty
Instead, change the callers of `commit_packed_refs()` to call
`packed_refs_unlock()`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  |  2 ++
 refs/packed-backend.c | 18 --
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 802ed9e2e9..09dad2806e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1119,6 +1119,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
 
if (commit_packed_refs(refs->packed_ref_store, &err))
die("unable to overwrite old ref-pack file: %s", err.buf);
+   packed_refs_unlock(refs->packed_ref_store);
 
prune_refs(refs, refs_to_prune);
strbuf_release(&err);
@@ -2687,6 +2688,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
}
 
 cleanup:
+   packed_refs_unlock(refs->packed_ref_store);
transaction->state = REF_TRANSACTION_CLOSED;
string_list_clear(&affected_refnames, 0);
return ret;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 1732e3aad4..54b48d1f02 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -606,7 +606,6 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
struct packed_ref_cache *packed_ref_cache =
get_packed_ref_cache(refs);
int ok;
-   int ret = -1;
struct strbuf sb = STRBUF_INIT;
FILE *out;
struct ref_iterator *iter;
@@ -619,7 +618,7 @@ int commit_packed_refs(struct ref_store *ref_store, struct 
strbuf *err)
strbuf_addf(err, "unable to create file %s: %s",
sb.buf, strerror(errno));
strbuf_release(&sb);
-   goto out;
+   return -1;
}
strbuf_release(&sb);
 
@@ -660,18 +659,14 @@ int commit_packed_refs(struct ref_store *ref_store, 
struct strbuf *err)
if (rename_tempfile(&refs->tempfile, refs->path)) {
strbuf_addf(err, "error replacing %s: %s",
refs->path, strerror(errno));
-   goto out;
+   return -1;
}
 
-   ret = 0;
-   goto out;
+   return 0;
 
 error:
delete_tempfile(&refs->tempfile);
-
-out:
-   packed_refs_unlock(ref_store);
-   return ret;
+   return -1;
 }
 
 /*
@@ -705,6 +700,7 @@ int repack_without_refs(struct ref_store *ref_store,
struct ref_dir *packed;
struct string_list_item *refname;
int needs_repacking = 0, removed = 0;
+   int ret;
 
packed_assert_main_repository(refs, "repack_without_refs");
assert(err);
@@ -740,7 +736,9 @@ int repack_without_refs(struct ref_store *ref_store,
}
 
/* Write what remains */
-   return commit_packed_refs(&refs->base, err);
+   ret = commit_packed_refs(&refs->base, err);
+   packed_refs_unlock(ref_store);
+   return ret;
 }
 
 static int packed_init_db(struct ref_store *ref_store, struct strbuf *err)
-- 
2.11.0



[PATCH 28/28] read_packed_refs(): die if `packed-refs` contains bogus data

2017-06-15 Thread Michael Haggerty
The old code ignored any lines that it didn't understand. This is
dangerous. Instead, `die()` if the `packed-refs` file contains any
lines that we don't know how to handle.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 721afd066a..311fd014ce 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -253,9 +253,7 @@ static struct packed_ref_cache *read_packed_refs(const char 
*packed_refs_file)
(peeled == PEELED_TAGS && starts_with(refname, 
"refs/tags/")))
last->flag |= REF_KNOWS_PEELED;
add_ref_entry(dir, last);
-   continue;
-   }
-   if (last &&
+   } else if (last &&
line.buf[0] == '^' &&
line.len == PEELED_LINE_LENGTH &&
line.buf[PEELED_LINE_LENGTH - 1] == '\n' &&
@@ -267,6 +265,8 @@ static struct packed_ref_cache *read_packed_refs(const char 
*packed_refs_file)
 * reference:
 */
last->flag |= REF_KNOWS_PEELED;
+   } else {
+   die("unexpected line in %s: %s", packed_refs_file, 
line.buf);
}
}
 
-- 
2.11.0



[PATCH 06/28] validate_packed_ref_cache(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2b0db60b2b..f061506bf0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -396,12 +396,11 @@ static void files_ref_path(struct files_ref_store *refs,
  * Check that the packed refs cache (if any) still reflects the
  * contents of the file. If not, clear the cache.
  */
-static void validate_packed_ref_cache(struct files_ref_store *refs)
+static void validate_packed_ref_cache(struct packed_ref_store *refs)
 {
-   if (refs->packed_ref_store->cache &&
-   !stat_validity_check(&refs->packed_ref_store->cache->validity,
-refs->packed_ref_store->path))
-   clear_packed_ref_cache(refs->packed_ref_store);
+   if (refs->cache &&
+   !stat_validity_check(&refs->cache->validity, refs->path))
+   clear_packed_ref_cache(refs);
 }
 
 /*
@@ -417,7 +416,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct 
files_ref_store *ref
const char *packed_refs_file = refs->packed_ref_store->path;
 
if (!is_lock_file_locked(&refs->packed_ref_store->lock))
-   validate_packed_ref_cache(refs);
+   validate_packed_ref_cache(refs->packed_ref_store);
 
if (!refs->packed_ref_store->cache)
refs->packed_ref_store->cache = 
read_packed_refs(packed_refs_file);
@@ -1364,7 +1363,7 @@ static int lock_packed_refs(struct files_ref_store *refs, 
int flags)
 * cache is still valid. We've just locked the file, but it
 * might have changed the moment *before* we locked it.
 */
-   validate_packed_ref_cache(refs);
+   validate_packed_ref_cache(refs->packed_ref_store);
 
packed_ref_cache = get_packed_ref_cache(refs);
/* Increment the reference count to prevent it from being freed: */
-- 
2.11.0



[PATCH 02/28] packed_ref_store: new struct

2017-06-15 Thread Michael Haggerty
Start extracting the packed-refs-related data structures into a new
class, `packed_ref_store`. It doesn't yet implement `ref_store`, but
it will.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 42 +-
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 87cecde231..2efb71cee9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -47,6 +47,28 @@ struct packed_ref_cache {
struct stat_validity validity;
 };
 
+/*
+ * A container for `packed-refs`-related data. It is not (yet) a
+ * `ref_store`.
+ */
+struct packed_ref_store {
+   unsigned int store_flags;
+
+   /*
+* A cache of the values read from the `packed-refs` file, if
+* it might still be current; otherwise, NULL.
+*/
+   struct packed_ref_cache *cache;
+};
+
+static struct packed_ref_store *packed_ref_store_create(unsigned int 
store_flags)
+{
+   struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
+
+   refs->store_flags = store_flags;
+   return refs;
+}
+
 /*
  * Future: need to be in "struct repository"
  * when doing a full libification.
@@ -60,13 +82,14 @@ struct files_ref_store {
char *packed_refs_path;
 
struct ref_cache *loose;
-   struct packed_ref_cache *packed;
 
/*
 * Lock used for the "packed-refs" file. Note that this (and
 * thus the enclosing `files_ref_store`) must not be freed.
 */
struct lock_file packed_refs_lock;
+
+   struct packed_ref_store *packed_ref_store;
 };
 
 /*
@@ -95,12 +118,12 @@ static int release_packed_ref_cache(struct 
packed_ref_cache *packed_refs)
 
 static void clear_packed_ref_cache(struct files_ref_store *refs)
 {
-   if (refs->packed) {
-   struct packed_ref_cache *packed_refs = refs->packed;
+   if (refs->packed_ref_store->cache) {
+   struct packed_ref_cache *packed_refs = 
refs->packed_ref_store->cache;
 
if (is_lock_file_locked(&refs->packed_refs_lock))
die("BUG: packed-ref cache cleared while locked");
-   refs->packed = NULL;
+   refs->packed_ref_store->cache = NULL;
release_packed_ref_cache(packed_refs);
}
 }
@@ -132,6 +155,7 @@ static struct ref_store *files_ref_store_create(const char 
*gitdir,
refs->gitcommondir = strbuf_detach(&sb, NULL);
strbuf_addf(&sb, "%s/packed-refs", refs->gitcommondir);
refs->packed_refs_path = strbuf_detach(&sb, NULL);
+   refs->packed_ref_store = packed_ref_store_create(flags);
 
return ref_store;
 }
@@ -375,8 +399,8 @@ static void files_ref_path(struct files_ref_store *refs,
  */
 static void validate_packed_ref_cache(struct files_ref_store *refs)
 {
-   if (refs->packed &&
-   !stat_validity_check(&refs->packed->validity,
+   if (refs->packed_ref_store->cache &&
+   !stat_validity_check(&refs->packed_ref_store->cache->validity,
 files_packed_refs_path(refs)))
clear_packed_ref_cache(refs);
 }
@@ -396,10 +420,10 @@ static struct packed_ref_cache 
*get_packed_ref_cache(struct files_ref_store *ref
if (!is_lock_file_locked(&refs->packed_refs_lock))
validate_packed_ref_cache(refs);
 
-   if (!refs->packed)
-   refs->packed = read_packed_refs(packed_refs_file);
+   if (!refs->packed_ref_store->cache)
+   refs->packed_ref_store->cache = 
read_packed_refs(packed_refs_file);
 
-   return refs->packed;
+   return refs->packed_ref_store->cache;
 }
 
 static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache 
*packed_ref_cache)
-- 
2.11.0



[PATCH 27/28] repack_without_refs(): don't lock or unlock the packed refs

2017-06-15 Thread Michael Haggerty
Change `repack_without_refs()` to expect the packed-refs lock to be
held already, and not to release the lock before returning. Change the
callers to deal with lock management.

This change makes it possible for callers to hold the packed-refs lock
for a longer span of time, a possibility that will eventually make it
possible to fix some longstanding races.

The only semantic change here is that `repack_without_refs()` used to
forgot to release the lock in the `if (!removed)` exit path. That
omission is now fixed.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  | 47 +++
 refs/packed-backend.c | 32 
 2 files changed, 39 insertions(+), 40 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 09dad2806e..972dd5c0c9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1137,24 +1137,16 @@ static int files_delete_refs(struct ref_store 
*ref_store, const char *msg,
if (!refnames->nr)
return 0;
 
-   result = repack_without_refs(refs->packed_ref_store, refnames, &err);
-   if (result) {
-   /*
-* If we failed to rewrite the packed-refs file, then
-* it is unsafe to try to remove loose refs, because
-* doing so might expose an obsolete packed value for
-* a reference that might even point at an object that
-* has been garbage collected.
-*/
-   if (refnames->nr == 1)
-   error(_("could not delete reference %s: %s"),
- refnames->items[0].string, err.buf);
-   else
-   error(_("could not delete references: %s"), err.buf);
+   if (packed_refs_lock(refs->packed_ref_store, 0, &err))
+   goto error;
 
-   goto out;
+   if (repack_without_refs(refs->packed_ref_store, refnames, &err)) {
+   packed_refs_unlock(refs->packed_ref_store);
+   goto error;
}
 
+   packed_refs_unlock(refs->packed_ref_store);
+
for (i = 0; i < refnames->nr; i++) {
const char *refname = refnames->items[i].string;
 
@@ -1162,9 +1154,24 @@ static int files_delete_refs(struct ref_store 
*ref_store, const char *msg,
result |= error(_("could not remove reference %s"), 
refname);
}
 
-out:
strbuf_release(&err);
return result;
+
+error:
+   /*
+* If we failed to rewrite the packed-refs file, then it is
+* unsafe to try to remove loose refs, because doing so might
+* expose an obsolete packed value for a reference that might
+* even point at an object that has been garbage collected.
+*/
+   if (refnames->nr == 1)
+   error(_("could not delete reference %s: %s"),
+ refnames->items[0].string, err.buf);
+   else
+   error(_("could not delete references: %s"), err.buf);
+
+   strbuf_release(&err);
+   return -1;
 }
 
 /*
@@ -2557,11 +2564,19 @@ static int files_transaction_finish(struct ref_store 
*ref_store,
}
}
 
+   if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto cleanup;
+   }
+
if (repack_without_refs(refs->packed_ref_store, &refs_to_delete, err)) {
ret = TRANSACTION_GENERIC_ERROR;
+   packed_refs_unlock(refs->packed_ref_store);
goto cleanup;
}
 
+   packed_refs_unlock(refs->packed_ref_store);
+
/* Delete the reflogs of any references that were deleted: */
for_each_string_list_item(ref_to_delete, &refs_to_delete) {
strbuf_reset(&sb);
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 54b48d1f02..721afd066a 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -669,25 +669,12 @@ int commit_packed_refs(struct ref_store *ref_store, 
struct strbuf *err)
return -1;
 }
 
-/*
- * Rollback the lockfile for the packed-refs file, and discard the
- * in-memory packed reference cache.  (The packed-refs file will be
- * read anew if it is needed again after this function is called.)
- */
-static void rollback_packed_refs(struct packed_ref_store *refs)
-{
-   packed_assert_main_repository(refs, "rollback_packed_refs");
-
-   if (!is_lock_file_locked(&refs->lock))
-   die("BUG: packed-refs not locked");
-   packed_refs_unlock(&refs->base);
-   clear_packed_ref_cache(refs);
-}
-
 /*
  * Rewrite the packed-refs file, omitting any refs listed in
  * 'refnames'. On error, leave packed-refs unchanged, write an error
- * message to 'err', and return a nonzero value.
+ * message to 'err', and return a nonzero value. The packed refs lock
+ * must be held when calling this function; it will still be held when
+ * the function r

[PATCH 12/28] rollback_packed_refs(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5d159620f0..a08d3fbadf 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1434,18 +1434,17 @@ static int commit_packed_refs(struct packed_ref_store 
*refs)
  * in-memory packed reference cache.  (The packed-refs file will be
  * read anew if it is needed again after this function is called.)
  */
-static void rollback_packed_refs(struct files_ref_store *refs)
+static void rollback_packed_refs(struct packed_ref_store *refs)
 {
-   struct packed_ref_cache *packed_ref_cache =
-   get_packed_ref_cache(refs->packed_ref_store);
+   struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs);
 
-   files_assert_main_repository(refs, "rollback_packed_refs");
+   packed_assert_main_repository(refs, "rollback_packed_refs");
 
-   if (!is_lock_file_locked(&refs->packed_ref_store->lock))
+   if (!is_lock_file_locked(&refs->lock))
die("BUG: packed-refs not locked");
-   rollback_lock_file(&refs->packed_ref_store->lock);
+   rollback_lock_file(&refs->lock);
release_packed_ref_cache(packed_ref_cache);
-   clear_packed_ref_cache(refs->packed_ref_store);
+   clear_packed_ref_cache(refs);
 }
 
 struct ref_to_prune {
@@ -1657,7 +1656,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
 * All packed entries disappeared while we were
 * acquiring the lock.
 */
-   rollback_packed_refs(refs);
+   rollback_packed_refs(refs->packed_ref_store);
return 0;
}
 
-- 
2.11.0



[PATCH 05/28] clear_packed_ref_cache(): take a `packed_ref_store *` parameter

2017-06-15 Thread Michael Haggerty
It only cares about the packed-refs part of the reference store.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index de8293493f..2b0db60b2b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -120,15 +120,15 @@ static int release_packed_ref_cache(struct 
packed_ref_cache *packed_refs)
}
 }
 
-static void clear_packed_ref_cache(struct files_ref_store *refs)
+static void clear_packed_ref_cache(struct packed_ref_store *refs)
 {
-   if (refs->packed_ref_store->cache) {
-   struct packed_ref_cache *packed_refs = 
refs->packed_ref_store->cache;
+   if (refs->cache) {
+   struct packed_ref_cache *cache = refs->cache;
 
-   if (is_lock_file_locked(&refs->packed_ref_store->lock))
+   if (is_lock_file_locked(&refs->lock))
die("BUG: packed-ref cache cleared while locked");
-   refs->packed_ref_store->cache = NULL;
-   release_packed_ref_cache(packed_refs);
+   refs->cache = NULL;
+   release_packed_ref_cache(cache);
}
 }
 
@@ -401,7 +401,7 @@ static void validate_packed_ref_cache(struct 
files_ref_store *refs)
if (refs->packed_ref_store->cache &&
!stat_validity_check(&refs->packed_ref_store->cache->validity,
 refs->packed_ref_store->path))
-   clear_packed_ref_cache(refs);
+   clear_packed_ref_cache(refs->packed_ref_store);
 }
 
 /*
@@ -1435,7 +1435,7 @@ static void rollback_packed_refs(struct files_ref_store 
*refs)
die("BUG: packed-refs not locked");
rollback_lock_file(&refs->packed_ref_store->lock);
release_packed_ref_cache(packed_ref_cache);
-   clear_packed_ref_cache(refs);
+   clear_packed_ref_cache(refs->packed_ref_store);
 }
 
 struct ref_to_prune {
-- 
2.11.0



[PATCH 23/28] packed_refs_lock(): report errors via a `struct strbuf *err`

2017-06-15 Thread Michael Haggerty
That way the callers don't have to come up with error messages
themselves.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  |  6 ++
 refs/packed-backend.c | 17 +++--
 refs/packed-backend.h |  6 +++---
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3fc2254e33..802ed9e2e9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1084,7 +1084,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
struct ref_to_prune *refs_to_prune = NULL;
struct strbuf err = STRBUF_INIT;
 
-   packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR);
+   packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
@@ -2667,9 +2667,7 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
}
}
 
-   if (packed_refs_lock(refs->packed_ref_store, 0)) {
-   strbuf_addf(err, "unable to lock packed-refs file: %s",
-   strerror(errno));
+   if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
ret = TRANSACTION_GENERIC_ERROR;
goto cleanup;
}
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index eb9da04576..24451343b8 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -515,7 +515,7 @@ static int write_packed_entry(FILE *fh, const char *refname,
return 0;
 }
 
-int packed_refs_lock(struct ref_store *ref_store, int flags)
+int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf 
*err)
 {
struct packed_ref_store *refs =
packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
@@ -537,9 +537,15 @@ int packed_refs_lock(struct ref_store *ref_store, int 
flags)
if (hold_lock_file_for_update_timeout(
&refs->lock,
refs->path,
-   flags, timeout_value) < 0 ||
-   close_lock_file(&refs->lock))
+   flags, timeout_value) < 0) {
+   unable_to_lock_message(refs->path, errno, err);
+   return -1;
+   }
+
+   if (close_lock_file(&refs->lock)) {
+   strbuf_addf(err, "unable to close %s: %s", refs->path, 
strerror(errno));
return -1;
+   }
 
/*
 * Now that we hold the `packed-refs` lock, make sure that our
@@ -698,10 +704,9 @@ int repack_without_refs(struct ref_store *ref_store,
if (!needs_repacking)
return 0; /* no refname exists in packed refs */
 
-   if (packed_refs_lock(&refs->base, 0)) {
-   unable_to_lock_message(refs->path, errno, err);
+   if (packed_refs_lock(&refs->base, 0, err))
return -1;
-   }
+
packed = get_packed_refs(refs);
 
/* Remove refnames from the cache */
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index dbc00d3396..210e3f35ce 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -6,10 +6,10 @@ struct ref_store *packed_ref_store_create(const char *path,
 
 /*
  * Lock the packed-refs file for writing. Flags is passed to
- * hold_lock_file_for_update(). Return 0 on success. On errors, set
- * errno appropriately and return a nonzero value.
+ * hold_lock_file_for_update(). Return 0 on success. On errors, write
+ * an error message to `err` and return a nonzero value.
  */
-int packed_refs_lock(struct ref_store *ref_store, int flags);
+int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf 
*err);
 
 void add_packed_ref(struct ref_store *ref_store,
const char *refname, const struct object_id *oid);
-- 
2.11.0



[PATCH 25/28] clear_packed_ref_cache(): don't protest if the lock is held

2017-06-15 Thread Michael Haggerty
The existing callers already check that the lock isn't held just
before calling `clear_packed_ref_cache()`, and in the near future we
want to be able to call this function when the lock is held.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index ff6326ddb8..1732e3aad4 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -133,8 +133,6 @@ static void clear_packed_ref_cache(struct packed_ref_store 
*refs)
if (refs->cache) {
struct packed_ref_cache *cache = refs->cache;
 
-   if (is_lock_file_locked(&refs->lock))
-   die("BUG: packed-ref cache cleared while locked");
refs->cache = NULL;
release_packed_ref_cache(cache);
}
-- 
2.11.0



Re: [PATCH] checkout: don't write merge results into the object database

2017-06-15 Thread René Scharfe
Am 15.06.2017 um 15:57 schrieb Jeff King:
> On Thu, Jun 15, 2017 at 01:33:42PM +0200, René Scharfe wrote:
> 
>> Merge results need to be written to the worktree, of course, but we
>> don't necessarily need object entries for them, especially if they
>> contain conflict markers.  Use pretend_sha1_file() to create fake
>> blobs to pass to make_cache_entry() and checkout_entry() instead.
> 
> Conceptually this makes sense, although I have a comment below.
> 
> Out of curiosity, did this come up when looking into the cherry-pick
> segfault/error from a few hours ago?

No, it came up in the discussion of Dscho's memory leak plug series [1].

>> @@ -225,8 +225,8 @@ static int checkout_merged(int pos, const struct 
>> checkout *state)
>>   * (it also writes the merge result to the object database even
>>   * when it may contain conflicts).
>>   */
>> -if (write_sha1_file(result_buf.ptr, result_buf.size,
>> -blob_type, oid.hash))
>> +if (pretend_sha1_file(result_buf.ptr, result_buf.size,
>> +  OBJ_BLOB, oid.hash))
>>  die(_("Unable to add merge result for '%s'"), path);
>>  free(result_buf.ptr);
> 
> I wondered if pretend_sha1_file() makes a copy of the buffer, and indeed
> it does. So this is correct.
> 
> But that raises an interesting question: how big are these objects, and
> is it a good idea to store them in RAM? Obviously they're in RAM
> already, but I'm not sure if it's one-at-a-time. We could be bumping the
> peak RAM used if there's a large number of these entries. Touching the
> on-disk odb does feel hacky, but it also means they behave like other
> objects.
> 
> If it is a concern, I think it could be solved by "unpretending" after
> our call to checkout_entry completes. That would need a new call in
> sha1_file.c, but it should be easy to write.

Good point; we'd accumulate fake entries that we'll never need again.
The patch should clean them up.

Alternatively we could finally address the NEEDSWORK comment and
provide a way to checkout a file without faking an index entry..

René


[1] 
https://public-inbox.org/git/2704e145927c851c4163a68cfdfd5ada48fff21d.1493906085.git.johannes.schinde...@gmx.de/


Re: What's cooking in git.git (Jun 2017, #04; Tue, 13)

2017-06-15 Thread Junio C Hamano
Jonathan Nieder  writes:

>>  It is not known if a simple "yes/no" is sufficient in the longer
>>  term, and what should happen when --recurse-submodules option starts
>>  taking "recurse into them how?" parameter, though.
>
> Any pointers for where this has been discussed, if anywhere (e.g. was

If this were discussed, then the answer to the question we may know
by now ;-)  I do not think anybody gave a serious thought to convince
the public why a boolean is enough, hence this comment.

>> * bw/config-h (2017-06-13) 4 commits
>>  - config: don't implicitly use gitdir
>>  - config: don't include config.h by default
>>  - config: remove git_config_iter
>>  - config: create config.h
>>
>>  Code clean-up.
>
> Patches 1-3 are good to go IMHO.
>
> Patch 4 in pu is marked with my Reviewed-by.  I think it's getting
> there but not there yet.  Did some script pull the tag from my reply
> to the cover letter?

No, nothing that elaborate.  

I go through each message in Gnus newsreader and feed the article to
a shell command, e.g. "Meta/add-by -r jrnieder@ | git am -s3c".  The
UI remembers the last command I used when I choose to feed the next
article to a shell command, and after running it to first three, I
forgot to remove the 'add-by' bit from the command line for the fourth
one.



Re: [PATCH 2/2] date: use localtime() for "-local" time formats

2017-06-15 Thread René Scharfe

Am 15.06.2017 um 15:52 schrieb Jeff King:

But for the special case of the "-local" formats, we can
just skip the adjustment and use localtime() instead of
gmtime(). This makes --date=format-local:%Z work correctly,
showing the local timezone instead of an empty string.


Documentation/rev-list-options.txt should be updated to mention that %Z
is passed to strftime in the local case, no?


The new test checks the result for "UTC", our default
test-lib value for $TZ. Using something like EST5 might be
more interesting, but the actual zone string is
system-dependent (for instance, on my system it expands to
just EST). Hopefully "UTC" is vanilla enough that every
system treats it the same.

Signed-off-by: Jeff King 
---
I don't have a Windows system to test this on, but from the output Dscho
provided earlier, I believe this should pass.


The first patch applies with some fuzz on master of Git for Windows, the
second one applies cleanly.  A "typedef unsigned long timestamp_t;" is
required to compile it; such a fixup won't be needed for long, I guess.
t0006 succeeds.

René


git diff sometimes brings up buggy pager

2017-06-15 Thread Matthew Groth

When I do `git diff` sometimes I get this:


...skipping...
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
...skipping...
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
...skipping...
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~


 it goes on like this for about 10 times the length. Looks like this 
happens exclusively when I use git diff with a github remote that is at 
the same commit. I will update if I find any other case where this happens.







Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Ævar Arnfjörð Bjarmason

On Thu, Jun 15 2017, Jeff King jotted:

> On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote:
>
>> On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote:
>> > Footnote *1*: SHA-256, as all hash functions whose output is essentially
>> > the entire internal state, are susceptible to a so-called "length
>> > extension attack", where the hash of a secret+message can be used to
>> > generate the hash of secret+message+piggyback without knowing the secret.
>> > This is not the case for Git: only visible data are hashed. The type of
>> > attacks Git has to worry about is very different from the length extension
>> > attacks, and it is highly unlikely that that weakness of SHA-256 leads to,
>> > say, a collision attack.
>>
>> What do the experts think or SHA512/256, which completely removes the
>> concerns over length extension attack? (which I'd argue is better than
>> sweeping them under the carpet)
>
> I don't think it's sweeping them under the carpet. Git does not use the
> hash as a MAC, so length extension attacks aren't a thing (and even if
> we later wanted to use the same algorithm as a MAC, the HMAC
> construction is a well-studied technique for dealing with it).
>
> That said, SHA-512 is typically a little faster than SHA-256 on 64-bit
> platforms. I don't know if that will change with the advent of hardware
> instructions oriented towards SHA-256.

Quoting my own
cacbzzx7jra2niwt9wsgaxnzs+gws8htugzwm8nay1gs87o8...@mail.gmail.com sent
~2 weeks ago to the list:

On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder  wrote:
[...]
> 4. When choosing a hash function, people may argue about performance.
>It would be useful for run some benchmarks for git (running
>the test suite, t/perf tests, etc) using a variety of hash
>functions as input to such a discussion.

To the extent that such benchmarks matter, it seems prudent to heavily
weigh them in favor of whatever seems to be likely to be the more
common hash function going forward, since those are likely to get
faster through future hardware acceleration.

E.g. Intel announced Goldmont last year which according to one SHA-1
implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They
only have acceleration for SHA-1 and SHA-256[2]

1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385

2. https://en.wikipedia.org/wiki/Goldmont

Maybe someone else knows of better numbers / benchmarks, but such a
reduction in CBP likely makes it faster than SHA-512.


[PATCH] diff-highlight: split code into module

2017-06-15 Thread Jeff King
The diff-so-fancy project is also written in perl, and most
of its users pipe diffs through both diff-highlight and
diff-so-fancy. It would be nice if this could be done in a
single script. So let's pull most of diff-highlight's code
into its own module which can be used by diff-so-fancy.

In addition, we'll abstract a few basic items like reading
from stdio so that a script using the module can do more
processing before or after diff-highlight handles the lines.
See the README update for more details.

One small downside is that the diff-highlight script must
now be built using the Makefile. There are ways around this,
but it quickly gets into perl arcana. Let's go with the
simple solution. As a bonus, our Makefile now respects the
PERL_PATH variable if it is set.

Signed-off-by: Jeff King 
---
Scott and I discussed this off-list, and this was the least-gross
solution I came up with.  The plan would be for diff-so-fancy to pull in
this copy of diff-highlight from git.git and have a wrapper script
similar to the diff-highlight.perl found here.

 contrib/diff-highlight/.gitignore  |  2 ++
 .../{diff-highlight => DiffHighlight.pm}   | 40 +-
 contrib/diff-highlight/Makefile| 21 ++--
 contrib/diff-highlight/README  | 30 
 contrib/diff-highlight/diff-highlight.perl |  8 +
 5 files changed, 82 insertions(+), 19 deletions(-)
 create mode 100644 contrib/diff-highlight/.gitignore
 rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%)
 mode change 100755 => 100644
 create mode 100644 contrib/diff-highlight/diff-highlight.perl

diff --git a/contrib/diff-highlight/.gitignore 
b/contrib/diff-highlight/.gitignore
new file mode 100644
index 0..c07454824
--- /dev/null
+++ b/contrib/diff-highlight/.gitignore
@@ -0,0 +1,2 @@
+shebang.perl
+diff-highlight
diff --git a/contrib/diff-highlight/diff-highlight 
b/contrib/diff-highlight/DiffHighlight.pm
old mode 100755
new mode 100644
similarity index 91%
rename from contrib/diff-highlight/diff-highlight
rename to contrib/diff-highlight/DiffHighlight.pm
index 81bd8040e..663992e53
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/DiffHighlight.pm
@@ -1,4 +1,4 @@
-#!/usr/bin/perl
+package DiffHighlight;
 
 use 5.008;
 use warnings FATAL => 'all';
@@ -29,13 +29,14 @@ my @removed;
 my @added;
 my $in_hunk;
 
-# Some scripts may not realize that SIGPIPE is being ignored when launching the
-# pager--for instance scripts written in Python.
-$SIG{PIPE} = 'DEFAULT';
+our $line_cb = sub { print @_ };
+our $flush_cb = sub { local $| = 1 };
+
+sub handle_line {
+   local $_ = shift;
 
-while (<>) {
if (!$in_hunk) {
-   print;
+   $line_cb->($_);
$in_hunk = /^$GRAPH*$COLOR*\@\@ /;
}
elsif (/^$GRAPH*$COLOR*-/) {
@@ -49,7 +50,7 @@ while (<>) {
@removed = ();
@added = ();
 
-   print;
+   $line_cb->($_);
$in_hunk = /^$GRAPH*$COLOR*[\@ ]/;
}
 
@@ -62,15 +63,22 @@ while (<>) {
# place to flush. Flushing on a blank line is a heuristic that
# happens to match git-log output.
if (!length) {
-   local $| = 1;
+   $flush_cb->();
}
 }
 
-# Flush any queued hunk (this can happen when there is no trailing context in
-# the final diff of the input).
-show_hunk(\@removed, \@added);
+sub flush {
+   # Flush any queued hunk (this can happen when there is no trailing
+   # context in the final diff of the input).
+   show_hunk(\@removed, \@added);
+}
 
-exit 0;
+sub highlight_stdin {
+   while () {
+   handle_line($_);
+   }
+   flush();
+}
 
 # Ideally we would feed the default as a human-readable color to
 # git-config as the fallback value. But diff-highlight does
@@ -88,7 +96,7 @@ sub show_hunk {
 
# If one side is empty, then there is nothing to compare or highlight.
if (!@$a || !@$b) {
-   print @$a, @$b;
+   $line_cb->(@$a, @$b);
return;
}
 
@@ -97,17 +105,17 @@ sub show_hunk {
# stupid, and only handle multi-line hunks that remove and add the same
# number of lines.
if (@$a != @$b) {
-   print @$a, @$b;
+   $line_cb->(@$a, @$b);
return;
}
 
my @queue;
for (my $i = 0; $i < @$a; $i++) {
my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
-   print $rm;
+   $line_cb->($rm);
push @queue, $add;
}
-   print @queue;
+   $line_cb->(@queue);
 }
 
 sub highlight_pair {
diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile
index 901872452..fbf5c5824 100644
--- a/contrib/diff-highlight/Makefile
+++ b/contrib/diff-highlight/Makefile
@@ -1,5 +1,20 @@
-# nothing to build

Re: [PATCH v1] Configure Git contribution guidelines for github.com

2017-06-15 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> There are things we get out of this that would regress if
> submitGit were changed this way:
>
>  * Now when you submit a pull request you get a Travis build for
> git/git, I don't get this if I push to any random branch in my
> avar/git, and although I could probably set it up, it's extra work
> etc.

Thanks for pointing this out.  I agree that this indeed is a
downside.

>  * I like being able to "git fetch" patches I'm reviewing. Now I can
> just set "+refs/pull/*:refs/remotes/origin/pull/*" in the refspec for
> git/git, if it were pulling from target repos I'd need to "git remote
> add" for each one, not a big deal, but less convenient.
>
>  * There would be no single place to list submitGit requests, which
> you can do now through the pull page, although I think this is largely
> stale now because nothing auto-closes them if they're merged (by which
> point they'll have different sha1s), but that could be improved with
> some bot...

I do not think these two are 'regressions', unless your definition
of 'regression' is a "this thing I used to be able to do, now I no
longer can", which is slightly different from mine, which is "this
useful thing I used to be able to do, now I no longer can".

It would be useful to be able to do the above two things, if the
list of submitGit requests and what you can get from pull/*
hierarchy (1) covered most of the changes proposed, allowing people
to check only this place and nowhere else, and/or (2) covered the
more interesting changes that are worth looking at than other
proposed changes.

I do not think either is the case.  The submitGit mechanism being an
easier way to propose a change to the mailing list, by definition,
(1) will not hold true.  And among the changes proposed to be made
to Git, the "selection criteria" to be in the set to be discoverable
by going to github.com/git/git.git and checking the submitGit pull
requests is not that they are of higher quality or touch interesting
topics.  The only common trait these proposed changes share, compared
to other proposed changes we see on the mailing list, are that they
were originally made as pull requests and (2) will not hold true.

Another thing that may regress that you did not mention is that we
would lose a convenient way to _count_ proposed changes coming via
submitGit (i.e. you can simply go to the pull-request page), so that
the number can be compared with the number of proposed changes
directly made on the mailing list, which would be a good way to
gauge how submitGit service is helping our community.  But even for
that, you'd need to go to the list to find the denominator
(i.e. total number of changes proposed), and by the time you do
that, counting the numerator (i.e. those come via submitGit) by
finding the telltale sign submitGit leaves in its output among these
denominator messages should be trivial.

So in all, I see the only downside is the loss of automated
triggering of Travis.  But I do agree with you that it is a rather
significant downside.

> There's probably ways to do this without git/git pull requests, but I
> don't see what problem would be solved by moving this off git/git,
> even if there's open requests submitGit is the only thing using these,
> and any confusion about that pull UI would remain if it wasn't (AFAIK
> there's no way to completely disable pull requests on github, but I
> may be wrong).

Hopefully the pull-request template update Lars proposes will keep
the pull request useful, in that they create one and then have
submitGit relay it to the official channel.


Re: [PATCH v2 1/2] git-compat-util: add a FREEZ() wrapper around free(ptr); ptr = NULL

2017-06-15 Thread Junio C Hamano
Jeff King  writes:

> On Sat, Jun 10, 2017 at 03:21:43AM +, Eric Wong wrote:
>
>> > So make Jonathan's freez_impl a public API and rename it to
>> > free_and_null(), perhaps?
>> 
>> Perhaps...  I think it needs to take "void *" to avoid warnings:
>> 
>>  static inline void free_and_null(void *ptrptr)
>>  {
>>  void **tmp = ptrptr;
>> 
>>  free(*tmp);
>>  *tmp = NULL;
>>  }
>
> That unfortunately makes it very easy to get it wrong in the callers.
> Both:
>
>   free_and_null(&p);
>
> and
>
>   free_and_null(p);
>
> would be accepted by the compiler, but one of them causes undefined
> behavior.
>
> Unfortunately using "void **" in the declaration doesn't work, because
> C's implicit casting rules don't apply to pointer-to-pointer types.

All true.  

I still think the macro FREEZ() is too confusing a name to live;
perhaps we can take Ævar's patch with s/FREEZ/FREE_AND_NULL/ and be
done with it?  By spelling it in all caps, readers will know that
there is something special going on in that macro, and Eric's
"forcing the readers to type & in front to let them be aware that
the ptr variable is being manipulated" may become less necessary.



Re: [PATCH v2 2/4] sha1_file: move delta base cache code up

2017-06-15 Thread Junio C Hamano
Jonathan Tan  writes:

> In a subsequent patch, packed_object_info() will be modified to use the
> delta base cache, so move the relevant code to before
> packed_object_info().
>
> Signed-off-by: Jonathan Tan 
> ---
>  sha1_file.c | 226 
> +++-
>  1 file changed, 116 insertions(+), 110 deletions(-)

Hmph, is this meant to be just moving two whole functions?

> diff --git a/sha1_file.c b/sha1_file.c
> index a52b27541..a158907d1 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2239,116 +2239,6 @@ static enum ...
> ...
> -int packed_object_info(struct packed_git *p, off_t obj_offset,
> -struct object_info *oi)
> -{
> -...
> - if (oi->delta_base_sha1) {
> - if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
> - const unsigned char *base;
> -
> - base = get_delta_base_sha1(p, &w_curs, curpos,
> -type, obj_offset);
> - if (!base) {
> - type = OBJ_BAD;
> - goto out;
> - }
> -
> - hashcpy(oi->delta_base_sha1, base);
> - } else
> - hashclr(oi->delta_base_sha1);
> - }
> -
> -out:
> - unuse_pack(&w_curs);
> - return type;
> -}
> -...

The above is what was removed, while ...

> @@ -2486,6 +2376,122 @@ static void ...
> ...
> +int packed_object_info(struct packed_git *p, off_t obj_offset,
> +struct object_info *oi)
> +{
> +...
> + if (oi->delta_base_sha1) {
> + if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
> + const unsigned char *base;
> +
> + base = get_delta_base_sha1(p, &w_curs, curpos,
> +type, obj_offset);
> + if (!base) {
> + type = OBJ_BAD;
> + goto out;
> + }
> +
> + hashcpy(oi->delta_base_sha1, base);
> + } else
> + hashclr(oi->delta_base_sha1);
> + }
> +
> + oi->whence = OI_PACKED;
> + oi->u.packed.offset = obj_offset;
> + oi->u.packed.pack = p;
> + oi->u.packed.is_delta = (type == OBJ_REF_DELTA ||
> +  type == OBJ_OFS_DELTA);
> +
> +out:
> + unuse_pack(&w_curs);
> + return type;
> +}

... we somehow gained code to update *oi that used to be (and still
is) done by its sole caller, sha1_object_info_extended().

Perhaps this is a rebase-gotcha?


Re: [PATCH v2 1/2] git-compat-util: add a FREEZ() wrapper around free(ptr); ptr = NULL

2017-06-15 Thread Ævar Arnfjörð Bjarmason
On Thu, Jun 15, 2017 at 6:48 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> On Sat, Jun 10, 2017 at 03:21:43AM +, Eric Wong wrote:
>>
>>> > So make Jonathan's freez_impl a public API and rename it to
>>> > free_and_null(), perhaps?
>>>
>>> Perhaps...  I think it needs to take "void *" to avoid warnings:
>>>
>>>  static inline void free_and_null(void *ptrptr)
>>>  {
>>>  void **tmp = ptrptr;
>>>
>>>  free(*tmp);
>>>  *tmp = NULL;
>>>  }
>>
>> That unfortunately makes it very easy to get it wrong in the callers.
>> Both:
>>
>>   free_and_null(&p);
>>
>> and
>>
>>   free_and_null(p);
>>
>> would be accepted by the compiler, but one of them causes undefined
>> behavior.
>>
>> Unfortunately using "void **" in the declaration doesn't work, because
>> C's implicit casting rules don't apply to pointer-to-pointer types.
>
> All true.
>
> I still think the macro FREEZ() is too confusing a name to live;
> perhaps we can take Ævar's patch with s/FREEZ/FREE_AND_NULL/ and be
> done with it?  By spelling it in all caps, readers will know that
> there is something special going on in that macro, and Eric's
> "forcing the readers to type & in front to let them be aware that
> the ptr variable is being manipulated" may become less necessary.

I'll change it to FREE_AND_NULL and submit my patch as-is, my reading
of the rest of this thread is that making it a function instead of a
macro would be interesting, but has its own caveats that are likely
better considered as part of its own series, whereas this just changes
existing code to its macro-expanded functional equivalent.


Re: [PATCH v1] Configure Git contribution guidelines for github.com

2017-06-15 Thread Andreas Heiduk
Am 15.06.2017 um 18:43 schrieb Junio C Hamano:
> Another thing that may regress that you did not mention is that we
> would lose a convenient way to _count_ proposed changes coming via
> submitGit (i.e. you can simply go to the pull-request page), so that
> the number can be compared with the number of proposed changes
> directly made on the mailing list, which would be a good way to
> gauge how submitGit service is helping our community.  But even for
> that, you'd need to go to the list to find the denominator
> (i.e. total number of changes proposed), and by the time you do
> that, counting the numerator (i.e. those come via submitGit) by
> finding the telltale sign submitGit leaves in its output among these
> denominator messages should be trivial.

This numbers can be aquired quite easily if submitGit adds a defined
trailer to the converted commit message like this:

Signed-off-by: Foo Bar 
Submit-git-id: url-or-id-of-pr



Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Brandon Williams
On 06/15, Johannes Schindelin wrote:
> Hi,
> 
> I thought it better to revive this old thread rather than start a new
> thread, so as to automatically reach everybody who chimed in originally.
> 
> On Mon, 6 Mar 2017, Brandon Williams wrote:
> 
> > On 03/06, brian m. carlson wrote:
> >
> > > On Sat, Mar 04, 2017 at 06:35:38PM -0800, Linus Torvalds wrote:
> > >
> > > > Btw, I do think the particular choice of hash should still be on the
> > > > table. sha-256 may be the obvious first choice, but there are
> > > > definitely a few reasons to consider alternatives, especially if
> > > > it's a complete switch-over like this.
> > > > 
> > > > One is large-file behavior - a parallel (or tree) mode could improve
> > > > on that noticeably. BLAKE2 does have special support for that, for
> > > > example. And SHA-256 does have known attacks compared to SHA-3-256
> > > > or BLAKE2 - whether that is due to age or due to more effort, I
> > > > can't really judge. But if we're switching away from SHA1 due to
> > > > known attacks, it does feel like we should be careful.
> > > 
> > > I agree with Linus on this.  SHA-256 is the slowest option, and it's
> > > the one with the most advanced cryptanalysis.  SHA-3-256 is faster on
> > > 64-bit machines (which, as we've seen on the list, is the overwhelming
> > > majority of machines using Git), and even BLAKE2b-256 is stronger.
> > > 
> > > Doing this all over again in another couple years should also be a
> > > non-goal.
> > 
> > I agree that when we decide to move to a new algorithm that we should
> > select one which we plan on using for as long as possible (much longer
> > than a couple years).  While writing the document we simply used
> > "sha256" because it was more tangible and easier to reference.
> 
> The SHA-1 transition *requires* a knob telling Git that the current
> repository uses a hash function different from SHA-1.
> 
> It would make *a whole of a lot of sense* to make that knob *not* Boolean,
> but to specify *which* hash function is in use.

100% agree on this point.  I believe the current plan is to have the
hashing function used for a repository be a repository format extension
which would be a value (most likely a string like 'sha1', 'sha256',
'black2', etc) stored in a repository's .git/config.  This way, upon
startup git will die or ignore a repository which uses a hashing
function which it does not recognize or does not compiled to handle.

I hope (and expect) that the end produce of this transition is a nice,
clean hashing API and interface with sufficient abstractions such that
if I wanted to switch to a different hashing function I would just need
to implement the interface with the new hashing function and ensure that
'verify_repository_format' allows the new function.

> 
> That way, it will be easier to switch another time when it becomes
> necessary.
> 
> And it will also make it easier for interested parties to use a different
> hash function in their infrastructure if they want.
> 
> And it lifts part of that burden that we have to consider *very carefully*
> which function to pick. We still should be more careful than in 2005, when
> Git was born, and when, incidentally, when the first attacks on SHA-1
> became known, of course. We were just lucky for almost 12 years.
> 
> Now, with Dunning-Kruger in mind, I feel that my degree in mathematics
> equips me with *just enough* competence to know just how little *even I*
> know about cryptography.
> 
> The smart thing to do, hence, was to get involved in this discussion and
> act as Lt Tawney Madison between us Git developers and experts in
> cryptography.
> 
> It just so happens that I work at a company with access to excellent
> cryptographers, and as we own the largest Git repository on the planet, we
> have a vested interest in ensuring Git's continued success.
> 
> After a couple of conversations with a couple of experts who I cannot
> thank enough for their time and patience, let alone their knowledge about
> this matter, it would appear that we may not have had a complete enough
> picture yet to even start to make the decision on the hash function to
> use.
> 

-- 
Brandon Williams


Re: git diff sometimes brings up buggy pager

2017-06-15 Thread Samuel Lijin
Any chance you can tell us what repo this happens on? + git version,
OS, and what the triggering diff invocation is.

On Thu, Jun 15, 2017 at 12:19 PM, Matthew Groth  wrote:
> When I do `git diff` sometimes I get this:
>
>
> ...skipping...
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ...skipping...
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ...skipping...
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
>
>
>  it goes on like this for about 10 times the length. Looks like this
> happens exclusively when I use git diff with a github remote that is at the
> same commit. I will update if I find any other case where this happens.
>
>
>
>


Re: [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-15 Thread Junio C Hamano
Jonathan Tan  writes:

> Looking at the 3 primary functions (sha1_object_info_extended,
> read_object, has_sha1_file_with_flags), they independently implement
> mechanisms such as object replacement, retrying the packed store after
> failing to find the object in the packed store then the loose store, and
> being able to mark a packed object as bad and then retrying the whole
> process. Consolidating these mechanisms would be a great help to
> maintainability.
>
> Therefore, consolidate all 3 functions by extending
> sha1_object_info_extended() to support the functionality needed by all 3
> functions, and then modifying the other 2 to use
> sha1_object_info_extended().

This is a rather "ugly" looking patch ;-) but I followed what
has_sha1_file_with_flags() and read_object() do before and after
this change, and I think this patch is a no-op wrt their behaviour
(which is a good thing).

But I have a very mixed feelings on one aspect of the resulting
sha1_object_info_extended().

>  int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
> *oi, unsigned flags)
>  {
> ...
>   if (!find_pack_entry(real, &e)) {
>   /* Most likely it's a loose object. */
> - if (!sha1_loose_object_info(real, oi, flags)) {
> + if (oi && !sha1_loose_object_info(real, oi, flags)) {
>   oi->whence = OI_LOOSE;
>   return 0;
>   }
> + if (!oi && has_loose_object(real))
> + return 0;

This conversion is not incorrect per-se.  

We can see that has_sha1_file_with_flags() after this patch still
calls has_loose_object().  But it bothers me that there is no hint
to future developers to warn that a rewrite of the above like this
is incorrect:

if (!find_pack_entry(read, &e)) {
/* Most likely it's a loose object. */
   +struct object_info dummy_oi;
   +if (!oi) {
   +memset(&dummy_oi, 0, sizeof(dummy_oi);
   +oi = &dummy_oi;
   +}
   -if (oi && !sha1_loose_object_info(real, oi, flags)) {
   +if (!sha1_loose_object_info(real, oi, flags)) {
oi->whence = OI_LOOSE;
return 0;
}
   -if (!oi && has_loose_object(real))
   -return 0;

It used to be very easy to see that has_sha1_file_with_flags() will
call has_loose_object() when it does not find the object in a pack,
which will result in the loose object file freshened.  In the new
code, it is very subtle to see that---it will happen when the caller
passes oi == NULL, and has_sha1_file_with_flags() is such a caller,
but it is unclear if there are other callers of this "consolidated"
sha1_object_info_extended() that passes oi == NULL, and if they do
also want to freshen the loose object file when they do so.

> @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1)
>  
>  int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
>  {
> - struct pack_entry e;
> + int f = OBJECT_INFO_SKIP_CACHED |
> + ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0);
>  
>   if (!startup_info->have_repository)
>   return 0;
> - if (find_pack_entry(sha1, &e))
> - return 1;
> - if (has_loose_object(sha1))
> - return 1;
> - if (flags & HAS_SHA1_QUICK)
> - return 0;
> - reprepare_packed_git();
> - return find_pack_entry(sha1, &e);
> + return !sha1_object_info_extended(sha1, NULL, f);
>  }

I would have preferred to see the new variable not to be called 'f',
as that makes it unclear what it is (is that a callback function
pointer?).  In this case, uyou are forcing the flag bits passed
down, so perhaps you can reuse the same variable?  

If you allocated a separate variable because
has_sha1_file_with_flags() and sha1_object_info_extended() take flag
bits from two separate vocabularies, that is a valid reasoning, but
if that is the case, then I would have named 'f' to reflect that
fact that this is different from parameter 'flag' that is defined in
the has_sha1_file_with_flags() world, but a different thing that is
defined in sha1_object_info_extended() world, e.g. "soie_flag" or
something like that.

Thanks.


Re: [PATCH v1] Configure Git contribution guidelines for github.com

2017-06-15 Thread Junio C Hamano
Andreas Heiduk  writes:

> Am 15.06.2017 um 18:43 schrieb Junio C Hamano:
>> Another thing that may regress that you did not mention is that we
>> would lose a convenient way to _count_ proposed changes coming via
>> submitGit (i.e. you can simply go to the pull-request page), so that
>> the number can be compared with the number of proposed changes
>> directly made on the mailing list, which would be a good way to
>> gauge how submitGit service is helping our community.  But even for
>> that, you'd need to go to the list to find the denominator
>> (i.e. total number of changes proposed), and by the time you do
>> that, counting the numerator (i.e. those come via submitGit) by
>> finding the telltale sign submitGit leaves in its output among these
>> denominator messages should be trivial.
>
> This numbers can be aquired quite easily if submitGit adds a defined
> trailer to the converted commit message like this:
>
>   Signed-off-by: Foo Bar 
>   Submit-git-id: url-or-id-of-pr

I do not think you would want the noise _in_ the log message.  The
"telltale sign" I had in mind was these "signature" lines at the end
of the message:

--
https://github.com/git/git/pull/538



Re: [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-15 Thread Jonathan Tan
On Thu, 15 Jun 2017 10:50:46 -0700
Junio C Hamano  wrote:

> Jonathan Tan  writes:
> 
> > Looking at the 3 primary functions (sha1_object_info_extended,
> > read_object, has_sha1_file_with_flags), they independently implement
> > mechanisms such as object replacement, retrying the packed store after
> > failing to find the object in the packed store then the loose store, and
> > being able to mark a packed object as bad and then retrying the whole
> > process. Consolidating these mechanisms would be a great help to
> > maintainability.
> >
> > Therefore, consolidate all 3 functions by extending
> > sha1_object_info_extended() to support the functionality needed by all 3
> > functions, and then modifying the other 2 to use
> > sha1_object_info_extended().
> 
> This is a rather "ugly" looking patch ;-) but I followed what
> has_sha1_file_with_flags() and read_object() do before and after
> this change, and I think this patch is a no-op wrt their behaviour
> (which is a good thing).
> 
> But I have a very mixed feelings on one aspect of the resulting
> sha1_object_info_extended().
> 
> >  int sha1_object_info_extended(const unsigned char *sha1, struct 
> > object_info *oi, unsigned flags)
> >  {
> > ...
> > if (!find_pack_entry(real, &e)) {
> > /* Most likely it's a loose object. */
> > -   if (!sha1_loose_object_info(real, oi, flags)) {
> > +   if (oi && !sha1_loose_object_info(real, oi, flags)) {
> > oi->whence = OI_LOOSE;
> > return 0;
> > }
> > +   if (!oi && has_loose_object(real))
> > +   return 0;
> 
> This conversion is not incorrect per-se.  
> 
> We can see that has_sha1_file_with_flags() after this patch still
> calls has_loose_object().  But it bothers me that there is no hint
> to future developers to warn that a rewrite of the above like this
> is incorrect:
> 
> if (!find_pack_entry(read, &e)) {
> /* Most likely it's a loose object. */
>+struct object_info dummy_oi;
>+if (!oi) {
>+memset(&dummy_oi, 0, sizeof(dummy_oi);
>+oi = &dummy_oi;
>+}
>-if (oi && !sha1_loose_object_info(real, oi, flags)) {
>+if (!sha1_loose_object_info(real, oi, flags)) {
> oi->whence = OI_LOOSE;
> return 0;
> }
>-if (!oi && has_loose_object(real))
>-return 0;
> 
> It used to be very easy to see that has_sha1_file_with_flags() will
> call has_loose_object() when it does not find the object in a pack,
> which will result in the loose object file freshened.  In the new
> code, it is very subtle to see that---it will happen when the caller
> passes oi == NULL, and has_sha1_file_with_flags() is such a caller,
> but it is unclear if there are other callers of this "consolidated"
> sha1_object_info_extended() that passes oi == NULL, and if they do
> also want to freshen the loose object file when they do so.

Good point - sorry, I didn't pay much attention to the freshening
behavior. After some thought, I now think it might be better to avoid
modifying has_sha1_file_with_flags(). As it is,
sha1_object_info_extended() already needs special handling (special
flags and handling the possibility of "oi" being NULL) to handle the
functionality required by has_sha1_file_with_flags(); adding yet another
thing to handle (freshen or not) would make it much too complicated.

This means that subsequent patches that modify the handling of
storage-agnostic objects would still need to modify 2 functions, but at
least that is fewer than the current 3.

I'll reroll with these changes so that you (and others) can see what it
looks like.

> > @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1)
> >  
> >  int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
> >  {
> > -   struct pack_entry e;
> > +   int f = OBJECT_INFO_SKIP_CACHED |
> > +   ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0);
> >  
> > if (!startup_info->have_repository)
> > return 0;
> > -   if (find_pack_entry(sha1, &e))
> > -   return 1;
> > -   if (has_loose_object(sha1))
> > -   return 1;
> > -   if (flags & HAS_SHA1_QUICK)
> > -   return 0;
> > -   reprepare_packed_git();
> > -   return find_pack_entry(sha1, &e);
> > +   return !sha1_object_info_extended(sha1, NULL, f);
> >  }
> 
> I would have preferred to see the new variable not to be called 'f',
> as that makes it unclear what it is (is that a callback function
> pointer?).  In this case, uyou are forcing the flag bits passed
> down, so perhaps you can reuse the same variable?  
> 
> If you allocated a separate variable because
> has_sha1_file_with_flags() and sha1_object_info_extended() take flag
> bits from two separate vocabularies, that is a valid reasoning, but
> if that is the cas

Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Junio C Hamano
Jonathan Tan  writes:

> diff --git a/sha1_file.c b/sha1_file.c
> index 98086e21e..75fe2174d 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -27,6 +27,9 @@
>  #include "list.h"
>  #include "mergesort.h"
>  #include "quote.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
> +#include "sha1-lookup.h"
>  
>  #define SZ_FMT PRIuMAX
>  static inline uintmax_t sz_fmt(size_t s) { return s; }
> @@ -1624,6 +1627,72 @@ static const struct packed_git 
> *has_packed_and_bad(const unsigned char *sha1)
>   return NULL;
>  }
>  
> +struct missing_blob_manifest {
> + struct missing_blob_manifest *next;
> + const char *data;
> +};
> +struct missing_blob_manifest *missing_blobs;
> +int missing_blobs_initialized;

I do not think you meant to make these non-static.  The type of the
former is not even visible to the outside world, and the latter is
something that could be made into static to prepare_missing_blobs()
function (unless and until you start allowing the missing-blobs
manifest to be re-initialized).  Your ensure_configured() below
seems to do the "static" right, on the other hand ;-).

Do we expect that we will have only a handful of these missing blob
manifests?  Each manifest seems to be efficiently looked-up with a
binary search, but it makes me wonder if it is a good idea to
consolidate these manifests into a single list of object names to
eliminate the outer loop in has_missing_blob().  Unlike pack .idx
files that must stay one-to-one with .pack files, it appears to me
that there is no reason why we need to keep multiple ones separate
for extended period of time (e.g. whenever we learn that we receieved
an incomplete pack from the other side with a list of newly missing
blobs, we could incorporate that into existing missing blob list).

> +int has_missing_blob(const unsigned char *sha1, unsigned long *size)
> +{

This function that answers "is it expected to be missing?" is
confusingly named.  Is it missing, or does it exist?

> @@ -2981,11 +3050,55 @@ static int sha1_loose_object_info(const unsigned char 
> *sha1,
>   return (status < 0) ? status : 0;
>  }
>  
> +static char *missing_blob_command;
> +static int sha1_file_config(const char *conf_key, const char *value, void 
> *cb)
> +{
> + if (!strcmp(conf_key, "core.missingblobcommand")) {
> + missing_blob_command = xstrdup(value);
> + }
> + return 0;
> +}
> +
> +static int configured;
> +static void ensure_configured(void)
> +{
> + if (configured)
> + return;

Do not be selfish and pretend that this is the _only_ kind of
configuration that needs to be done inside sha1_file.c.  Call the
function ensure__is_configured() and rename the run-once
guard to match.

The run-once guard can be made static to the "ensure" function, and
if you do so, then its name can stay to be "configured", as at that
point it is clear what it is guarding.

> diff --git a/t/t3907-missing-blob.sh b/t/t3907-missing-blob.sh
> new file mode 100755
> index 0..e0ce0942d
> --- /dev/null
> +++ b/t/t3907-missing-blob.sh
> @@ -0,0 +1,69 @@
> +#!/bin/sh
> +
> +test_description='core.missingblobcommand option'
> +
> +. ./test-lib.sh
> +
> +pack() {

Style: "pack () {"

> + perl -e '$/ = undef; $input = <>; print pack("H*", $input)'

high-nybble first to match ntohll() done in has_missing_blob()?  OK.

> +}
> +
> +test_expect_success 'sha1_object_info_extended and read_sha1_file (through 
> git cat-file -p)' '
> + rm -rf server client &&
> +
> + git init server &&
> + test_commit -C server 1 &&
> + test_config -C server uploadpack.allowanysha1inwant 1 &&
> + HASH=$(git hash-object server/1.t) &&
> +
> + git init client &&
> + test_config -C client core.missingblobcommand \
> + "git -C \"$(pwd)/server\" pack-objects --stdout | git 
> unpack-objects" &&
> +
> + # does not work if missing blob is not registered
> + test_must_fail git -C client cat-file -p "$HASH" &&
> +
> + mkdir -p client/.git/objects/missing &&
> + printf "%016x%s%016x" 1 "$HASH" "$(wc -c  + pack >client/.git/objects/missing/x &&
> +
> + # works when missing blob is registered
> + git -C client cat-file -p "$HASH"
> +'

OK, by passing printf '%016x', implementations of "$(wc -c)" that
gives extra whitespace around its output can still work correctly.
Good.


Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread Junio C Hamano
René Scharfe  writes:

> Am 13.06.2017 um 23:20 schrieb Junio C Hamano:
>
>> I think the real question is how likely people use more than one
>> occurrence of the same thing in their custom format, and how deeply
>> they care that --format='%h %h' costs more than --format='%h'.  The
>> cost won't of course be double (because the main traversal costs
>> without any output), but it would be rather unreasonable to expect
>> that --format='%h %h %h %h %h' to cost the same as --format='%h';
>> after all, Git is doing more for them ;-)
>
> The answer to the first half is obviously "very likely" -- otherwise
> this bug wouldn't have been found, right? :)

Not really.  There was only one (this one) after all these years.
The question we are asking is not "very rarely this is used and we
can afford to leave it broken?"  It is "very rarely this is used
and we can afford not to optimize for that rare use case?".

> Regarding the question of how bad a 50% slowdown for a second %h
> would be: No idea.  If ran interactively it may not even be noticeable
> because the user can read the first few lines in less while the rest
> is prepared in the background.  We don't have a perf test for formats
> with duplicate short hashes, so we don't promise anything, right? :)

OK.

> -- >8 --
> Subject: [PATCH] pretty: recalculate duplicate short hashes
>
> b9c6232138 (--format=pretty: avoid calculating expensive expansions
> twice) optimized adding short hashes multiple times by using the
> fact that the output strbuf was only ever simply appended to and
> copying the added string from the previous run.  That prerequisite
> is no longer given; we now have modfiers like %< and %+ that can
> cause the cache to lose track of the correct offsets.  Remove it.
>
> Reported-by: Michael Giuffrida 
> Signed-off-by: Rene Scharfe 
> ---
> I'm sending this out in the hope that there might be a simple way
> to fix it after all, like Gábor's patch does for %+.  %< and %>
> seem to be the only other problematic modifiers for now -- I'm
> actually surprised that %w seems to be OK.

Thanks, this looks like a sensible first step.  Will queue.


[ANNOUNCE] Git for Windows 2.13.1(2)

2017-06-15 Thread Johannes Schindelin
Dear Git users,

It is my pleasure to announce that Git for Windows 2.13.1(2) is available from:

https://git-for-windows.github.io/

Changes since Git for Windows v2.13.1 (June 13th 2017)

Bug Fixes

  * git commit and git status no longer randomly throw segmentation
faults.

Filename | SHA-256
 | ---
Git-2.13.1.2-64-bit.exe | 
cd11e57bd25c4d8fde0a7568d19bf3fc6418dd23080901414309b144e2bf0b32
Git-2.13.1.2-32-bit.exe | 
5eb854b666a77a2efc0119fc144cbba1e01a716c542f4259af1dbd4323d68fe9
PortableGit-2.13.1.2-64-bit.7z.exe | 
2c98f6cab688d585d68896c8954e4849c70b33a34f8b5b6009d2ba56ddd95c43
PortableGit-2.13.1.2-32-bit.7z.exe | 
7eeccb6aa3aa294a05538a913f465b9ddeb36160126caf709b378bb78630216b
MinGit-2.13.1.2-64-bit.zip | 
9d3d572f275ebf69ea14bb4abfda64af78c738d2db8a54ee1f9f9db7cdfadf74
MinGit-2.13.1.2-32-bit.zip | 
4b643c986a8c2455cddd2338a3c892acf111d3833384e866410785f9ea073a1a
Git-2.13.1.2-64-bit.tar.bz2 | 
6fc4fa4903ff974f3960c4422269beeb3f02176029b69db0d6090986b21a9206
Git-2.13.1.2-32-bit.tar.bz2 | 
9476b762c4eb007d82627e34b7b0fde6ddfae8c78f3b1d29518c68dd65f4a4e2

Ciao,
Johannes


Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Jonathan Nieder
Hi Dscho,

Johannes Schindelin wrote:

> From what I read, pretty much everybody who participated in the discussion
> was aware that the essential question is: performance vs security.

I don't completely agree with this framing.  The essential question is:
how to get the right security properties without abysmal performance.

> It turns out that we can have essentially both.
>
> SHA-256 is most likely the best-studied hash function we currently know
[... etc ...]

Thanks for a thoughtful restart to the discussion.  This is much more
concrete than your previous objections about process, and that is very
helpful.

In the interest of transparency: here are my current questions for
cryptographers to whom I have forwarded this thread.  Several of these
questions involve predictions or opinions, so in my ideal world we'd
want multiple, well reasoned answers to them.  Please feel free to
forward them to appropriate people or add more.

 1. Now it sounds like SHA-512/256 is the safest choice (see also Mike
Hommey's response to Dscho's message).  Please poke holes in my
understanding.

 2. Would you be willing to weigh in publicly on the mailing list? I
think that would be the most straightforward way to move this
forward (and it would give you a chance to ask relevant questions,
etc).  Feel free to contact me privately if you have any questions
about how this particular mailing list works.

 3. On the speed side, Dscho states "SHA-256 will be faster than BLAKE
(and even than BLAKE2) once the Intel and AMD CPUs with hardware
support for SHA-256 become common."  Do you agree?

 4. On the security side, Dscho states "to compete in the SHA-3
contest, BLAKE added complexity so that it would be roughly on par
with its competitors.  To allow for faster execution in software,
this complexity was *removed* from BLAKE to create BLAKE2, making
it weaker than SHA-256."  Putting aside the historical questions,
do you agree with this "weaker than" claim?

 5. On the security side, Dscho states, "The type of attacks Git has to
worry about is very different from the length extension attacks,
and it is highly unlikely that that weakness of SHA-256 leads to,
say, a collision attack", and Jeff King states, "Git does not use
the hash as a MAC, so length extension attacks aren't a thing (and
even if we later wanted to use the same algorithm as a MAC, the
HMAC construction is a well-studied technique for dealing with
it)."  Is this correct in spirit?  Is SHA-256 equally strong to
SHA-512/256 for Git's purposes, or are the increased bits of
internal state (or other differences) relevant?  How would you
compare the two functions' properties?

 6. On the speed side, Jeff King states "That said, SHA-512 is
typically a little faster than SHA-256 on 64-bit platforms. I
don't know if that will change with the advent of hardware
instructions oriented towards SHA-256."  Thoughts?

 7. If the answer to (2) is "no", do I have permission to quote or
paraphrase your replies that were given here?

Thanks, sincerely,
Jonathan


Re: [PATCH] diff-highlight: split code into module

2017-06-15 Thread Junio C Hamano
Jeff King  writes:

> The diff-so-fancy project is also written in perl, and most
> of its users pipe diffs through both diff-highlight and
> diff-so-fancy. It would be nice if this could be done in a
> single script. So let's pull most of diff-highlight's code
> into its own module which can be used by diff-so-fancy.
>
> In addition, we'll abstract a few basic items like reading
> from stdio so that a script using the module can do more
> processing before or after diff-highlight handles the lines.
> See the README update for more details.
>
> One small downside is that the diff-highlight script must
> now be built using the Makefile. There are ways around this,
> but it quickly gets into perl arcana. Let's go with the
> simple solution. As a bonus, our Makefile now respects the
> PERL_PATH variable if it is set.
>
> Signed-off-by: Jeff King 
> ---
> Scott and I discussed this off-list, and this was the least-gross
> solution I came up with.  The plan would be for diff-so-fancy to pull in
> this copy of diff-highlight from git.git and have a wrapper script
> similar to the diff-highlight.perl found here.
>
>  contrib/diff-highlight/.gitignore  |  2 ++
>  .../{diff-highlight => DiffHighlight.pm}   | 40 
> +-
>  contrib/diff-highlight/Makefile| 21 ++--
>  contrib/diff-highlight/README  | 30 
>  contrib/diff-highlight/diff-highlight.perl |  8 +
>  5 files changed, 82 insertions(+), 19 deletions(-)
>  create mode 100644 contrib/diff-highlight/.gitignore
>  rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%)
>  mode change 100755 => 100644
>  create mode 100644 contrib/diff-highlight/diff-highlight.perl

Do you want +x bit for the last one?  I could throw that bit in
while queuing if you want.

Thanks.


Re: [PATCH] diff-highlight: split code into module

2017-06-15 Thread Junio C Hamano
Junio C Hamano  writes:

>>  contrib/diff-highlight/.gitignore  |  2 ++
>>  .../{diff-highlight => DiffHighlight.pm}   | 40 
>> +-
>>  contrib/diff-highlight/Makefile| 21 ++--
>>  contrib/diff-highlight/README  | 30 
>>  contrib/diff-highlight/diff-highlight.perl |  8 +
>>  5 files changed, 82 insertions(+), 19 deletions(-)
>>  create mode 100644 contrib/diff-highlight/.gitignore
>>  rename contrib/diff-highlight/{diff-highlight => DiffHighlight.pm} (91%)
>>  mode change 100755 => 100644
>>  create mode 100644 contrib/diff-highlight/diff-highlight.perl
>
> Do you want +x bit for the last one?  I could throw that bit in
> while queuing if you want.

Ah, I do not think you want it, as this is not something to be
executed as-is (it is a source file, which diff-highlight that is
executable is made out of).



Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Junio C Hamano
Brandon Williams  writes:

>> It would make a whole of a lot of sense to make that knob not Boolean,
>> but to specify which hash function is in use.
>
> 100% agree on this point.  I believe the current plan is to have the
> hashing function used for a repository be a repository format extension
> which would be a value (most likely a string like 'sha1', 'sha256',
> 'black2', etc) stored in a repository's .git/config.  This way, upon
> startup git will die or ignore a repository which uses a hashing
> function which it does not recognize or does not compiled to handle.
>
> I hope (and expect) that the end produce of this transition is a nice,
> clean hashing API and interface with sufficient abstractions such that
> if I wanted to switch to a different hashing function I would just need
> to implement the interface with the new hashing function and ensure that
> 'verify_repository_format' allows the new function.

Yup.  I thought that part has already been agreed upon, but it is a
good thing that somebody is writing it down (perhaps "again", if not
"for the first time").

Thanks.



Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

2017-06-15 Thread Johannes Schindelin
Hi,

On Thu, 15 Jun 2017, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jun 15 2017, Jeff King jotted:
> 
> > On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote:
> >
> >> On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote:
> >>
> >> > Footnote *1*: SHA-256, as all hash functions whose output is
> >> > essentially the entire internal state, are susceptible to a
> >> > so-called "length extension attack", where the hash of a
> >> > secret+message can be used to generate the hash of
> >> > secret+message+piggyback without knowing the secret.  This is not
> >> > the case for Git: only visible data are hashed. The type of attacks
> >> > Git has to worry about is very different from the length extension
> >> > attacks, and it is highly unlikely that that weakness of SHA-256
> >> > leads to, say, a collision attack.
> >>
> >> What do the experts think or SHA512/256, which completely removes the
> >> concerns over length extension attack? (which I'd argue is better than
> >> sweeping them under the carpet)
> >
> > I don't think it's sweeping them under the carpet. Git does not use the
> > hash as a MAC, so length extension attacks aren't a thing (and even if
> > we later wanted to use the same algorithm as a MAC, the HMAC
> > construction is a well-studied technique for dealing with it).

I really tried to drive that point home, as it had been made very clear to
me that the length extension attack is something that Git need not concern
itself.

The length extension attack *only* comes into play when there are secrets
that are hashed. In that case, one would not want others to be able to
produce a valid hash *without* knowing the secrets. And SHA-256 allows to
"reconstruct" the internal state (which is the hash value) in order to
continue at any point, i.e. if the hash for secret+message is known, it is
easy to calculate the hash for secret+message+addition, without knowing
the secret at all.

That is exactly *not* the case with Git. In Git, what we want to hash is
known in its entirety. If the hash value were not identical to the
internal state, it would be easy enough to reconstruct, because *there are
no secrets*.

So please understand that even the direction that the length extension
attack takes is completely different than the direction any attack would
have to take that weakens SHA-256 for Git's purposes. As far as Git's
usage is concerned, SHA-256 has no known weaknesses.

It is *really, really, really* important to understand this before going
on to suggest another hash function such as SHA-512/256 (i.e. SHA-512
truncated to 256 bits), based only on that perceived weakness of SHA-256.

> > That said, SHA-512 is typically a little faster than SHA-256 on 64-bit
> > platforms. I don't know if that will change with the advent of
> > hardware instructions oriented towards SHA-256.
> 
> Quoting my own
> cacbzzx7jra2niwt9wsgaxnzs+gws8htugzwm8nay1gs87o8...@mail.gmail.com sent
> ~2 weeks ago to the list:
> 
> On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder 
> wrote:
> [...]
> > 4. When choosing a hash function, people may argue about performance.
> >It would be useful for run some benchmarks for git (running
> >the test suite, t/perf tests, etc) using a variety of hash
> >functions as input to such a discussion.
> 
> To the extent that such benchmarks matter, it seems prudent to heavily
> weigh them in favor of whatever seems to be likely to be the more
> common hash function going forward, since those are likely to get
> faster through future hardware acceleration.
> 
> E.g. Intel announced Goldmont last year which according to one SHA-1
> implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They
> only have acceleration for SHA-1 and SHA-256[2]
> 
> 1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385
> 
> 2. https://en.wikipedia.org/wiki/Goldmont
> 
> Maybe someone else knows of better numbers / benchmarks, but such a
> reduction in CBP likely makes it faster than SHA-512.

Very, very likely faster than SHA-512.

I'd like to stress explicitly that the Intel SHA extensions do *not* cover
SHA-512:

https://en.wikipedia.org/wiki/Intel_SHA_extensions

In other words, once those extensions become commonplace, SHA-256 will be
faster than SHA-512, hands down.

Ciao,
Dscho

Re: [PATCH v4 6/6] Use the early config machinery to expand aliases

2017-06-15 Thread Junio C Hamano
Johannes Schindelin  writes:

> +struct config_alias_data
> +{

Style: "struct config_alias_data {"

which I can tweak while applying.

Other than that, everything looks good.

Thanks.


Re: [PATCH v4 6/6] Use the early config machinery to expand aliases

2017-06-15 Thread Johannes Schindelin
Hi Junio,

On Thu, 15 Jun 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > +struct config_alias_data
> > +{
> 
> Style: "struct config_alias_data {"
> 
> which I can tweak while applying.

Please do.

> Other than that, everything looks good.

Thanks,
Dscho


Re: [PATCH v3 0/6] config.h

2017-06-15 Thread Junio C Hamano
Brandon Williams  writes:

> Changes in v3:
>
> * tweaked the discover_git_directory function's API based on Peff's feedback
> * reordered the last three patches so that they flowed a bit better
> * renamed 'git_config_with_options'
> * rebased ontop of v4 of Dscho's alias series
>   
> https://public-inbox.org/git/cover.1497440104.git.johannes.schinde...@gmx.de/

Applying this series was messier than necessary, I'd have to say, as
this series would not apply cleanly on top of the result of applying
Dscho's v4 on top of the same base as Dscho's v3 was applied (which
is v2.13.0).  It applied cleanly only when Dscho's v4 and then this
series were applied on top of 02a2850a ("Sync with maint",
2017-06-13), which is a much newer commit than v2.13.0.

Which in turn means these two fixes cannot be merged to 'maint' as
you two collaboratively prepared.

I've applied Dscho's v4 on top of v2.13.0 (just like his v3 was
queued in my tree on the same base), and then tweaked this series to
apply on top of that, so that they can go to 'maint' if we chose to.

But it is possible that during this unnecessary patch shuffling, I
may have made some mistake, so please eyeball the resulting 12
patches carefully when they are pushed out.

Thanks.
>


Re: [PATCH] diff-highlight: split code into module

2017-06-15 Thread Scott Baker
On 06/15/2017 12:15 PM, Junio C Hamano wrote:
> Do you want +x bit for the last one?  I could throw that bit in
> while queuing if you want.
>
> Thanks.
>
Ya probably best.




Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes

2017-06-15 Thread Jeff Hostetler



On 6/2/2017 6:26 PM, Jeff King wrote:

On Fri, Jun 02, 2017 at 12:38:45PM -0700, Jonathan Tan wrote:

...

We have a name-hash cache extension in the bitmap file, but it doesn't
carry enough information to deduce the .git-ness of a file. I don't
think it would be too hard to add a "flags" extension, and give a single
bit to "this is a .git file".

I do also wonder if the two features would need to be separated for a
GVFS-style solution. If you're not just avoiding large blobs but trying
to get a narrow clone, you don't want the .git files from the
uninteresting parts of the tree. You want to get no blobs at all, and
then fault them in as they become relevant due to user action.

-Peff



I agree with Peff here.  I've been working on my partial/narrow/sparse
clone/fetch ideas since my original RFC and have come to the conclusion
that the server can do the size limiting efficiently, but we should
leave the pathname filtering to the client.  That is, let the client
get the commits and trees and then locally apply pattern matching,
whether that be a sparse-checkout definition or simple ".git*"
matching and then make a later request for the "blobs of interest".

Whether we "fault-in" the missing blobs or have a "fetch-blobs"
command (like fetch-pack) to get them is open to debate.

There are concerns about the size of the requested blob-id list in a
fetch-blobs approach, but I think there are ways to say I need all
of the blobs referenced by the directory /foo in commit  (and
optionally, that aren't present in directory /foo in commit 
or in the range ..).  (handwave)

Jeff


Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Jonathan Tan
A reroll is coming soon, but there is an interesting discussion point
here so I'll reply to this e-mail first.

On Thu, 15 Jun 2017 11:34:45 -0700
Junio C Hamano  wrote:

> Jonathan Tan  writes:
> 
> > +struct missing_blob_manifest {
> > +   struct missing_blob_manifest *next;
> > +   const char *data;
> > +};
> > +struct missing_blob_manifest *missing_blobs;
> > +int missing_blobs_initialized;
> 
> I do not think you meant to make these non-static.  The type of the
> former is not even visible to the outside world, and the latter is
> something that could be made into static to prepare_missing_blobs()
> function (unless and until you start allowing the missing-blobs
> manifest to be re-initialized).  Your ensure_configured() below
> seems to do the "static" right, on the other hand ;-).

Good catch - done.

> Do we expect that we will have only a handful of these missing blob
> manifests?  Each manifest seems to be efficiently looked-up with a
> binary search, but it makes me wonder if it is a good idea to
> consolidate these manifests into a single list of object names to
> eliminate the outer loop in has_missing_blob().  Unlike pack .idx
> files that must stay one-to-one with .pack files, it appears to me
> that there is no reason why we need to keep multiple ones separate
> for extended period of time (e.g. whenever we learn that we receieved
> an incomplete pack from the other side with a list of newly missing
> blobs, we could incorporate that into existing missing blob list).

There is indeed no reason why we need to keep multiple ones separate for
an extended period of time - my thinking was to let fetch/clone be fast
by not needing to scan through the entire existing manifest (in order to
create the new one), letting GC take care of consolidating them (since
it would have to check individual entries to delete those corresponding
to objects that have entered the repo through other means). But this is
at the expense of making the individual object lookups a bit slower.

For now, I'll leave the possibility of multiple files open while I try
to create a set of patches that can implement missing blob support from
fetch to day-to-day usage. But I am not opposed to changing it to a
single-file manifest.

> > +int has_missing_blob(const unsigned char *sha1, unsigned long *size)
> > +{
> 
> This function that answers "is it expected to be missing?" is
> confusingly named.  Is it missing, or does it exist?

Renamed to in_missing_blob_manifest().

> > @@ -2981,11 +3050,55 @@ static int sha1_loose_object_info(const unsigned 
> > char *sha1,
> > return (status < 0) ? status : 0;
> >  }
> >  
> > +static char *missing_blob_command;
> > +static int sha1_file_config(const char *conf_key, const char *value, void 
> > *cb)
> > +{
> > +   if (!strcmp(conf_key, "core.missingblobcommand")) {
> > +   missing_blob_command = xstrdup(value);
> > +   }
> > +   return 0;
> > +}
> > +
> > +static int configured;
> > +static void ensure_configured(void)
> > +{
> > +   if (configured)
> > +   return;
> 
> Do not be selfish and pretend that this is the _only_ kind of
> configuration that needs to be done inside sha1_file.c.  Call the
> function ensure__is_configured() and rename the run-once
> guard to match.

My thinking was that any additional configuration could be added to this
function, but individual configuration for each feature is fine too. I
have renamed things according to your suggestion.

> The run-once guard can be made static to the "ensure" function, and
> if you do so, then its name can stay to be "configured", as at that
> point it is clear what it is guarding.

Done.

> > +pack() {
> 
> Style: "pack () {"

Done.

> 
> > +   perl -e '$/ = undef; $input = <>; print pack("H*", $input)'
> 
> high-nybble first to match ntohll() done in has_missing_blob()?  OK.

Actually it's to match the printf behavior below that prints the high
nybble first (like in English).


Re: [PATCH v3 0/6] config.h

2017-06-15 Thread Brandon Williams
On 06/15, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > Changes in v3:
> >
> > * tweaked the discover_git_directory function's API based on Peff's feedback
> > * reordered the last three patches so that they flowed a bit better
> > * renamed 'git_config_with_options'
> > * rebased ontop of v4 of Dscho's alias series
> >   
> > https://public-inbox.org/git/cover.1497440104.git.johannes.schinde...@gmx.de/
> 
> Applying this series was messier than necessary, I'd have to say, as
> this series would not apply cleanly on top of the result of applying
> Dscho's v4 on top of the same base as Dscho's v3 was applied (which
> is v2.13.0).  It applied cleanly only when Dscho's v4 and then this
> series were applied on top of 02a2850a ("Sync with maint",
> 2017-06-13), which is a much newer commit than v2.13.0.
> 
> Which in turn means these two fixes cannot be merged to 'maint' as
> you two collaboratively prepared.
> 
> I've applied Dscho's v4 on top of v2.13.0 (just like his v3 was
> queued in my tree on the same base), and then tweaked this series to
> apply on top of that, so that they can go to 'maint' if we chose to.
> 
> But it is possible that during this unnecessary patch shuffling, I
> may have made some mistake, so please eyeball the resulting 12
> patches carefully when they are pushed out.

Ugh, I'm terribly sorry.  Completely my bad as I didn't consider what
you would need to do on your end.  When I built my patches on top of his
I naively just applied his v4 to what ever the current origin/master was
at that point in time.  I'll be sure to be more careful with this
next time.

-- 
Brandon Williams


[PATCH v3 1/4] sha1_file: teach packed_object_info about typename

2017-06-15 Thread Jonathan Tan
In commit 46f0344 ("sha1_file: support reading from a loose object of
unknown type", 2015-05-06), "struct object_info" gained a "typename"
field that could represent a type name from a loose object file, whether
valid or invalid, as opposed to the existing "typep" which could only
represent valid types. Some relatively complex manipulations were added
to avoid breaking packed_object_info() without modifying it, but it is
much easier to just teach packed_object_info() about the new field.
Therefore, teach packed_object_info() as described above.

Signed-off-by: Jonathan Tan 
---
 sha1_file.c | 29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 59a4ed2ed..a52b27541 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2277,9 +2277,18 @@ int packed_object_info(struct packed_git *p, off_t 
obj_offset,
*oi->disk_sizep = revidx[1].offset - obj_offset;
}
 
-   if (oi->typep) {
-   *oi->typep = packed_to_object_type(p, obj_offset, type, 
&w_curs, curpos);
-   if (*oi->typep < 0) {
+   if (oi->typep || oi->typename) {
+   enum object_type ptot;
+   ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
+curpos);
+   if (oi->typep)
+   *oi->typep = ptot;
+   if (oi->typename) {
+   const char *tn = typename(ptot);
+   if (tn)
+   strbuf_addstr(oi->typename, tn);
+   }
+   if (ptot < 0) {
type = OBJ_BAD;
goto out;
}
@@ -2960,7 +2969,6 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
struct cached_object *co;
struct pack_entry e;
int rtype;
-   enum object_type real_type;
const unsigned char *real = lookup_replace_object_extended(sha1, flags);
 
co = find_cached_object(real);
@@ -2992,18 +3000,9 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
return -1;
}
 
-   /*
-* packed_object_info() does not follow the delta chain to
-* find out the real type, unless it is given oi->typep.
-*/
-   if (oi->typename && !oi->typep)
-   oi->typep = &real_type;
-
rtype = packed_object_info(e.p, e.offset, oi);
if (rtype < 0) {
mark_bad_packed_object(e.p, real);
-   if (oi->typep == &real_type)
-   oi->typep = NULL;
return sha1_object_info_extended(real, oi, 0);
} else if (in_delta_base_cache(e.p, e.offset)) {
oi->whence = OI_DBCACHED;
@@ -3014,10 +3013,6 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
 rtype == OBJ_OFS_DELTA);
}
-   if (oi->typename)
-   strbuf_addstr(oi->typename, typename(*oi->typep));
-   if (oi->typep == &real_type)
-   oi->typep = NULL;
 
return 0;
 }
-- 
2.13.1.518.g3df882009-goog



[PATCH v3 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Jonathan Tan
Currently, Git does not support repos with very large numbers of blobs
or repos that wish to minimize manipulation of certain blobs (for
example, because they are very large) very well, even if the user
operates mostly on part of the repo, because Git is designed on the
assumption that every blob referenced by a tree object is available
somewhere in the repo storage.

As a first step to reducing this problem, add rudimentary support for
missing blobs by teaching sha1_file to invoke a hook whenever a blob is
requested and unavailable but registered to be missing, and by updating
fsck to tolerate such blobs.  The hook is a shell command that can be
configured through "git config"; this hook takes in a list of hashes and
writes (if successful) the corresponding objects to the repo's local
storage.

This commit does not include support for generating such a repo; neither
has any command (other than fsck) been modified to either tolerate
missing blobs (without invoking the hook) or be more efficient in
invoking the missing blob hook. Only a fallback is provided in the form
of sha1_file invoking the missing blob hook when necessary.

In order to determine the code changes in sha1_file.c necessary, I
investigated the following:
 (1) functions in sha1_file that take in a hash, without the user
 regarding how the object is stored (loose or packed)
 (2) functions in sha1_file that operate on packed objects (because I
 need to check callers that know about the loose/packed distinction
 and operate on both differently, and ensure that they can handle
 the concept of objects that are neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended() and
has_sha1_file_with_flags().

For (2), I looked through the same functions as in (1) and also
for_each_packed_object. The ones that are relevant are:
 - parse_pack_index
   - http - indirectly from http_get_info_packs
 - find_pack_entry_one
   - this searches a single pack that is provided as an argument; the
 caller already knows (through other means) that the sought object
 is in a specific pack
 - find_sha1_pack
   - fast-import - appears to be an optimization to not store a
 file if it is already in a pack
   - http-walker - to search through a struct alt_base
   - http-push - to search through remote packs
 - has_sha1_pack
   - builtin/fsck - fixed in this commit
   - builtin/count-objects - informational purposes only (check if loose
 object is also packed)
   - builtin/prune-packed - check if object to be pruned is packed (if
 not, don't prune it)
   - revision - used to exclude packed objects if requested by user
   - diff - just for optimization
 - for_each_packed_object
   - reachable - only to find recent objects
   - builtin/fsck - fixed in this commit
   - builtin/cat-file - see below

As described in the list above, builtin/fsck has been updated. I have
left builtin/cat-file alone; this means that cat-file
--batch-all-objects will only operate on objects physically in the repo.

An alternative design that I considered but rejected:

 - Adding a hook whenever a packed blob is requested, not on any blob.
   That is, whenever we attempt to search the packfiles for a blob, if
   it is missing (from the packfiles and from the loose object storage),
   to invoke the hook (which must then store it as a packfile), open the
   packfile the hook generated, and report that the blob is found in
   that new packfile. This reduces the amount of analysis needed (in
   that we only need to look at how packed blobs are handled), but
   requires that the hook generate packfiles (or for sha1_file to pack
   whatever loose objects are generated), creating one packfile for each
   missing blob and potentially very many packfiles that must be
   linearly searched. This may be tolerable now for repos that only have
   a few missing blobs (for example, repos that only want to exclude
   large blobs), and might be tolerable in the future if we have
   batching support for the most commonly used commands, but is not
   tolerable now for repos that exclude a large amount of blobs.

Signed-off-by: Jonathan Tan 
---
 Documentation/config.txt |  10 +++
 builtin/fsck.c   |   7 ++
 cache.h  |   7 ++
 sha1_file.c  | 171 +++
 t/t3907-missing-blob.sh  |  69 +++
 5 files changed, 250 insertions(+), 14 deletions(-)
 create mode 100755 t/t3907-missing-blob.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dd4beec39..10da5fde1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -390,6 +390,16 @@ The default is false, except linkgit:git-clone[1] or 
linkgit:git-init[1]
 will probe and set core.ignoreCase true if appropriate when the repository
 is created.
 
+core.missingBlobCommand::
+   If set, whenever a blob in the local repo is attempted to be
+   read but is missing, invoke this shel

[PATCH v3 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-15 Thread Jonathan Tan
In sha1_file.c, there are a few functions that provide information on an
object regardless of its storage (cached, loose, or packed). Looking
through all non-static functions in sha1_file.c that take in an unsigned
char * pointer, the relevant ones are:
 - sha1_object_info_extended
 - sha1_object_info (auto-fixed by sha1_object_info_extended)
 - read_sha1_file_extended (uses read_object)
 - read_object_with_reference (auto-fixed by read_sha1_file_extended)
 - has_sha1_file_with_flags
 - assert_sha1_type (auto-fixed by sha1_object_info)

Looking at the 3 primary functions (sha1_object_info_extended,
read_object, has_sha1_file_with_flags), they independently implement
mechanisms such as object replacement, retrying the packed store after
failing to find the object in the packed store then the loose store, and
being able to mark a packed object as bad and then retrying the whole
process. Consolidating these mechanisms would be a great help to
maintainability.

However, has_sha1_file_with_flags() does things that the other 2 don't
(skipping cached storage, allowing a "quick" mode that skips retrying
the packed storage after trying the loose storage, and refreshing any
loose files found).

Therefore, consolidate only the other 2 functions by extending
sha1_object_info_extended() to support the functionality needed, and
then modifying read_object() to use sha1_object_info_extended().

Signed-off-by: Jonathan Tan 
---
 cache.h |  1 +
 sha1_file.c | 84 ++---
 2 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/cache.h b/cache.h
index 4d92aae0e..63a73af17 100644
--- a/cache.h
+++ b/cache.h
@@ -1835,6 +1835,7 @@ struct object_info {
off_t *disk_sizep;
unsigned char *delta_base_sha1;
struct strbuf *typename;
+   void **contentp;
 
/* Response */
enum {
diff --git a/sha1_file.c b/sha1_file.c
index a38319443..60b487c70 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2005,19 +2005,6 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
return parse_sha1_header_extended(hdr, &oi, LOOKUP_REPLACE_OBJECT);
 }
 
-static void *unpack_sha1_file(void *map, unsigned long mapsize, enum 
object_type *type, unsigned long *size, const unsigned char *sha1)
-{
-   int ret;
-   git_zstream stream;
-   char hdr[8192];
-
-   ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
-   if (ret < Z_OK || (*type = parse_sha1_header(hdr, size)) < 0)
-   return NULL;
-
-   return unpack_sha1_rest(&stream, hdr, *size, sha1);
-}
-
 unsigned long get_size_from_delta(struct packed_git *p,
  struct pack_window **w_curs,
  off_t curpos)
@@ -2326,8 +2313,10 @@ static void *cache_or_unpack_entry(struct packed_git *p, 
off_t base_offset,
if (!ent)
return unpack_entry(p, base_offset, type, base_size);
 
-   *type = ent->type;
-   *base_size = ent->size;
+   if (type)
+   *type = ent->type;
+   if (base_size)
+   *base_size = ent->size;
return xmemdupz(ent->data, ent->size);
 }
 
@@ -2388,9 +2377,16 @@ int packed_object_info(struct packed_git *p, off_t 
obj_offset,
 * We always get the representation type, but only convert it to
 * a "real" type later if the caller is interested.
 */
-   type = unpack_object_header(p, &w_curs, &curpos, &size);
+   if (oi->contentp) {
+   *oi->contentp = cache_or_unpack_entry(p, obj_offset, oi->sizep,
+ &type);
+   if (!*oi->contentp)
+   type = OBJ_BAD;
+   } else {
+   type = unpack_object_header(p, &w_curs, &curpos, &size);
+   }
 
-   if (oi->sizep) {
+   if (!oi->contentp && oi->sizep) {
if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
off_t tmp_pos = curpos;
off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
@@ -2679,8 +2675,10 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
free(external_base);
}
 
-   *final_type = type;
-   *final_size = size;
+   if (final_type)
+   *final_type = type;
+   if (final_size)
+   *final_size = size;
 
unuse_pack(&w_curs);
 
@@ -2914,6 +2912,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
git_zstream stream;
char hdr[32];
struct strbuf hdrbuf = STRBUF_INIT;
+   unsigned long size_scratch;
 
if (oi->delta_base_sha1)
hashclr(oi->delta_base_sha1);
@@ -2926,7 +2925,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
 * return value implicitly indicates whether the
 * object even exists.
 */
-   if (!oi->typep && !oi->typename && !oi->sizep) {
+   i

[PATCH v3 0/4] Improvements to sha1_file

2017-06-15 Thread Jonathan Tan
Thanks - this has been updated following Junio's comments.

Patch 1 is unmodified from the previous version.

Patch 2 has been modified to remove the extraneous code pointed out by
Junio. I previously had an idea of populating those fields in
packed_object_info(), but later changed my mind, but a rebase went
wrong.

Patches 3-4 have been updated as I have described in [1] and [2].

[1] 
https://public-inbox.org/git/20170615111447.1208e...@twelve2.svl.corp.google.com/
[2] 
https://public-inbox.org/git/20170615111447.1208e...@twelve2.svl.corp.google.com/

As before, I would like review on patches 1-3 to go into the tree.
(Patch 4 is a work in progress, and is here just to demonstrate the
effectiveness of the refactoring.)

Jonathan Tan (4):
  sha1_file: teach packed_object_info about typename
  sha1_file: move delta base cache code up
  sha1_file: consolidate storage-agnostic object fns
  sha1_file, fsck: add missing blob support

 Documentation/config.txt |  10 +
 builtin/fsck.c   |   7 +
 cache.h  |   8 +
 sha1_file.c  | 474 ++-
 t/t3907-missing-blob.sh  |  69 +++
 5 files changed, 400 insertions(+), 168 deletions(-)
 create mode 100755 t/t3907-missing-blob.sh

-- 
2.13.1.518.g3df882009-goog



[PATCH v3 2/4] sha1_file: move delta base cache code up

2017-06-15 Thread Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the
delta base cache, so move the relevant code to before
packed_object_info().

Signed-off-by: Jonathan Tan 
---
 sha1_file.c | 220 ++--
 1 file changed, 110 insertions(+), 110 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index a52b27541..a38319443 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2239,116 +2239,6 @@ static enum object_type packed_to_object_type(struct 
packed_git *p,
goto out;
 }
 
-int packed_object_info(struct packed_git *p, off_t obj_offset,
-  struct object_info *oi)
-{
-   struct pack_window *w_curs = NULL;
-   unsigned long size;
-   off_t curpos = obj_offset;
-   enum object_type type;
-
-   /*
-* We always get the representation type, but only convert it to
-* a "real" type later if the caller is interested.
-*/
-   type = unpack_object_header(p, &w_curs, &curpos, &size);
-
-   if (oi->sizep) {
-   if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-   off_t tmp_pos = curpos;
-   off_t base_offset = get_delta_base(p, &w_curs, &tmp_pos,
-  type, obj_offset);
-   if (!base_offset) {
-   type = OBJ_BAD;
-   goto out;
-   }
-   *oi->sizep = get_size_from_delta(p, &w_curs, tmp_pos);
-   if (*oi->sizep == 0) {
-   type = OBJ_BAD;
-   goto out;
-   }
-   } else {
-   *oi->sizep = size;
-   }
-   }
-
-   if (oi->disk_sizep) {
-   struct revindex_entry *revidx = find_pack_revindex(p, 
obj_offset);
-   *oi->disk_sizep = revidx[1].offset - obj_offset;
-   }
-
-   if (oi->typep || oi->typename) {
-   enum object_type ptot;
-   ptot = packed_to_object_type(p, obj_offset, type, &w_curs,
-curpos);
-   if (oi->typep)
-   *oi->typep = ptot;
-   if (oi->typename) {
-   const char *tn = typename(ptot);
-   if (tn)
-   strbuf_addstr(oi->typename, tn);
-   }
-   if (ptot < 0) {
-   type = OBJ_BAD;
-   goto out;
-   }
-   }
-
-   if (oi->delta_base_sha1) {
-   if (type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA) {
-   const unsigned char *base;
-
-   base = get_delta_base_sha1(p, &w_curs, curpos,
-  type, obj_offset);
-   if (!base) {
-   type = OBJ_BAD;
-   goto out;
-   }
-
-   hashcpy(oi->delta_base_sha1, base);
-   } else
-   hashclr(oi->delta_base_sha1);
-   }
-
-out:
-   unuse_pack(&w_curs);
-   return type;
-}
-
-static void *unpack_compressed_entry(struct packed_git *p,
-   struct pack_window **w_curs,
-   off_t curpos,
-   unsigned long size)
-{
-   int st;
-   git_zstream stream;
-   unsigned char *buffer, *in;
-
-   buffer = xmallocz_gently(size);
-   if (!buffer)
-   return NULL;
-   memset(&stream, 0, sizeof(stream));
-   stream.next_out = buffer;
-   stream.avail_out = size + 1;
-
-   git_inflate_init(&stream);
-   do {
-   in = use_pack(p, w_curs, curpos, &stream.avail_in);
-   stream.next_in = in;
-   st = git_inflate(&stream, Z_FINISH);
-   if (!stream.avail_out)
-   break; /* the payload is larger than it should be */
-   curpos += stream.next_in - in;
-   } while (st == Z_OK || st == Z_BUF_ERROR);
-   git_inflate_end(&stream);
-   if ((st != Z_STREAM_END) || stream.total_out != size) {
-   free(buffer);
-   return NULL;
-   }
-
-   return buffer;
-}
-
 static struct hashmap delta_base_cache;
 static size_t delta_base_cached;
 
@@ -2486,6 +2376,116 @@ static void add_delta_base_cache(struct packed_git *p, 
off_t base_offset,
hashmap_add(&delta_base_cache, ent);
 }
 
+int packed_object_info(struct packed_git *p, off_t obj_offset,
+  struct object_info *oi)
+{
+   struct pack_window *w_curs = NULL;
+   unsigned long size;
+   off_t curpos = obj_offset;
+   enum object_type type;
+
+   /*
+* We always get the representation type, but only convert i

Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Junio C Hamano
Jonathan Tan  writes:

> There is indeed no reason why we need to keep multiple ones separate for
> an extended period of time - my thinking was to let fetch/clone be fast
> by not needing to scan through the entire existing manifest (in order to
> create the new one),  letting GC take care of consolidating them ...

Given that fetch/clone already incur network cost and the users
expect to wait for them to finish, I wouldn't have made such a
trade-off.

>> > +int has_missing_blob(const unsigned char *sha1, unsigned long *size)
>> > +{
>> 
>> This function that answers "is it expected to be missing?" is
>> confusingly named.  Is it missing, or does it exist?
>
> Renamed to in_missing_blob_manifest().

Either that, or "is_known_to_be_missing()", would be OK.

Thanks.


Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes

2017-06-15 Thread Jonathan Tan
On Thu, 15 Jun 2017 16:28:24 -0400
Jeff Hostetler  wrote:

> I agree with Peff here.  I've been working on my partial/narrow/sparse
> clone/fetch ideas since my original RFC and have come to the conclusion
> that the server can do the size limiting efficiently, but we should
> leave the pathname filtering to the client.  That is, let the client
> get the commits and trees and then locally apply pattern matching,
> whether that be a sparse-checkout definition or simple ".git*"
> matching and then make a later request for the "blobs of interest".

This means that the client would need to inflate all the trees it
received, but that might not be that bad.

> Whether we "fault-in" the missing blobs or have a "fetch-blobs"
> command (like fetch-pack) to get them is open to debate.
> 
> There are concerns about the size of the requested blob-id list in a
> fetch-blobs approach, but I think there are ways to say I need all
> of the blobs referenced by the directory /foo in commit  (and
> optionally, that aren't present in directory /foo in commit 
> or in the range ..).  (handwave)

Unless the server no longer has the relevant commit (e.g. because a
branch was rewound), but even then, even if we only sent blob hashes,
the list would be probably much smaller than the downloaded pack anyway,
so things might still be OK.


[PATCH v3 0/2] Add a FREE_AND_NULL() wrapper macro

2017-06-15 Thread Ævar Arnfjörð Bjarmason
On Thu, Jun 15 2017, Ævar Arnfjörð Bjarmason jotted:
> I'll change it to FREE_AND_NULL and submit my patch as-is, my reading
> of the rest of this thread is that making it a function instead of a
> macro would be interesting, but has its own caveats that are likely
> better considered as part of its own series, whereas this just changes
> existing code to its macro-expanded functional equivalent.

Here's v3 with that change. Nothing but the macro name (and comments,
commit messages etc. referring to it) have changed.

Ævar Arnfjörð Bjarmason (2):
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr =
NULL
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro

 alias.c  |  6 ++
 apply.c  |  3 +--
 attr.c   |  6 ++
 blame.c  |  3 +--
 branch.c |  3 +--
 builtin/am.c | 18 +-
 builtin/clean.c  |  6 ++
 builtin/config.c |  6 ++
 builtin/index-pack.c |  6 ++
 builtin/pack-objects.c   | 12 
 builtin/unpack-objects.c |  3 +--
 builtin/worktree.c   |  6 ++
 commit-slab.h|  3 +--
 commit.c |  3 +--
 config.c |  3 +--
 credential.c |  9 +++--
 diff-lib.c   |  3 +--
 diff.c   |  6 ++
 diffcore-rename.c|  6 ++
 dir.c|  9 +++--
 fast-import.c|  6 ++
 git-compat-util.h|  6 ++
 gpg-interface.c  | 15 +--
 grep.c   | 12 
 help.c   |  3 +--
 http-push.c  | 24 
 http.c   | 15 +--
 imap-send.c  |  3 +--
 line-log.c   |  6 ++
 ll-merge.c   |  3 +--
 mailinfo.c   |  3 +--
 object.c |  3 +--
 pathspec.c   |  3 +--
 prio-queue.c |  3 +--
 read-cache.c |  6 ++
 ref-filter.c |  3 +--
 refs/files-backend.c |  3 +--
 refs/ref-cache.c |  3 +--
 remote-testsvn.c |  3 +--
 rerere.c |  3 +--
 sequencer.c  |  3 +--
 sha1-array.c |  3 +--
 sha1_file.c  |  3 +--
 split-index.c|  3 +--
 transport-helper.c   | 27 +--
 transport.c  |  3 +--
 tree-diff.c  |  6 ++
 tree-walk.c  |  3 +--
 tree.c   |  3 +--
 49 files changed, 103 insertions(+), 197 deletions(-)

-- 
2.13.1.508.gb3defc5cc



[PATCH v3 1/2] git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL

2017-06-15 Thread Ævar Arnfjörð Bjarmason
Add a FREE_AND_NULL() wrapper marco for the common pattern of freeing
a pointer and assigning NULL to it right afterwards.

The implementation is similar to the (currently unused) XDL_PTRFREE
macro in xdiff/xmacros.h added in commit 3443546f6e ("Use a *real*
built-in diff generator", 2006-03-24). The only difference is that
free() is called unconditionally, see [1].

See [2] for a suggested alternative which does this via a function
instead of a macro. As covered in replies to that message, while it's
a viable approach, it would introduce caveats which this approach
doesn't have, so that potential change is left to a future follow-up
change.

This merely allows us to translate exactly what we're doing now to a
less verbose & idiomatic form using a macro, while guaranteeing that
we don't introduce any functional changes.

1. 
   
(http://public-inbox.org/git/alpine.DEB.2.20.1608301948310.129229@virtualbox/)

2. <20170610032143.GA7880@starla>
   (https://public-inbox.org/git/20170610032143.GA7880@starla/)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 git-compat-util.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 51ba4e6b3b..047172d173 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -808,6 +808,12 @@ extern char *xgetcwd(void);
 extern FILE *fopen_for_writing(const char *path);
 extern FILE *fopen_or_warn(const char *path, const char *mode);
 
+/*
+ * FREE_AND_NULL(ptr) is like free(ptr) followed by ptr = NULL. Note
+ * that ptr is used twice, so don't pass e.g. ptr++.
+ */
+#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)
+
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), 
(alloc)))
 
-- 
2.13.1.508.gb3defc5cc



[PATCH v3 2/2] *.[ch] refactoring: make use of the FREE_AND_NULL() macro

2017-06-15 Thread Ævar Arnfjörð Bjarmason
Replace occurrences of `free(ptr); ptr = NULL` with
`FREE_AND_NULL(ptr)`. This introduces no functional changes, but
reduces the line count and establishes this pattern as a common idiom
with a wrapper macro.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 alias.c  |  6 ++
 apply.c  |  3 +--
 attr.c   |  6 ++
 blame.c  |  3 +--
 branch.c |  3 +--
 builtin/am.c | 18 +-
 builtin/clean.c  |  6 ++
 builtin/config.c |  6 ++
 builtin/index-pack.c |  6 ++
 builtin/pack-objects.c   | 12 
 builtin/unpack-objects.c |  3 +--
 builtin/worktree.c   |  6 ++
 commit-slab.h|  3 +--
 commit.c |  3 +--
 config.c |  3 +--
 credential.c |  9 +++--
 diff-lib.c   |  3 +--
 diff.c   |  6 ++
 diffcore-rename.c|  6 ++
 dir.c|  9 +++--
 fast-import.c|  6 ++
 gpg-interface.c  | 15 +--
 grep.c   | 12 
 help.c   |  3 +--
 http-push.c  | 24 
 http.c   | 15 +--
 imap-send.c  |  3 +--
 line-log.c   |  6 ++
 ll-merge.c   |  3 +--
 mailinfo.c   |  3 +--
 object.c |  3 +--
 pathspec.c   |  3 +--
 prio-queue.c |  3 +--
 read-cache.c |  6 ++
 ref-filter.c |  3 +--
 refs/files-backend.c |  3 +--
 refs/ref-cache.c |  3 +--
 remote-testsvn.c |  3 +--
 rerere.c |  3 +--
 sequencer.c  |  3 +--
 sha1-array.c |  3 +--
 sha1_file.c  |  3 +--
 split-index.c|  3 +--
 transport-helper.c   | 27 +--
 transport.c  |  3 +--
 tree-diff.c  |  6 ++
 tree-walk.c  |  3 +--
 tree.c   |  3 +--
 48 files changed, 97 insertions(+), 197 deletions(-)

diff --git a/alias.c b/alias.c
index 3b90397a99..911481855f 100644
--- a/alias.c
+++ b/alias.c
@@ -47,8 +47,7 @@ int split_cmdline(char *cmdline, const char ***argv)
src++;
c = cmdline[src];
if (!c) {
-   free(*argv);
-   *argv = NULL;
+   FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_BAD_ENDING;
}
}
@@ -60,8 +59,7 @@ int split_cmdline(char *cmdline, const char ***argv)
cmdline[dst] = 0;
 
if (quoted) {
-   free(*argv);
-   *argv = NULL;
+   FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_UNCLOSED_QUOTE;
}
 
diff --git a/apply.c b/apply.c
index 854faa6779..e78de0affa 100644
--- a/apply.c
+++ b/apply.c
@@ -3705,8 +3705,7 @@ static int check_preimage(struct apply_state *state,
  is_new:
patch->is_new = 1;
patch->is_delete = 0;
-   free(patch->old_name);
-   patch->old_name = NULL;
+   FREE_AND_NULL(patch->old_name);
return 0;
 }
 
diff --git a/attr.c b/attr.c
index 821203e2a9..ebdcfb0b8a 100644
--- a/attr.c
+++ b/attr.c
@@ -638,13 +638,11 @@ void attr_check_reset(struct attr_check *check)
 
 void attr_check_clear(struct attr_check *check)
 {
-   free(check->items);
-   check->items = NULL;
+   FREE_AND_NULL(check->items);
check->alloc = 0;
check->nr = 0;
 
-   free(check->all_attrs);
-   check->all_attrs = NULL;
+   FREE_AND_NULL(check->all_attrs);
check->all_attrs_nr = 0;
 
drop_attr_stack(&check->stack);
diff --git a/blame.c b/blame.c
index 843c845cba..1183943960 100644
--- a/blame.c
+++ b/blame.c
@@ -314,8 +314,7 @@ static void fill_origin_blob(struct diff_options *opt,
 static void drop_origin_blob(struct blame_origin *o)
 {
if (o->file.ptr) {
-   free(o->file.ptr);
-   o->file.ptr = NULL;
+   FREE_AND_NULL(o->file.ptr);
}
 }
 
diff --git a/branch.c b/branch.c
index 985316eb76..2347cb8649 100644
--- a/branch.c
+++ b/branch.c
@@ -24,8 +24,7 @@ static int find_tracked_branch(struct remote *remote, void 
*priv)
} else {
free(tracking->spec.src);
if (tracking->src) {
-   free(tracking->src);
-   tracking->src = NULL;
+   FREE_AND_NULL(tracking->src);
}
}
tracking->spec.src = NULL;
diff --git a/builtin/am.c b/builtin/am.c
index 8881d73615..80368b6fe6 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -483,8 +48

  1   2   >