Re: Is there any way to "interrupt" a rebase?

2018-02-19 Thread Jeff King
On Mon, Feb 19, 2018 at 11:35:25AM -0800, Hilco Wijbenga wrote:

> git rebase --onto base-branch HEAD~7
> commit A --> conflicts
> ... lots of work ...
> commit B --> conflicts
> ... lots of work ...
> commit C (Git handles conflicts)
> commit D (no conflict)
> commit E --> conflicts
> ... er, that should have been fixed in commit C
> 
> How do I keep all the work I did for commits A and B? I get the
> impression that rerere does not help here because I did not finish the
> rebase succesfully (and that makes perfect sense, of course). Is there
> a way at this point in the rebase to "go back" to commit C (so without
> "git rebase --abort")?

rerere should help, since the resolutions are stored when you run "git
commit". Of course, sometimes there are changes outside of the
conflicted regions that are necessary.

One thing you can do is to repeat the rebase, but simply pick the
resolved state at each step. E.g.:

  # remember the failed attempt; you could also store the sha1 in an
  # environment variable, or even store individual commits.
  git tag failed

  # now abort and retry the rebase
  git rebase --abort
  git rebase -i

  # we should have stopped on commit A with conflicts. Now we can say
  # "make the tree just like the other time we saw A".
  git checkout failed~4 -- .

  # or if you want to review the changes, try this:
  git checkout -p failed~4

And so on. You visit each commit as before, but you can always grab your
previous work (or parts of it, with pathspec limiting or "-p") instead
of repeating the work.

I often do something like this for complicated merges (e.g., of two
long-running branches). I try the merge first, only to find that some
preparatory steps would make it a lot easier. So I commit the merged
result (even sometimes half-finished), "reset --hard" back to the
original, do the early steps, and then re-merge. And then I "checkout
-p" to pick my work out of the earlier failed attempt.

> (Surely, it's not as simple as doing a "git reset --hard
> sha-of-commit-C" is it?)

Not quite that simple, but that's another approach. If you "git reset
--hard" you'll lose D and E, since the rebase has already applied them.
But you could then replay them manually, like this:

  # go back to C and fix it
  git reset --hard C
  fix fix fix
  git commit --amend -m 'fixed C'

  # now replay the other commits on top
  git cherry-pick D
  git cherry-pick E

  # and then continue on with any other rebased patches
  git rebase --continue

-Peff


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-02-19 Thread Igor Djordjevic
Hi Sergey,

On 16/02/2018 14:08, Sergey Organov wrote:
> 
> By accepting the challenges raised in recent discussion of advanced
> support for history rebasing and editing in Git, I hopefully figured out
> a clean and elegant method of rebasing merges that I think is "The Right
> Way (TM)" to perform this so far troublesome operation. ["(TM)" here has
> second meaning: a "Trivial Merge (TM)", see below.]
> 
> Let me begin by outlining the method in git terms, and special thanks
> here must go to "Johannes Sixt"  for his original bright
> idea to use "cherry-pick -m1" to rebase merge commits.
> 
> End of preface -- here we go.
> 
> Given 2 original branches, b1 and b2, and a merge commit M that joins
> them, suppose we've already rebased b1 to b1', and b2 to b2'. Suppose
> also that B1' and B2' happen to be the tip commits on b1' and b2',
> respectively.
> 
> To produce merge commit M' that joins b1' and b2', the following
> operations will suffice:
> 
> 1. Checkout b2' and cherry-pick -m2 M, to produce U2' (and new b2').
> 2. Checkout b1' and cherry-pick -m1 M, to produce U1' (and new b1').
> 3. Merge --no-ff new b2' to current new b1', to produce UM'.
> 4. Get rid of U1' and U2' by re-writing parent references of UM' from
>U1' and U2' to  B1' and B2', respectively, to produce M'.
> 5. Mission complete.
> 
> Let's now see why and how the method actually works.
> 
> Firs off, let me introduce you to my new friend, the Trivial Merge, or
> (TM) for short. By definition, (TM) is a merge that introduces
> absolutely no differences to the sides of the merge. (I also like to
> sometimes call him "Angel Merge", both as the most beautiful of all
> merges, and as direct antithesis to "evil merge".)
> 
> One very nice thing about (TM) is that to safely rebase it, it suffices
> to merge its (rebased) parents. It is safe in this case, as (TM) itself
> doesn't posses any content changes, and thus none could be missed by
> replacing it with another merge commit.
> 
> I bet most of us have never seen (TM) in practice though, so let's see
> how (TM) can help us handle general case of some random merge. What I'm
> going to do is to create a virtual (TM) and see how it goes from there.
> 
> Let's start with this history:
> 
>   M
>  / \
> B1  B2
> 
> And let's transform it to the following one, contextually equivalent to
> the original, by introducing 2 simple utility commits U1 and U2, and a
> new utility merge commit UM:
> 
>   UM
>  /  \
> U1   U2
> ||
> B1   B2
> 
> Here content of any of the created UM, U1, and U2 is the same, and is
> exact copy of original content of M. I.e., provided [A] denotes
> "content of commit A", we have:
> 
> [UM] = [U1] = [U2] = [M]
> 
> Stress again how these changes to the history preserve the exact content
> of the original merge ([UM] = [M]), and how U1 an U2 represent content
> changes due to merge on either side[*], and how neither preceding nor
> subsequent commits content would be affected by the change of
> representation.
> 
> Now observe that as [U1] = [UM], and [U2] = [UM], the UM happens to be
> exactly our new friend -- the "Trivial Merge (TM)" his true self,
> introducing zero changes to content.
> 
> Next we rebase our new representation of the history and we get:
> 
>   UM'
>  /  \
> U1'  U2'
> ||
> B1'  B2'
> 
> Here UM' is bare merge of U1' and U2', in exact accordance with the
> method of rebasing a (TM) we've already discussed above, and U1' and U2'
> are rebased versions of U1 and U2, obtained by usual rebasing methods
> for non-merge commits.
> 
> (Note, however, that at this point UM' is not necessarily a (TM)
> anymore, so in real implementation it may make sense to check if UM' is
> not a (TM) and stop for possible user amendment.)
> 
> Finally, to get to our required merge commit M', we get the content of
> UM' and record two actual parents of the merge:
> 
>   M'
>  / \
> B1' B2'
> 
> Where [M'] = [UM'].
> 
> That's it. Mission complete.
> 
> I expect the method to have the following nice features:
> 
> - it carefully preserves user changes by rebasing the merge commit
> itself, in a way that is semantically similar to rebasing simple
> (non-merge) commits, yet it allows changes made to branches during
> history editing to propagate over corresponding merge commit that joins
> the branches, even automatically when the changes don't conflict, as
> expected.
> 
> - it has provision for detection of even slightest chances of ending up
> with surprising merge (just check if UM' is still (TM)), so that
> implementation could stop for user inspection and amendment when
> appropriate, yet it is capable of handling trivial cases smoothly and
> automatically.
> 
> - it never falls back to simple invocation of merge operation on rebased
> original branches themselves, thus avoiding the problem of lack of
> knowledge of how the merge at hand has been performed in the first
> place. It doesn't prevent implementation from letting user to manually
> 

Re: Is there any way to "interrupt" a rebase?

2018-02-19 Thread Hilco Wijbenga
On Mon, Feb 19, 2018 at 2:36 PM, brian m. carlson
 wrote:
> On Mon, Feb 19, 2018 at 11:35:25AM -0800, Hilco Wijbenga wrote:
>> So a scenario like this:
>>
>> my-branch : X -> A -> B -> C -> D -> E -> F -> G
>> base-branch : X -> Y
>>
>> git rebase --onto base-branch HEAD~7
>> commit A --> conflicts
>> ... lots of work ...
>> commit B --> conflicts
>> ... lots of work ...
>> commit C (Git handles conflicts)
>> commit D (no conflict)
>> commit E --> conflicts
>> ... er, that should have been fixed in commit C
>>
>> How do I keep all the work I did for commits A and B? I get the
>> impression that rerere does not help here because I did not finish the
>> rebase succesfully (and that makes perfect sense, of course). Is there
>> a way at this point in the rebase to "go back" to commit C (so without
>> "git rebase --abort")?
>
> What I do in this case is I unstage all the changes from the index, make
> the change that should have gone into commit C, use git commit --fixup
> (or --squash), and then restage the rest of the changes and continue
> with the rebase.  I can then use git rebase -i --autosquash afterwards
> to insert the commit into the right place.

Yes, that's essentially what I end up doing too. Obviously, in cases
like this, Murphy likes to drop by so commit D will have made changes
to the same files as commit C and you can't cleanly move the fix-up
commit to commit C. :-( I had hoped there might be an easier/cleaner
way to do it.


Re: Git should preserve modification times at least on request

2018-02-19 Thread Theodore Ts'o
On Mon, Feb 19, 2018 at 11:08:19PM +0100, Peter Backes wrote:
> Is thetre some existing code that could be used? I think I read 
> somewhere that git once did preserve mtimes, but that this code was 
> removed because of the build tool issues. Perhaps that code could 
> simply be put back in, and surrounded by conditions.

I don't believe that was ever true, because the mod times is simply
not *stored* anywhere.

You might want to consider trying to implement it as hook scripts
first, and see how well/poorly it works for you.  I do have a use
case, which is to maintain the timestamps for guilt (a quilt-like
patch management system which uses git).  At the moment I just use a
manual script, save-timestamps, which looks like this:

#!/bin/sh
stat -c "touch -d @%Y %n" * | sort -k 3 | grep -v "~$" | sort -k3 > timestamps

and then I just include the timestamps file in thhe commit.  When I
unpack the file elsewhere, I just run the command ". timestamps", or
if I am manually editing a single file, I might do:

grep file-name-of-patch timestamps | sht

This works because the timestamps file has lines which look like
this:

touch -d @1519007593 jbd2-clarify-recovery-checksum-error-msg

I've been too lazy to automate this using a "pre-commit" and
"post-checkout" hook, but it *really* wouldn't be that hard.  Right
now it also only works for files in the top-level of the repo, which
is all I have in my guilt patch repo.  Making this work in a
multiple-directory environment is also left as an exercise to the
reader.  :-)

Cheers,

- Ted

P.S.  Also left to the reader is making it work on legacy OS's like
Windows.  :-)


Re: Fetch-hooks

2018-02-19 Thread Jacob Keller
On Mon, Feb 19, 2018 at 2:50 PM, Leo Gaspard  wrote:
> On 02/19/2018 10:23 PM, Jeff King wrote:
>> [...]
>> If you do go this route, please model it after "pre-receive" rather than
>> "update". We had "update" originally but found it was too limiting for
>> hooks to see only one ref at a time. So we introduced pre-receive. The
>> "update" hook remains for historical reasons, but I don't think we'd
>> want to reproduce the mistake. :)
>
> Hmm, what bothered me with “pre-receive” was that it was an
> all-or-nothing decision, without the ability to allow some references
> through and not others.
>
> Is there a way for “pre-receive” to individually filter hooks? I was
> under the impression that the only way to do that was to use the
> “update” hook, which was the reason I wanted to model it after “update”
> rather than “pre-receive” (my use case being a check independent for
> each pushed ref)

At least in the "push" case I think there is value in saying "if one
ref fails, please fail the entire push, always". This makes sense,
because if a user happens to violate some rule in one thing, they very
likely may not wish any of the push to succeed, and it also creates
problems with whether a push is atomic or not.

For fetch, I could see either method being useful.

Thanks,
Jake


Re: Git should preserve modification times at least on request

2018-02-19 Thread Hilco Wijbenga
On Mon, Feb 19, 2018 at 2:37 PM, Randall S. Becker
 wrote:
> On February 19, 2018 4:58 PM Johannes wrote:
>> On Mon, 19 Feb 2018, Peter Backes wrote:
>>
>> > please ensure to CC me if you reply as I am not subscribed to the list.
>> >
>> > https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preservi
>> > ng_modification_time_on_files.3F argues that git isn't preserving
>> > modification times because it needs to ensure that build tools work
>> > properly.
>> >
>> > I agree that modification times should not be restored by default,
>> > because of the principle of least astonishment. But should it be
>> > impossible? The principle of least astonishment does not mandate this;
>> > it is not a paternalistic principle.
>> >
>> > Thus, I do not get at all
>> > - why git doesn't *store* modification times, perhaps by default, but
>> > at least on request
>> > - why git doesn't restore modification times *on request*
>> >
>> > It is pretty annoying that git cannot, even if I know what I am doing,
>> > and explicitly want it to, preserve the modification time.
>> >
>> > One use case: I have lots of file lying around in my build directory
>> > and for some of them, the modification time in important information
>> > to me. Those files are not at all used with the build tool. In
>> > contrast to git pull, git pull --rebase needs those to be stashed. But
>> > after the pull and unstash, the mtime is gone. Boo.
>> >
>> > Please provide options to store and restore modification times. It
>> > shouldn't be hard to do, given that other metadata such as the mode is
>> > already stored. It would make live so much easier. And the fact that
>> > this has made into the FAQ clearly suggests that there are many others
>> > who think so.
>>
>> Since you already assessed that it shouldn't be hard to do, you probably
>> want to put your money where your mouth is and come up with a patch, and
>> then offer it up for discussion on this here mailing list.
>
> Putting my large-production-user hat on, there are (at least) three
> conditions that exist in this space:
>
> 1. Build systems - this typically need the file modification time to be set
> to the time at which git touches a file (e.g., checkout). This permits build
> systems to detect that files are modified (even if an older version is
> checked out, make, for example, still needs to see the change to initiate a
> build. My understanding is that current git behaviour is modeled on this use
> case.
>
> 2. Commit linkage - in some environments, files that are checked out are set
> to the timestamp of the commit rather than the original file time or the
> checkout time. This permits a faster production resolution of when changes
> were run through the system as a group. I have implemented this strategy
> (somewhat grudgingly) in a few places. It is a possible desire for some
> users. I particularly dislike this approach because merge/cherry-pick/rebase
> can mess with the preceptive "when" of a change and if you are going to do
> this, make sure that your metadata is suitably managed.
>
> 3. Original file times - as Peter asked, storing the original file time has
> some legacy advantages. This emulates the behaviour of some legacy SCM
> systems and makes people feel better about things. From an audit point of
> view, this has value for systems other than git. In git, you use the
> hash-object to figure out what the file really is, so there is no real audit
> need anymore for timestamps, which can be spoofed at whim anyway. The
> hash-object comment applies to 2 also. Same comment here for dealing with
> non-touching but modifying. For example: what is the timestamp on a
> merge-squash? I would contend that it is the time of the merge-squash, not
> the original time. It could also be an interim term, when a conflict was
> resolved.
>
> Just remember that #2 and #3 break #1, unless you essentially rebuild from
> scratch in every build (ant/maven models). With that said, I seen many repo
> admins who want all of the above, so making them all available would make
> their lives easier.

Aside from exactly which modification times should be used (which I
would love to have a bit more control over as well), something else
I'd like to see is that, when switching between branches, files that
are the same on both branches should not have their modification time
changed.


NEED TOOLS

2018-02-19 Thread sophie_rodrigo
hhh


Re: Fetch-hooks

2018-02-19 Thread Jeff King
On Mon, Feb 19, 2018 at 11:50:37PM +0100, Leo Gaspard wrote:

> On 02/19/2018 10:23 PM, Jeff King wrote:
> > [...]
> > If you do go this route, please model it after "pre-receive" rather than
> > "update". We had "update" originally but found it was too limiting for
> > hooks to see only one ref at a time. So we introduced pre-receive. The
> > "update" hook remains for historical reasons, but I don't think we'd
> > want to reproduce the mistake. :)
> 
> Hmm, what bothered me with “pre-receive” was that it was an
> all-or-nothing decision, without the ability to allow some references
> through and not others.
> 
> Is there a way for “pre-receive” to individually filter hooks? I was
> under the impression that the only way to do that was to use the
> “update” hook, which was the reason I wanted to model it after “update”
> rather than “pre-receive” (my use case being a check independent for
> each pushed ref)

No, pre-receive is always atomic. However, that may actually be what you
want, if the point is to keep untrusted data out of the repository. By
rejecting the whole thing, we could in theory keep the objects from
entering the repository at all. This is how the push side works via the
"quarantine" system (which is explained in git-receive-pack(1)).
Fetching doesn't currently quarantine objects, but it probably wouldn't
be very hard to make it do so.

-Peff


[no subject]

2018-02-19 Thread Alfred Cheuk Yu Chow




Good Day,

This is the second time i am sending you this mail.

I am Mr. Alfred Cheuk Yu Chow, the Director for Credit & Marketing Chong
Hing Bank, Hong Kong, need your alliance in a deal that will be of  
mutual benefit.


Email me personally for more details.

Regards.







[PATCH v1 3/3] add -p: optimize line selection for short hunks

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

If there are fewer than ten changes in a hunk then make spaces
optional when selecting individual lines. This means that for short
hunks one can just type -357 to stage lines 1, 2, 3, 5 & 7.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 30 ++
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
0e3960b1ecf004bff51d28d540f685a5dc91fad1..3d9720af03eb113c7f3d2b73f17b4b51a7685bf3
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1067,6 +1067,33 @@ sub check_hunk_label {
return 1;
 }
 
+sub split_hunk_selection {
+   local $_;
+   my @fields = @_;
+   my @ret;
+   for (@fields) {
+   if (/^(-[0-9])(.*)/) {
+   push @ret, $1;
+   $_ = $2;
+   }
+   while ($_ ne '') {
+   if (/^[0-9]-$/) {
+   push @ret, $_;
+   last;
+   } elsif (/^([0-9](?:-[0-9])?)(.*)/) {
+   push @ret, $1;
+   $_ = $2;
+   } else {
+   error_msg sprintf
+   __("invalid hunk line '%s'\n"),
+   substr($_, 0, 1);
+   return ();
+   }
+   }
+   }
+   return @ret;
+}
+
 sub parse_hunk_selection {
local $_;
my ($hunk, $line) = @_;
@@ -1085,6 +1112,9 @@ sub parse_hunk_selection {
}
}
}
+   if ($max_label < 10) {
+   @fields = split_hunk_selection(@fields) or return undef;
+   }
for (@fields) {
if (/^([0-9]*)-([0-9]*)$/) {
if ($1 eq '' and $2 eq '') {
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
4ae706fd121f157e9cbd93ec293f45ce2a3a53b5..c6d847dc495c92782e37ef7b0e2800d7936aabd7
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -390,7 +390,7 @@ test_expect_success 'setup expected diff' '
 '
 
 test_expect_success 'can reset individual lines of patch' '
-   printf "%s\n" l "^1 3" |
+   printf "%s\n" l ^13 |
EDITOR=: git reset -p 2>error &&
test_must_be_empty error &&
git diff --cached HEAD | sed /^index/d >actual &&
-- 
2.16.1



[PATCH v1 2/3] add -p: allow line selection to be inverted

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

If the list of lines to be selected begins with '^' select all the
lines except the ones listed.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 15 ++-
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
8a33796e1f6a564d0a27ba06c216dbb9848827b9..0e3960b1ecf004bff51d28d540f685a5dc91fad1
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1070,9 +1070,21 @@ sub check_hunk_label {
 sub parse_hunk_selection {
local $_;
my ($hunk, $line) = @_;
-   my $max_label = $hunk->{MAX_LABEL};
+   my ($max_label, $invert) = ($hunk->{MAX_LABEL}, undef);
my @selected = (0) x ($max_label + 1);
my @fields = split(/[,\s]+/, $line);
+   if ($fields[0] =~ /^\^(.*)/) {
+   $invert = 1;
+   if ($1 ne '') {
+   $fields[0] = $1;
+   } else {
+   shift @fields;
+   unless (@fields) {
+   error_msg __("no lines to invert\n");
+   return undef;
+   }
+   }
+   }
for (@fields) {
if (/^([0-9]*)-([0-9]*)$/) {
if ($1 eq '' and $2 eq '') {
@@ -1093,6 +1105,7 @@ sub parse_hunk_selection {
return undef;
}
}
+   $invert and @selected = map { !$_ } @selected;
return \@selected;
 }
 
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
caa80327c461785949eb2b9c919c253f4bef72cc..4ae706fd121f157e9cbd93ec293f45ce2a3a53b5
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -390,7 +390,7 @@ test_expect_success 'setup expected diff' '
 '
 
 test_expect_success 'can reset individual lines of patch' '
-   printf "%s\n" l 2 |
+   printf "%s\n" l "^1 3" |
EDITOR=: git reset -p 2>error &&
test_must_be_empty error &&
git diff --cached HEAD | sed /^index/d >actual &&
-- 
2.16.1



Re: [PATCH v1 0/3] add -p: select individual hunk lines

2018-02-19 Thread Gustavo Leite
2018-02-19 8:36 GMT-03:00 Phillip Wood :
>
> "When I end up editing hunks it is almost always because I want to
> stage a subset of the lines in the hunk. Doing this by editing the
> hunk is inconvenient and error prone (especially so if the patch is
> going to be reversed before being applied). Instead offer an option
> for add -p to stage individual lines. When the user presses 'l' the
> hunk is redrawn with labels by the insertions and deletions and they
> are prompted to enter a list of the lines they wish to stage. Ranges
> of lines may be specified using 'a-b' where either 'a' or 'b' may be
> omitted to mean all lines from 'a' to the end of the hunk or all lines
> from 1 upto and including 'b'."

This is an interesting (and needed feature). Would be nice to see it merged.

--
Gustavo Leite


מאת: Joy Kone

2018-02-19 Thread joy kone
מאת: Joy Kone

שלחתי לך את האימייל הזה בגלל הצורך לפתוח איתך דיונים. אני לא רוצה
שתביני בטעות את ההצעה הזאת בכל היבט ... אם זה בסדר איתך, אני מבקש את
שיתוף הפעולה המלא שלך. פניתי אליך על בסיס אמון כדי להתמודד עם ההשקעה
במדינה שלך / חברה בשמי כשותף פוטנציאלי.

שמי ג'וי קוני. אזרח אבל מתגורר כאן. זה עשוי לעניין אותך לדעת שיש לי US
$ 10.500.000.00 שהופקדו עם מוסד פיננסי כדי להיות השקיעו במדינה שלך /
החברה. זה רלוונטי ליידע אותי אם אתה יכול להתמודד עם הקרן הזו / השקעה
איתך בארץ שלך כדי לספק לך את כל הפרטים הדרושים על המוסד הפיננסי לקבלת
מידע נוסף. בינתיים, אני מאוד ישר במגע שלי עם אנשים ואני גם לדרוש את
אותו ממך כשותף להיות. אני יכול לסמוך עליך עם הקרן הזו?

אני רוצה לציין כי זהו מיזם עסקי הדדי כמו לא כאן הוא גמול על העזרה שלך.
אני אודיע לך את התועלת שלך על העזרה שלך כפי שאנו ממשיכים. לקבלת פרטים
מקיפים יותר ומקור הקרן, אנא צרו איתי קשר בהקדם האפשרי. אם אתה מוצא את
המכתב הזה פוגע, בבקשה להתעלם ממנו ולקבל את ההתנצלויות שלי.

בברכה,
ג'וי קון


Re: Why git-revert doesn't invoke the pre-commit and the commit-msg hooks?

2018-02-19 Thread Gustavo Chaves
I asked this question on StackOverflow and got an answer:
https://stackoverflow.com/q/48852925/114983

The problem is that git-revert invokes git-commit with the -n flag,
explicitly avoiding the pre-commit and the commit-msg hooks.

This was originally introduced on commit 9fa4db544e2e, by Junio
Hamano, in 2005! The rationale in the commit message was the
following:

>> Do not verify reverted/cherry-picked/rebased patches.

>> The original committer may have used validation criteria that is less
>> stricter than yours.  You do not want to lose the changes even if they
>> are done in substandard way from your 'commit -v' verifier's point of
>> view.

I get it, but since by default you are allowed to edit the commit
message during a git-revert I think there's a case to be made to make
the pre-commit and the commit-msg being invoked by default. Also,
git-revert introduces new lines in the original commit message, and
they could be used to trigger specific checks, such as the one I
wanted to implement, to deny commits reverting merge-commits.

Shouldn't git-revert work exactly as git-commit? Instead of disabling
hooks by default, it could accept the --no-verify flag just like
git-commit to disable the hooks if the user wants it.

-- 
Gustavo Chaves


Re: GSoC 2018: Git has been accepted as a mentor organization!

2018-02-19 Thread Johannes Schindelin
Hi Christian,

On Sun, 18 Feb 2018, Christian Couder wrote:

> Just a quick message to let everyone know that Git has been accepted
> as a mentor organization for the Google Summer of Code 2018.

Nice!

> Dscho, I just sent you an invite as a mentor, as I think you said you
> are ok to mentor.

Yes, I just accepted.

Ciao,
Dscho


Re: [PATCH 0/4] Correct offsets of hunks when one is skipped

2018-02-19 Thread Phillip Wood
On 13/02/18 23:56, brian m. carlson wrote:
> On Tue, Feb 13, 2018 at 10:44:04AM +, Phillip Wood wrote:
>> From: Phillip Wood 
>>
>> While working on a patch series to stage selected lines from a hunk
>> without having to edit it I got worried that subsequent patches would
>> be applied in the wrong place which lead to this series to correct the
>> offsets of hunks following those that are skipped or edited.
>>
>> Phillip Wood (4):
>>   add -i: add function to format hunk header
>>   t3701: add failing test for pathological context lines
>>   add -p: Adjust offsets of subsequent hunks when one is skipped
>>   add -p: calculate offset delta for edited patches
>>
>>  git-add--interactive.perl  | 93 
>> +++---
>>  t/t3701-add-interactive.sh | 30 +++
>>  2 files changed, 102 insertions(+), 21 deletions(-)
> 
> This looks reasonably sane to me.  I really like that you managed to
> produce failing tests for this situation.  I know pathological cases
> like this have bit GCC in the past, so it's good that you fixed this.
> 

Thanks Brain, it's interesting to hear that GCC has been bitten in the past

Best Wishes

Phillip


[PATCH v2 2/9] t3701: indent here documents

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Indent here documents in line with the current style for tests.

Signed-off-by: Phillip Wood 
---
 t/t3701-add-interactive.sh | 174 ++---
 1 file changed, 87 insertions(+), 87 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
058698df6a4a9811b9db84fb5900472c47c61798..861ea2e08cce750515f59fc424b3f8336fd9b1a9
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -22,14 +22,14 @@ test_expect_success 'status works (initial)' '
 '
 
 test_expect_success 'setup expected' '
-cat >expected expected expected patch fake_editor.sh &&
-   cat >>fake_editor.sh <<\EOF &&
-mv -f "$1" oldpatch &&
-mv -f patch "$1"
-EOF
+   cat >>fake_editor.sh <<-\EOF &&
+   mv -f "$1" oldpatch &&
+   mv -f patch "$1"
+   EOF
chmod a+x fake_editor.sh &&
test_set_editor "$(pwd)/fake_editor.sh"
 '
@@ -126,10 +126,10 @@ test_expect_success 'bad edit rejected' '
 '
 
 test_expect_success 'setup patch' '
-cat >patch patch expected patch expected expected expected 

[PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Purge the index lines from diffs so we're not hard coding sha1 hash
values in the expected output.

Signed-off-by: Phillip Wood 
---
 t/t3701-add-interactive.sh | 32 
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
b73a9598ab3eaed074735e99f3dcbc8f88d86f4c..70748393f28c93f4bc5d43b07bd96bd208aba3e9
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -24,7 +24,6 @@ test_expect_success 'status works (initial)' '
 test_expect_success 'setup expected' '
cat >expected <<-EOF
new file mode 100644
-   index 000..d95f3ad
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
@@ -34,7 +33,7 @@ test_expect_success 'setup expected' '
 
 test_expect_success 'diff works (initial)' '
(echo d; echo 1) | git add -i >output &&
-   sed -ne "/new file/,/content/p" diff &&
+   sed -ne /^index/d -e "/new file/,/content/p" diff &&
test_cmp expected diff
 '
 test_expect_success 'revert works (initial)' '
@@ -60,7 +59,6 @@ test_expect_success 'status works (commit)' '
 
 test_expect_success 'setup expected' '
cat >expected <<-EOF
-   index 180b47c..b6f2c08 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
@@ -71,7 +69,7 @@ test_expect_success 'setup expected' '
 
 test_expect_success 'diff works (commit)' '
(echo d; echo 1) | git add -i >output &&
-   sed -ne "/^index/,/content/p" diff &&
+   sed -ne "/^---/,/content/p" diff &&
test_cmp expected diff
 '
 test_expect_success 'revert works (commit)' '
@@ -90,7 +88,7 @@ test_expect_success 'setup expected' '
 test_expect_success 'dummy edit works' '
test_set_editor : &&
(echo e; echo a) | git add -p &&
-   git diff > diff &&
+   git diff | sed /^index/d >diff &&
test_cmp expected diff
 '
 
@@ -145,7 +143,6 @@ test_expect_success 'setup patch' '
 test_expect_success 'setup expected' '
cat >expected <<-EOF
diff --git a/file b/file
-   index b5dd6c9..f910ae9 100644
--- a/file
+++ b/file
@@ -1,4 +1,4 @@
@@ -159,7 +156,7 @@ test_expect_success 'setup expected' '
 
 test_expect_success 'real edit works' '
(echo e; echo n; echo d) | git add -p &&
-   git diff >output &&
+   git diff | sed /^index/d >output &&
test_cmp expected output
 '
 
@@ -168,10 +165,10 @@ test_expect_success 'skip files similarly as commit -a' '
echo file >.gitignore &&
echo changed >file &&
echo y | git add -p file &&
-   git diff >output &&
+   git diff | sed /^index/d >output &&
git reset &&
git commit -am commit &&
-   git diff >expected &&
+   git diff | sed /^index/d >expected &&
test_cmp expected output &&
git reset --hard HEAD^
 '
@@ -217,7 +214,6 @@ test_expect_success 'setup again' '
 # Write the patch file with a new line at the top and bottom
 test_expect_success 'setup patch' '
cat >patch <<-EOF
-   index 180b47c..b6f2c08 100644
--- a/file
+++ b/file
@@ -1,2 +1,4 @@
@@ -232,7 +228,6 @@ test_expect_success 'setup patch' '
 test_expect_success 'setup expected' '
cat >expected <<-EOF
diff --git a/file b/file
-   index b6f2c08..61b9053 100755
--- a/file
+++ b/file
@@ -1,2 +1,4 @@
@@ -248,15 +243,14 @@ test_expect_success 'add first line works' '
git commit -am "clear local changes" &&
git apply patch &&
(echo s; echo y; echo y) | git add -p file &&
-   git diff --cached > diff &&
+   git diff --cached | sed /^index/d >diff &&
test_cmp expected diff
 '
 
 test_expect_success 'setup expected' '
cat >expected <<-EOF
diff --git a/non-empty b/non-empty
deleted file mode 100644
-   index d95f3ad..000
--- a/non-empty
+++ /dev/null
@@ -1 +0,0 @@
@@ -271,15 +265,14 @@ test_expect_success 'deleting a non-empty file' '
git commit -m non-empty &&
rm non-empty &&
echo y | git add -p non-empty &&
-   git diff --cached >diff &&
+   git diff --cached | sed /^index/d >diff &&
test_cmp expected diff
 '
 
 test_expect_success 'setup expected' '
cat >expected <<-EOF
diff --git a/empty b/empty
deleted file mode 100644
-   index e69de29..000
EOF
 '
 
@@ -290,7 +283,7 @@ test_expect_success 'deleting an empty file' '
git commit -m empty &&
rm empty &&
echo y | git add -p empty &&
-   git diff --cached >diff &&
+   git diff --cached | sed /^index/d >diff &&
test_cmp expected diff
 '
 
@@ -317,7 +310,7 @@ test_expect_success 'split hunk "add -p (edit)"' '
#exits.
printf "%s\n" s e q n q q |
EDITOR=: git add -p &&
-   git 

[PATCH v2 9/9] add -p: don't rely on apply's '--recount' option

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Now that add -p counts patches properly it should be possible to turn
off the '--recount' option when invoking 'git apply'

Signed-off-by: Phillip Wood 
---

Notes:
I can't think of a reason why this shouldn't be OK but I can't help
feeling slightly nervous about it. I've made it a separate patch so it
can be easily dropped or reverted if I've missed something.

 git-add--interactive.perl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
3226c2c4f02d5f8679d77b8eede984fc727b422d..a64c0db57d62ab02ef718b8c8f821105132d9920
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -677,7 +677,7 @@ sub add_untracked_cmd {
 sub run_git_apply {
my $cmd = shift;
my $fh;
-   open $fh, '| git ' . $cmd . " --recount --allow-overlap";
+   open $fh, '| git ' . $cmd . " --allow-overlap";
print $fh @_;
return close $fh;
 }
-- 
2.16.1



[PATCH v2 0/9] Correct offsets of hunks when one is skipped

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Since v1 I've added some test cleanups for t3701, fixed the counting
when splitting and coalescing hunks containing "\ No newline at end of
file" lines and added a patch to remove '--recount' from the
invocation of 'git apply'. There are minor changes to patches 5
(previously patch 2) and patch 7 (previously patch 4) which I've
explained in the comments on those patches. Otherwise the original
patches are unchanged.

Cover letter to v1:

While working on a patch series to stage selected lines from a hunk
without having to edit it I got worried that subsequent patches would
be applied in the wrong place which lead to this series to correct the
offsets of hunks following those that are skipped or edited.


Phillip Wood (9):
  add -i: add function to format hunk header
  t3701: indent here documents
  t3701: use test_write_lines and write_script
  t3701: don't hard code sha1 hash values
  t3701: add failing test for pathological context lines
  add -p: Adjust offsets of subsequent hunks when one is skipped
  add -p: calculate offset delta for edited patches
  add -p: fix counting when splitting and coalescing
  add -p: don't rely on apply's '--recount' option

 git-add--interactive.perl  | 106 -
 t/t3701-add-interactive.sh | 281 -
 2 files changed, 229 insertions(+), 158 deletions(-)

-- 
2.16.1



[PATCH v2 3/9] t3701: use test_write_lines and write_script

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Simplify things slightly by using the above helpers.

Signed-off-by: Phillip Wood 
---
 t/t3701-add-interactive.sh | 36 +++-
 1 file changed, 7 insertions(+), 29 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
861ea2e08cce750515f59fc424b3f8336fd9b1a9..b73a9598ab3eaed074735e99f3dcbc8f88d86f4c
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -87,13 +87,8 @@ test_expect_success 'setup expected' '
EOF
 '
 
-test_expect_success 'setup fake editor' '
-   >fake_editor.sh &&
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
-'
-
 test_expect_success 'dummy edit works' '
+   test_set_editor : &&
(echo e; echo a) | git add -p &&
git diff > diff &&
test_cmp expected diff
@@ -110,13 +105,12 @@ test_expect_success 'setup patch' '
 '
 
 test_expect_success 'setup fake editor' '
-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   FAKE_EDITOR="$(pwd)/fake-editor.sh" &&
+   write_script "$FAKE_EDITOR" <<-\EOF &&
mv -f "$1" oldpatch &&
mv -f patch "$1"
EOF
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
+   test_set_editor "$FAKE_EDITOR"
 '
 
 test_expect_success 'bad edit rejected' '
@@ -302,18 +296,12 @@ test_expect_success 'deleting an empty file' '
 
 test_expect_success 'split hunk setup' '
git reset --hard &&
-   for i in 10 20 30 40 50 60
-   do
-   echo $i
-   done >test &&
+   test_write_lines 10 20 30 40 50 60 >test &&
git add test &&
test_tick &&
git commit -m test &&
 
-   for i in 10 15 20 21 22 23 24 30 40 50 60
-   do
-   echo $i
-   done >test
+   test_write_lines 10 15 20 21 22 23 24 30 40 50 60 >test
 '
 
 test_expect_success 'split hunk "add -p (edit)"' '
@@ -334,17 +322,7 @@ test_expect_success 'split hunk "add -p (edit)"' '
 '
 
 test_expect_failure 'split hunk "add -p (no, yes, edit)"' '
-   cat >test <<-\EOF &&
-   5
-   10
-   20
-   21
-   30
-   31
-   40
-   50
-   60
-   EOF
+   test_write_lines 5 10 20 21 30 31 40 50 60 >test &&
git reset &&
# test sequence is s(plit), n(o), y(es), e(dit)
# q n q q is there to make sure we exit at the end.
-- 
2.16.1



[PATCH v2 8/9] add -p: fix counting when splitting and coalescing

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

When a file has no trailing new line at the end diff records this by
appending "\ No newline at end of file" below the last line of the
file. This line should not be counted in the hunk header. Fix the
splitting and coalescing code to count files without a trailing new line
properly and change one of the tests to test splitting without a
trailing new line.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 13 +
 t/t3701-add-interactive.sh | 28 ++--
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
0df0c2aa065af88e159f8e9a2febe12f4ef8ee75..3226c2c4f02d5f8679d77b8eede984fc727b422d
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -793,6 +793,11 @@ sub split_hunk {
while (++$i < @$text) {
my $line = $text->[$i];
my $display = $display->[$i];
+   if ($line =~ /^\\/) {
+   push @{$this->{TEXT}}, $line;
+   push @{$this->{DISPLAY}}, $display;
+   next;
+   }
if ($line =~ /^ /) {
if ($this->{ADDDEL} &&
!defined $next_hunk_start) {
@@ -887,8 +892,8 @@ sub merge_hunk {
$o_cnt = $n_cnt = 0;
for ($i = 1; $i < @{$prev->{TEXT}}; $i++) {
my $line = $prev->{TEXT}[$i];
-   if ($line =~ /^\+/) {
-   $n_cnt++;
+   if ($line =~ /^[+\\]/) {
+   $n_cnt++ if ($line =~ /^\+/);
push @line, $line;
next;
}
@@ -905,8 +910,8 @@ sub merge_hunk {
 
for ($i = 1; $i < @{$this->{TEXT}}; $i++) {
my $line = $this->{TEXT}[$i];
-   if ($line =~ /^\+/) {
-   $n_cnt++;
+   if ($line =~ /^[+\\]/) {
+   $n_cnt++ if ($line =~ /^\+/);
push @line, $line;
next;
}
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
bbda771ba7e516aa37a204beffba7eeb0c85a2f4..0fb9c0e0f140e21ef7ad467c40b9211d29f53db6
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -221,30 +221,46 @@ test_expect_success 'setup patch' '
 baseline
 content
+lastline
+   \ No newline at end of file
EOF
 '
 
-# Expected output, similar to the patch but w/ diff at the top
+# Expected output, diff is similar to the patch but w/ diff at the top
 test_expect_success 'setup expected' '
-   cat >expected <<-EOF
-   diff --git a/file b/file
+   echo diff --git a/file b/file >expected &&
+   cat patch >>expected &&
+   cat >expected-output <<-EOF
--- a/file
+++ b/file
@@ -1,2 +1,4 @@
+firstline
 baseline
 content
+lastline
+   \ No newline at end of file
+   @@ -1,2 +1,3 @@
+   +firstline
+baseline
+content
+   @@ -1,2 +2,3 @@
+baseline
+content
+   +lastline
+   \ No newline at end of file
EOF
 '
 
 # Test splitting the first patch, then adding both
-test_expect_success 'add first line works' '
+test_expect_success C_LOCALE_OUTPUT 'add first line works' '
git commit -am "clear local changes" &&
git apply patch &&
-   (echo s; echo y; echo y) | git add -p file &&
+   printf "%s\n" s y y | git add -p file 2>error |
+   sed -n -e "s/^Stage this hunk[^@]*\(@@ .*\)/\1/" \
+  -e "/^[-+@ ]"/p  >output &&
+   test_must_be_empty error &&
git diff --cached | sed /^index/d >diff &&
-   test_cmp expected diff
+   test_cmp expected diff &&
+   test_cmp expected-output output
 '
 
 test_expect_success 'setup expected' '
-- 
2.16.1



[PATCH v2 5/9] t3701: add failing test for pathological context lines

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

When a hunk is skipped by add -i the offsets of subsequent hunks are
not adjusted to account for any missing insertions due to the skipped
hunk. Most of the time this does not matter as apply uses the context
lines to apply the subsequent hunks in the correct place, however in
pathological cases the context lines will match at the now incorrect
offset and the hunk will be applied in the wrong place. The offsets of
hunks following an edited hunk that has had the number of insertions
or deletions changed also need to be updated in the same way. Add
failing tests to demonstrate this.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - changed edit test to use the existing fake editor and to strip
   the hunk header and some context lines as well as deleting an
   insertion
 - style fixes

 t/t3701-add-interactive.sh | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
70748393f28c93f4bc5d43b07bd96bd208aba3e9..06c4747f506a1b05ccad0f9387e1fbd69cfd7784
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -511,4 +511,35 @@ test_expect_success 'status ignores dirty submodules 
(except HEAD)' '
! grep dirty-otherwise output
 '
 
+test_expect_success 'set up pathological context' '
+   git reset --hard &&
+   test_write_lines a a a a a a a a a a a >a &&
+   git add a &&
+   git commit -m a &&
+   test_write_lines c b a a a a a a a b a a a a >a &&
+   test_write_lines a a a a a a a b a a a a >expected-1 &&
+   test_write_lines   b a a a a a a a b a a a a >expected-2 &&
+   # check editing can cope with missing header and deleted context lines
+   # as well as changes to other lines
+   test_write_lines +b " a" >patch
+'
+
+test_expect_failure 'add -p works with pathological context lines' '
+   git reset &&
+   printf "%s\n" n y |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-1 actual
+'
+
+test_expect_failure 'add -p patch editing works with pathological context 
lines' '
+   git reset &&
+   test_set_editor "$FAKE_EDITOR" &&
+   # n q q below is in case edit fails
+   printf "%s\n" e yn q q |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-2 actual
+'
+
 test_done
-- 
2.16.1



[PATCH v2 1/9] add -i: add function to format hunk header

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

This code is duplicated in a couple of places so make it into a
function as we're going to add some more callers shortly.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
964c3a75420db4751cf11125b68b6904112632f1..8ababa6453ac4f57a925aacbb8b9bb1c973e4a54
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -751,6 +751,15 @@ sub parse_hunk_header {
return ($o_ofs, $o_cnt, $n_ofs, $n_cnt);
 }
 
+sub format_hunk_header {
+   my ($o_ofs, $o_cnt, $n_ofs, $n_cnt) = @_;
+   return ("@@ -$o_ofs" .
+   (($o_cnt != 1) ? ",$o_cnt" : '') .
+   " +$n_ofs" .
+   (($n_cnt != 1) ? ",$n_cnt" : '') .
+   " @@\n");
+}
+
 sub split_hunk {
my ($text, $display) = @_;
my @split = ();
@@ -838,11 +847,7 @@ sub split_hunk {
my $o_cnt = $hunk->{OCNT};
my $n_cnt = $hunk->{NCNT};
 
-   my $head = ("@@ -$o_ofs" .
-   (($o_cnt != 1) ? ",$o_cnt" : '') .
-   " +$n_ofs" .
-   (($n_cnt != 1) ? ",$n_cnt" : '') .
-   " @@\n");
+   my $head = format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
my $display_head = $head;
unshift @{$hunk->{TEXT}}, $head;
if ($diff_use_color) {
@@ -912,11 +917,7 @@ sub merge_hunk {
}
push @line, $line;
}
-   my $head = ("@@ -$o0_ofs" .
-   (($o_cnt != 1) ? ",$o_cnt" : '') .
-   " +$n0_ofs" .
-   (($n_cnt != 1) ? ",$n_cnt" : '') .
-   " @@\n");
+   my $head = format_hunk_header($o0_ofs, $o_cnt, $n0_ofs, $n_cnt);
@{$prev->{TEXT}} = ($head, @line);
 }
 
-- 
2.16.1



[PATCH v2 6/9] add -p: Adjust offsets of subsequent hunks when one is skipped

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Since commit 8cbd431082 ("git-add--interactive: replace hunk
recounting with apply --recount", 2008-7-2) if a hunk is skipped then
we rely on the context lines to apply subsequent hunks in the right
place. While this works most of the time it is possible for hunks to
end up being applied in the wrong place. To fix this adjust the offset
of subsequent hunks to correct for any change in the number of
insertions or deletions due to the skipped hunk. The change in offset
due to edited hunks that have the number of insertions or deletions
changed is ignored here, it will be fixed in the next commit.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 15 +--
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
8ababa6453ac4f57a925aacbb8b9bb1c973e4a54..7a0a5896bb8fa063ace29b85840849c867b874f5
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -926,14 +926,25 @@ sub coalesce_overlapping_hunks {
my @out = ();
 
my ($last_o_ctx, $last_was_dirty);
+   my $ofs_delta = 0;
 
-   for (grep { $_->{USE} } @in) {
+   for (@in) {
if ($_->{TYPE} ne 'hunk') {
push @out, $_;
next;
}
my $text = $_->{TEXT};
-   my ($o_ofs) = parse_hunk_header($text->[0]);
+   my ($o_ofs, $o_cnt, $n_ofs, $n_cnt) =
+   parse_hunk_header($text->[0]);
+   unless ($_->{USE}) {
+   $ofs_delta += $o_cnt - $n_cnt;
+   next;
+   }
+   if ($ofs_delta) {
+   $n_ofs += $ofs_delta;
+   $_->{TEXT}->[0] = format_hunk_header($o_ofs, $o_cnt,
+$n_ofs, $n_cnt);
+   }
if (defined $last_o_ctx &&
$o_ofs <= $last_o_ctx &&
!$_->{DIRTY} &&
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
06c4747f506a1b05ccad0f9387e1fbd69cfd7784..e153a02605df25ea40e49dd48b7c9fd981713b9d
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -524,7 +524,7 @@ test_expect_success 'set up pathological context' '
test_write_lines +b " a" >patch
 '
 
-test_expect_failure 'add -p works with pathological context lines' '
+test_expect_success 'add -p works with pathological context lines' '
git reset &&
printf "%s\n" n y |
git add -p &&
-- 
2.16.1



Re: Please test Git for Windows' latest snapshot

2018-02-19 Thread Johannes Schindelin
Hi Bryan,

On Fri, 16 Feb 2018, Bryan Turner wrote:

> On Fri, Feb 16, 2018 at 3:30 PM, Johannes Schindelin
>  wrote:
> > Hi team,
> >
> > I am unwilling to release Git for Windows v2.16.2 on a Friday night, but I
> > have something almost as good. There is a snapshot available here:
> >
> > 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__wingit.blob.core.windows.net_files_index.html=DwIBAg=wBUwXtM9sKhff6UeHOQgvw=uBedA6EFFVX1HiLgmpdrBrv8bIDAScKjk1yk9LOASBM=xZghHWteeNbJ2bu5ySDq9WwqnfX8X7FZ_CWsV9gAyJU=NzSYCFSWWokPP9A9FA_EmJO5yu8qtRKw5M-Ep_qooUc=
> >
> > That snapshot brings quite a few updated components apart from Git proper
> > (such as an updated MSYS2 runtime), and I would love to ask y'all to give
> > this snapshot a proper "tire kicking".
> 
> I've run Bitbucket Server's full Git test suite (~1,500 tests) against
> the Portable Git snapshot (e1848984d1), no failures to report.

Thank you as always!

Sadly, another user found out that `git svn` is completely broken in the
32-bit versions, and that's what I am trying to figure out right now. Not
a problem in Git itself, of course, but in the way the Subversion Perl
bindings were built.

Ciao,
Dscho


[PATCH v2 7/9] add -p: calculate offset delta for edited patches

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

Recount the number of preimage and postimage lines in a hunk after it
has been edited so any change in the number of insertions or deletions
can be used to adjust the offsets of subsequent hunks. If an edited
hunk is subsequently split then the offset correction will be lost. It
would be possible to fix this if it is a problem, however the code
here is still an improvement on the status quo for the common case
where an edited hunk is applied without being split.

This is also a necessary step to removing '--recount' and
'--allow-overlap' from the invocation of 'git apply'. Before
'--recount' can be removed the splitting and coalescing counting needs
to be fixed to handle a missing newline at the end of a file. In order
to remove '--allow-overlap' there needs to be i) some way of verifying
the offset data in the edited hunk (probably by correlating the
preimage (or postimage if the patch is going to be applied in reverse)
lines of the edited and unedited versions to see if they are offset or
if any leading/trailing context lines have been removed) and ii) a way of
dealing with edited hunks that change context lines that are shared
with neighbouring hunks.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1
 - edited hunks are now recounted before seeing if they apply in
   preparation for removing '--recount' when invoking 'git apply'
 - added sentence to commit message about the offset data being lost
   if an edited hunk is split

 git-add--interactive.perl  | 55 ++
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
7a0a5896bb8fa063ace29b85840849c867b874f5..0df0c2aa065af88e159f8e9a2febe12f4ef8ee75
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -938,13 +938,19 @@ sub coalesce_overlapping_hunks {
parse_hunk_header($text->[0]);
unless ($_->{USE}) {
$ofs_delta += $o_cnt - $n_cnt;
+   # If this hunk has been edited then subtract
+   # the delta that is due to the edit.
+   $_->{OFS_DELTA} and $ofs_delta -= $_->{OFS_DELTA};
next;
}
if ($ofs_delta) {
$n_ofs += $ofs_delta;
$_->{TEXT}->[0] = format_hunk_header($o_ofs, $o_cnt,
 $n_ofs, $n_cnt);
}
+   # If this hunk was edited then adjust the offset delta
+   # to reflect the edit.
+   $_->{OFS_DELTA} and $ofs_delta += $_->{OFS_DELTA};
if (defined $last_o_ctx &&
$o_ofs <= $last_o_ctx &&
!$_->{DIRTY} &&
@@ -1016,6 +1022,30 @@ marked for discarding."),
 marked for applying."),
 );
 
+sub recount_edited_hunk {
+   local $_;
+   my ($oldtext, $newtext) = @_;
+   my ($o_cnt, $n_cnt) = (0, 0);
+   for (@{$newtext}[1..$#{$newtext}]) {
+   my $mode = substr($_, 0, 1);
+   if ($mode eq '-') {
+   $o_cnt++;
+   } elsif ($mode eq '+') {
+   $n_cnt++;
+   } elsif ($mode eq ' ') {
+   $o_cnt++;
+   $n_cnt++;
+   }
+   }
+   my ($o_ofs, undef, $n_ofs, undef) =
+   parse_hunk_header($newtext->[0]);
+   $newtext->[0] = format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
+   my (undef, $orig_o_cnt, undef, $orig_n_cnt) =
+   parse_hunk_header($oldtext->[0]);
+   # Return the change in the number of lines inserted by this hunk
+   return $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt;
+}
+
 sub edit_hunk_manually {
my ($oldtext) = @_;
 
@@ -1114,25 +1144,32 @@ sub prompt_yesno {
 }
 
 sub edit_hunk_loop {
-   my ($head, $hunk, $ix) = @_;
-   my $text = $hunk->[$ix]->{TEXT};
+   my ($head, $hunks, $ix) = @_;
+   my $hunk = $hunks->[$ix];
+   my $text = $hunk->{TEXT};
 
while (1) {
-   $text = edit_hunk_manually($text);
-   if (!defined $text) {
+   my $newtext = edit_hunk_manually($text);
+   if (!defined $newtext) {
return undef;
}
my $newhunk = {
-   TEXT => $text,
-   TYPE => $hunk->[$ix]->{TYPE},
+   TEXT => $newtext,
+   TYPE => $hunk->{TYPE},
USE => 1,
DIRTY => 1,
};
+   $newhunk->{OFS_DELTA} = recount_edited_hunk($text, $newtext);
+   # 

[PATCH v1 1/3] add -p: select individual hunk lines

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

When I end up editing hunks it is almost always because I want to
stage a subset of the lines in the hunk. Doing this by editing the
hunk is inconvenient and error prone (especially so if the patch is
going to be reversed before being applied). Instead offer an option
for add -p to stage individual lines. When the user presses 'l' the
hunk is redrawn with labels by the insertions and deletions and they
are prompted to enter a list of the lines they wish to stage. Ranges
of lines may be specified using 'a-b' where either 'a' or 'b' may be
omitted to mean all lines from 'a' to the end of the hunk or all lines
from 1 upto and including 'b'.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 136 +
 t/t3701-add-interactive.sh |  63 +
 2 files changed, 199 insertions(+)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
a64c0db57d62ab02ef718b8c8f821105132d9920..8a33796e1f6a564d0a27ba06c216dbb9848827b9
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1003,6 +1003,126 @@ sub color_diff {
} @_;
 }
 
+sub label_hunk_lines {
+   local $_;
+   my $hunk = shift;
+   my $i = 0;
+   my $labels = [ map { /^[-+]/ ? ++$i : 0 } @{$hunk->{TEXT}} ];
+   $i > 1 and @{$hunk}{qw(LABELS MAX_LABEL)} = ($labels, $i);
+   return $i > 1;
+}
+
+sub select_hunk_lines {
+   my ($hunk, $selected) = @_;
+   my ($text, $labels) = @{$hunk}{qw(TEXT LABELS)};
+   my ($i, $o_cnt, $n_cnt) = (0, 0, 0);
+   my ($push_eol, @newtext);
+   # Lines with this mode will become context lines if they are
+   # not selected
+   my $context_mode = $patch_mode_flavour{IS_REVERSE} ? '+' : '-';
+   for $i (1..$#{$text}) {
+   my $mode = substr($text->[$i], 0, 1);
+   if ($mode eq '\\') {
+   push @newtext, $text->[$i] if ($push_eol);
+   undef $push_eol;
+   } elsif ($labels->[$i] and $selected->[$labels->[$i]]) {
+   push @newtext, $text->[$i];
+   if ($mode eq '+') {
+   $n_cnt++;
+   } else {
+   $o_cnt++;
+   }
+   $push_eol = 1;
+   } elsif ($mode eq ' ' or $mode eq $context_mode) {
+   push @newtext, ' ' . substr($text->[$i], 1);
+   $o_cnt++; $n_cnt++;
+   $push_eol = 1;
+   } else {
+   undef $push_eol;
+   }
+   }
+   my ($o_ofs, $orig_o_cnt, $n_ofs, $orig_n_cnt) =
+   parse_hunk_header($text->[0]);
+   unshift @newtext, format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
+   my $newhunk = {
+   TEXT => \@newtext,
+   DISPLAY => [ color_diff(@newtext) ],
+   OFS_DELTA => $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt,
+   TYPE => $hunk->{TYPE},
+   USE => 1,
+   };
+   # If this hunk has previously been edited add the offset delta
+   # of the old hunk to get the real delta from the original
+   # hunk.
+   $hunk->{OFS_DELTA} and $newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
+   return $newhunk;
+}
+
+sub check_hunk_label {
+   my ($max_label, $label) = ($_[0]->{MAX_LABEL}, $_[1]);
+   if ($label < 1 or $label > $max_label) {
+   error_msg sprintf(__("invalid hunk line '%d'\n"), $label);
+   return 0;
+   }
+   return 1;
+}
+
+sub parse_hunk_selection {
+   local $_;
+   my ($hunk, $line) = @_;
+   my $max_label = $hunk->{MAX_LABEL};
+   my @selected = (0) x ($max_label + 1);
+   my @fields = split(/[,\s]+/, $line);
+   for (@fields) {
+   if (/^([0-9]*)-([0-9]*)$/) {
+   if ($1 eq '' and $2 eq '') {
+   error_msg __("range '-' missing upper or lower 
bound\n");
+   return undef;
+   }
+   my $lo = $1 eq '' ? 1 : $1;
+   my $hi = $2 eq '' ? $max_label : $2;
+   check_hunk_label($hunk, $lo) or return undef;
+   check_hunk_label($hunk, $hi) or return undef;
+   $hi < $lo and ($lo, $hi) = ($hi, $lo);
+   @selected[$lo..$hi] = (1) x (1 + $hi - $lo);
+   } elsif (/^([0-9]+)$/) {
+   check_hunk_label($hunk, $1) or return undef;
+   $selected[$1] = 1;
+   } else {
+   error_msg sprintf(__("invalid hunk line '%s'\n"), $_);
+   return undef;
+   }
+   }
+   return \@selected;
+}
+
+sub display_hunk_lines {
+   my 

[PATCH v1 0/3] add -p: select individual hunk lines

2018-02-19 Thread Phillip Wood
From: Phillip Wood 

I need to update the add -i documentation but otherwise I think these
patches are OK so I thought I'd try and get some feedback. They build
on top of the recount fixes in [1]. The commit message for the first
patch describes the motivation:

"When I end up editing hunks it is almost always because I want to
stage a subset of the lines in the hunk. Doing this by editing the
hunk is inconvenient and error prone (especially so if the patch is
going to be reversed before being applied). Instead offer an option
for add -p to stage individual lines. When the user presses 'l' the
hunk is redrawn with labels by the insertions and deletions and they
are prompted to enter a list of the lines they wish to stage. Ranges
of lines may be specified using 'a-b' where either 'a' or 'b' may be
omitted to mean all lines from 'a' to the end of the hunk or all lines
from 1 upto and including 'b'."

[1] 
https://public-inbox.org/git/20180219112910.24471-1-phillip.w...@talktalk.net/T/#u

Phillip Wood (3):
  add -p: select individual hunk lines
  add -p: allow line selection to be inverted
  add -p: optimize line selection for short hunks

 git-add--interactive.perl  | 179 +
 t/t3701-add-interactive.sh |  63 
 2 files changed, 242 insertions(+)

-- 
2.16.1



Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-19 Thread Eric Sunshine
On Mon, Feb 19, 2018 at 6:29 AM, Phillip Wood  wrote:
> Purge the index lines from diffs so we're not hard coding sha1 hash
> values in the expected output.

Perhaps the commit message could provide a bit more explanation about
why this is a good idea. For instance, briefly mention that doing so
will ease Git's transition from SHA1 to some newer hash function.

> Signed-off-by: Phillip Wood 


Re: [PATCH v2 3/9] t3701: use test_write_lines and write_script

2018-02-19 Thread Eric Sunshine
On Mon, Feb 19, 2018 at 6:29 AM, Phillip Wood  wrote:
> From: Phillip Wood 
>
> Simplify things slightly by using the above helpers.
>
> Signed-off-by: Phillip Wood 
> ---
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> @@ -87,13 +87,8 @@ test_expect_success 'setup expected' '
> @@ -110,13 +105,12 @@ test_expect_success 'setup patch' '
>  test_expect_success 'setup fake editor' '
> -   echo "#!$SHELL_PATH" >fake_editor.sh &&
> -   cat >>fake_editor.sh <<-\EOF &&
> +   FAKE_EDITOR="$(pwd)/fake-editor.sh" &&
> +   write_script "$FAKE_EDITOR" <<-\EOF &&
> mv -f "$1" oldpatch &&
> mv -f patch "$1"
> EOF
> -   chmod a+x fake_editor.sh &&
> -   test_set_editor "$(pwd)/fake_editor.sh"
> +   test_set_editor "$FAKE_EDITOR"
>  '

The very first thing that test_set_editor() does is set FAKE_EDITOR to
the value of $1, so it is confusing to see it getting setting it here
first; the reader has to spend extra brain cycles wondering if
something non-obvious is going on that requires this manual
assignment. Perhaps drop the assignment altogether and just write
literal "fake_editor.sh" in the couple places it's needed (as was done
in the original code)?


[PATCH v4 07/13] commit-graph: implement --set-latest

2018-02-19 Thread Derrick Stolee
It is possible to have multiple commit graph files in a directory, but
only one is important at a time.

Use a 'graph-latest' file to point to the important file. Teach
git-commit-graph to write 'graph-latest' when given the "--set-latest"
option. Using this 'graph-latest' file is more robust than relying on
directory scanning and modified times.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt | 10 ++
 builtin/commit-graph.c | 26 --
 commit-graph.c |  7 +++
 commit-graph.h |  2 ++
 t/t5318-commit-graph.sh| 24 +++-
 5 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 6d26e56..dc948c5 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -34,6 +34,9 @@ COMMANDS
 Write a commit graph file based on the commits found in packfiles.
 Includes all commits from the existing commit graph file. Outputs the
 resulting filename.
++
+With `--set-latest` option, update the graph-latest file to point
+to the written graph file.
 
 'read'::
 
@@ -53,6 +56,13 @@ EXAMPLES
 $ git commit-graph write
 
 
+* Write a graph file for the packed commits in your local .git folder
+* and update graph-latest.
++
+
+$ git commit-graph write --set-latest
+
+
 * Read basic information from a graph file.
 +
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 28cd097..bf86172 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -8,7 +8,7 @@
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph read [--object-dir ] [--file=]"),
-   N_("git commit-graph write [--object-dir ]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest]"),
NULL
 };
 
@@ -18,13 +18,14 @@ static const char * const builtin_commit_graph_read_usage[] 
= {
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--object-dir ]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest]"),
NULL
 };
 
 static struct opts_commit_graph {
const char *obj_dir;
const char *graph_file;
+   int set_latest;
 } opts;
 
 static int graph_read(int argc, const char **argv)
@@ -81,6 +82,22 @@ static int graph_read(int argc, const char **argv)
return 0;
 }
 
+static void set_latest_file(const char *obj_dir, const char *graph_file)
+{
+   int fd;
+   struct lock_file lk = LOCK_INIT;
+   char *latest_fname = get_graph_latest_filename(obj_dir);
+
+   fd = hold_lock_file_for_update(, latest_fname, LOCK_DIE_ON_ERROR);
+   FREE_AND_NULL(latest_fname);
+
+   if (fd < 0)
+   die_errno("unable to open graph-head");
+
+   write_in_full(fd, graph_file, strlen(graph_file));
+   commit_lock_file();
+}
+
 static int graph_write(int argc, const char **argv)
 {
char *graph_name;
@@ -89,6 +106,8 @@ static int graph_write(int argc, const char **argv)
{ OPTION_STRING, 'o', "object-dir", _dir,
N_("dir"),
N_("The object directory to store the graph") },
+   OPT_BOOL('u', "set-latest", _latest,
+   N_("update graph-head to written graph file")),
OPT_END(),
};
 
@@ -102,6 +121,9 @@ static int graph_write(int argc, const char **argv)
graph_name = write_commit_graph(opts.obj_dir);
 
if (graph_name) {
+   if (opts.set_latest)
+   set_latest_file(opts.obj_dir, graph_name);
+
printf("%s\n", graph_name);
FREE_AND_NULL(graph_name);
}
diff --git a/commit-graph.c b/commit-graph.c
index 2a8594f..5ee0805 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -38,6 +38,13 @@
 #define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
GRAPH_OID_LEN + 8)
 
+char *get_graph_latest_filename(const char *obj_dir)
+{
+   struct strbuf fname = STRBUF_INIT;
+   strbuf_addf(, "%s/info/graph-latest", obj_dir);
+   return strbuf_detach(, 0);
+}
+
 static struct commit_graph *alloc_commit_graph(void)
 {
struct commit_graph *g = xmalloc(sizeof(*g));
diff --git a/commit-graph.h b/commit-graph.h
index 9093b97..ae24b3a 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -3,6 +3,8 @@
 
 #include "git-compat-util.h"
 
+extern char *get_graph_latest_filename(const char *obj_dir);
+
 struct commit_graph {
int graph_fd;
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 893fa24..cad9d90 100755
--- 

[PATCH v4 04/13] commit-graph: implement write_commit_graph()

2018-02-19 Thread Derrick Stolee
Teach Git to write a commit graph file by checking all packed objects
to see if they are commits, then store the file in the given object
directory.

Signed-off-by: Derrick Stolee 
---
 Makefile   |   1 +
 commit-graph.c | 370 +
 commit-graph.h |   7 ++
 3 files changed, 378 insertions(+)
 create mode 100644 commit-graph.c
 create mode 100644 commit-graph.h

diff --git a/Makefile b/Makefile
index fc40b81..eeaeb6a 100644
--- a/Makefile
+++ b/Makefile
@@ -761,6 +761,7 @@ LIB_OBJS += color.o
 LIB_OBJS += column.o
 LIB_OBJS += combine-diff.o
 LIB_OBJS += commit.o
+LIB_OBJS += commit-graph.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
 LIB_OBJS += config.o
diff --git a/commit-graph.c b/commit-graph.c
new file mode 100644
index 000..f9e39b0
--- /dev/null
+++ b/commit-graph.c
@@ -0,0 +1,370 @@
+#include "cache.h"
+#include "config.h"
+#include "git-compat-util.h"
+#include "pack.h"
+#include "packfile.h"
+#include "commit.h"
+#include "object.h"
+#include "revision.h"
+#include "sha1-lookup.h"
+#include "commit-graph.h"
+
+#define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
+#define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
+#define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */
+#define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */
+#define GRAPH_CHUNKID_LARGEEDGES 0x45444745 /* "EDGE" */
+
+#define GRAPH_DATA_WIDTH 36
+
+#define GRAPH_VERSION_1 0x1
+#define GRAPH_VERSION GRAPH_VERSION_1
+
+#define GRAPH_OID_VERSION_SHA1 1
+#define GRAPH_OID_LEN_SHA1 20
+#define GRAPH_OID_VERSION GRAPH_OID_VERSION_SHA1
+#define GRAPH_OID_LEN GRAPH_OID_LEN_SHA1
+
+#define GRAPH_LARGE_EDGES_NEEDED 0x8000
+#define GRAPH_PARENT_MISSING 0x7fff
+#define GRAPH_EDGE_LAST_MASK 0x7fff
+#define GRAPH_PARENT_NONE 0x7000
+
+#define GRAPH_LAST_EDGE 0x8000
+
+#define GRAPH_FANOUT_SIZE (4 * 256)
+#define GRAPH_CHUNKLOOKUP_WIDTH 12
+#define GRAPH_CHUNKLOOKUP_SIZE (5 * GRAPH_CHUNKLOOKUP_WIDTH)
+#define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
+   GRAPH_OID_LEN + 8)
+
+static void write_graph_chunk_fanout(struct sha1file *f,
+struct commit **commits,
+int nr_commits)
+{
+   uint32_t i, count = 0;
+   struct commit **list = commits;
+   struct commit **last = commits + nr_commits;
+
+   /*
+* Write the first-level table (the list is sorted,
+* but we use a 256-entry lookup to be able to avoid
+* having to do eight extra binary search iterations).
+*/
+   for (i = 0; i < 256; i++) {
+   while (list < last) {
+   if ((*list)->object.oid.hash[0] != i)
+   break;
+   count++;
+   list++;
+   }
+
+   sha1write_be32(f, count);
+   }
+}
+
+static void write_graph_chunk_oids(struct sha1file *f, int hash_len,
+  struct commit **commits, int nr_commits)
+{
+   struct commit **list, **last = commits + nr_commits;
+   for (list = commits; list < last; list++)
+   sha1write(f, (*list)->object.oid.hash, (int)hash_len);
+}
+
+static int commit_pos(struct commit **commits, int nr_commits,
+ const struct object_id *oid, uint32_t *pos)
+{
+   uint32_t first = 0, last = nr_commits;
+
+   while (first < last) {
+   uint32_t mid = first + (last - first) / 2;
+   struct object_id *current;
+   int cmp;
+
+   current = &(commits[mid]->object.oid);
+   cmp = oidcmp(oid, current);
+   if (!cmp) {
+   *pos = mid;
+   return 1;
+   }
+   if (cmp > 0) {
+   first = mid + 1;
+   continue;
+   }
+   last = mid;
+   }
+
+   *pos = first;
+   return 0;
+}
+
+static void write_graph_chunk_data(struct sha1file *f, int hash_len,
+  struct commit **commits, int nr_commits)
+{
+   struct commit **list = commits;
+   struct commit **last = commits + nr_commits;
+   uint32_t num_large_edges = 0;
+
+   while (list < last) {
+   struct commit_list *parent;
+   uint32_t int_id;
+   uint32_t packedDate[2];
+
+   parse_commit(*list);
+   sha1write(f, (*list)->tree->object.oid.hash, hash_len);
+
+   parent = (*list)->parents;
+
+   if (!parent)
+   int_id = GRAPH_PARENT_NONE;
+   else if (!commit_pos(commits, nr_commits,
+&(parent->item->object.oid), _id))
+   int_id = GRAPH_PARENT_MISSING;
+
+   sha1write_be32(f, int_id);
+
+   if (parent)
+   

[PATCH v4 05/13] commit-graph: implement 'git-commit-graph write'

2018-02-19 Thread Derrick Stolee
Teach git-commit-graph to write graph files. Create new test script to verify
this command succeeds without failure.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt |  40 +
 builtin/commit-graph.c |  43 +-
 t/t5318-commit-graph.sh| 119 +
 3 files changed, 201 insertions(+), 1 deletion(-)
 create mode 100755 t/t5318-commit-graph.sh

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index e1c3078..c3f222f 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -5,6 +5,46 @@ NAME
 
 git-commit-graph - Write and verify Git commit graphs (.graph files)
 
+
+SYNOPSIS
+
+[verse]
+'git commit-graph write'  [--object-dir ]
+
+
+DESCRIPTION
+---
+
+Manage the serialized commit graph file.
+
+
+OPTIONS
+---
+--object-dir::
+   Use given directory for the location of packfiles and graph files.
+   The graph files will be in /info and the packfiles are expected
+   to be in /pack.
+
+
+COMMANDS
+
+'write'::
+
+Write a commit graph file based on the commits found in packfiles.
+Includes all commits from the existing commit graph file. Outputs the
+resulting filename.
+
+
+EXAMPLES
+
+
+* Write a commit graph file for the packed commits in your local .git folder.
++
+
+$ git commit-graph write
+
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 98110bb..a51d87b 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -1,9 +1,18 @@
 #include "builtin.h"
 #include "config.h"
+#include "dir.h"
+#include "lockfile.h"
 #include "parse-options.h"
+#include "commit-graph.h"
 
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
+   N_("git commit-graph write [--object-dir ]"),
+   NULL
+};
+
+static const char * const builtin_commit_graph_write_usage[] = {
+   N_("git commit-graph write [--object-dir ]"),
NULL
 };
 
@@ -11,11 +20,38 @@ static struct opts_commit_graph {
const char *obj_dir;
 } opts;
 
+static int graph_write(int argc, const char **argv)
+{
+   char *graph_name;
+
+   static struct option builtin_commit_graph_write_options[] = {
+   { OPTION_STRING, 'o', "object-dir", _dir,
+   N_("dir"),
+   N_("The object directory to store the graph") },
+   OPT_END(),
+   };
+
+   argc = parse_options(argc, argv, NULL,
+builtin_commit_graph_write_options,
+builtin_commit_graph_write_usage, 0);
+
+   if (!opts.obj_dir)
+   opts.obj_dir = get_object_directory();
+
+   graph_name = write_commit_graph(opts.obj_dir);
+
+   if (graph_name) {
+   printf("%s\n", graph_name);
+   FREE_AND_NULL(graph_name);
+   }
+
+   return 0;
+}
 
 int cmd_commit_graph(int argc, const char **argv, const char *prefix)
 {
static struct option builtin_commit_graph_options[] = {
-   { OPTION_STRING, 'p', "object-dir", _dir,
+   { OPTION_STRING, 'o', "object-dir", _dir,
N_("dir"),
N_("The object directory to store the graph") },
OPT_END(),
@@ -31,6 +67,11 @@ int cmd_commit_graph(int argc, const char **argv, const char 
*prefix)
 builtin_commit_graph_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
 
+   if (argc > 0) {
+   if (!strcmp(argv[0], "write"))
+   return graph_write(argc, argv);
+   }
+
usage_with_options(builtin_commit_graph_usage,
   builtin_commit_graph_options);
 }
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
new file mode 100755
index 000..6a5e93c
--- /dev/null
+++ b/t/t5318-commit-graph.sh
@@ -0,0 +1,119 @@
+#!/bin/sh
+
+test_description='commit graph'
+. ./test-lib.sh
+
+test_expect_success 'setup full repo' '
+   rm -rf .git &&
+   mkdir full &&
+   cd full &&
+   git init &&
+   objdir=".git/objects"
+'
+
+test_expect_success 'write graph with no packs' '
+   git commit-graph write --object-dir .
+'
+
+test_expect_success 'create commits and repack' '
+   for i in $(test_seq 3)
+   do
+   test_commit $i &&
+   git branch commits/$i
+   done &&
+   git repack
+'
+
+test_expect_success 'write graph' '
+   graph1=$(git commit-graph write) &&
+   test_path_is_file $objdir/info/$graph1
+'
+
+test_expect_success 'Add more commits' '
+   git reset --hard commits/1 &&
+   for i in $(test_seq 4 5)
+   do
+   test_commit $i &&
+   

[PATCH v4 02/13] graph: add commit graph design document

2018-02-19 Thread Derrick Stolee
Add Documentation/technical/commit-graph.txt with details of the planned
commit graph feature, including future plans.

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph.txt | 185 +++
 1 file changed, 185 insertions(+)
 create mode 100644 Documentation/technical/commit-graph.txt

diff --git a/Documentation/technical/commit-graph.txt 
b/Documentation/technical/commit-graph.txt
new file mode 100644
index 000..e52ab23
--- /dev/null
+++ b/Documentation/technical/commit-graph.txt
@@ -0,0 +1,185 @@
+Git Commit Graph Design Notes
+=
+
+Git walks the commit graph for many reasons, including:
+
+1. Listing and filtering commit history.
+2. Computing merge bases.
+
+These operations can become slow as the commit count grows. The merge
+base calculation shows up in many user-facing commands, such as 'merge-base'
+or 'status' and can take minutes to compute depending on history shape.
+
+There are two main costs here:
+
+1. Decompressing and parsing commits.
+2. Walking the entire graph to avoid topological order mistakes.
+
+The commit graph file is a supplemental data structure that accelerates
+commit graph walks. If a user downgrades or disables the 'core.commitGraph'
+config setting, then the existing ODB is sufficient. The file is stored
+either in the .git/objects/info directory or in the info directory of an
+alternate.
+
+The commit graph file stores the commit graph structure along with some
+extra metadata to speed up graph walks. By listing commit OIDs in lexi-
+cographic order, we can identify an integer position for each commit and
+refer to the parents of a commit using those integer positions. We use
+binary search to find initial commits and then use the integer positions
+for fast lookups during the walk.
+
+A consumer may load the following info for a commit from the graph:
+
+1. The commit OID.
+2. The list of parents, along with their integer position.
+3. The commit date.
+4. The root tree OID.
+5. The generation number (see definition below).
+
+Values 1-4 satisfy the requirements of parse_commit_gently().
+
+Define the "generation number" of a commit recursively as follows:
+
+ * A commit with no parents (a root commit) has generation number one.
+
+ * A commit with at least one parent has generation number one more than
+   the largest generation number among its parents.
+
+Equivalently, the generation number of a commit A is one more than the
+length of a longest path from A to a root commit. The recursive definition
+is easier to use for computation and observing the following property:
+
+If A and B are commits with generation numbers N and M, respectively,
+and N <= M, then A cannot reach B. That is, we know without searching
+that B is not an ancestor of A because it is further from a root commit
+than A.
+
+Conversely, when checking if A is an ancestor of B, then we only need
+to walk commits until all commits on the walk boundary have generation
+number at most N. If we walk commits using a priority queue seeded by
+generation numbers, then we always expand the boundary commit with highest
+generation number and can easily detect the stopping condition.
+
+This property can be used to significantly reduce the time it takes to
+walk commits and determine topological relationships. Without generation
+numbers, the general heuristic is the following:
+
+If A and B are commits with commit time X and Y, respectively, and
+X < Y, then A _probably_ cannot reach B.
+
+This heuristic is currently used whenever the computation can make
+mistakes with topological orders (such as "git log" with default order),
+but is not used when the topological order is required (such as merge
+base calculations, "git log --graph").
+
+In practice, we expect some commits to be created recently and not stored
+in the commit graph. We can treat these commits as having "infinite"
+generation number and walk until reaching commits with known generation
+number.
+
+Design Details
+--
+
+- A graph file is stored in a file named 'graph-.graph' in the
+  .git/objects/info directory. This could be stored in the info directory
+  of an alternate.
+
+- The latest graph file name is stored in a 'graph-latest' file next to
+  the graph files. This allows atomic swaps of latest graph files without
+  race conditions with concurrent processes.
+
+- The core.commitGraph config setting must be on to consume graph files.
+
+- The file format includes parameters for the object ID hash function,
+  so a future change of hash algorithm does not require a change in format.
+
+Current Limitations
+---
+
+- Only one graph file is used at one time. This allows the integer position
+  to seek into the single graph file. It is possible to extend the model
+  for multiple graph files, but that is currently not part of the design.
+
+- .graph files are managed 

Re: [PATCH v2 2/9] t3701: indent here documents

2018-02-19 Thread Eric Sunshine
On Mon, Feb 19, 2018 at 6:29 AM, Phillip Wood  wrote:
> From: Phillip Wood 
>
> Indent here documents in line with the current style for tests.
>
> Signed-off-by: Phillip Wood 
> ---
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> @@ -22,14 +22,14 @@ test_expect_success 'status works (initial)' '
>  test_expect_success 'setup expected' '
> -cat >expected < -new file mode 100644
> -index 000..d95f3ad
>  /dev/null
> -+++ b/file
> -@@ -0,0 +1 @@
> -+content
> -EOF
> +   cat >expected <<-EOF

Minor: You could take the opportunity to update these to use -\EOF
(rather than -EOF) to document that no variable interpolation is
expected inside the 'here' document. Probably itself not worth a
re-roll.

> +   new file mode 100644
> +   index 000..d95f3ad
> +   --- /dev/null
> +   +++ b/file
> +   @@ -0,0 +1 @@
> +   +content
> +   EOF
>  '


[PATCH v4 08/13] commit-graph: implement --delete-expired

2018-02-19 Thread Derrick Stolee
Teach git-commit-graph to delete the .graph files that are siblings of a
newly-written graph file, except for the file referenced by 'graph-latest'
at the beginning of the process and the newly-written file. If we fail to
delete a graph file, only report a warning because another git process may
be using that file. In a multi-process environment, we expect the previoius
graph file to be used by a concurrent process, so we do not delete it to
avoid race conditions.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt | 11 +--
 builtin/commit-graph.c | 61 --
 commit-graph.c | 23 ++
 commit-graph.h |  1 +
 t/t5318-commit-graph.sh|  7 +++--
 5 files changed, 96 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index dc948c5..b9b4031 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -37,6 +37,11 @@ resulting filename.
 +
 With `--set-latest` option, update the graph-latest file to point
 to the written graph file.
++
+With the `--delete-expired` option, delete the graph files in the pack
+directory that are not referred to by the graph-latest file. To avoid race
+conditions, do not delete the file previously referred to by the
+graph-latest file if it is updated by the `--set-latest` option.
 
 'read'::
 
@@ -56,11 +61,11 @@ EXAMPLES
 $ git commit-graph write
 
 
-* Write a graph file for the packed commits in your local .git folder
-* and update graph-latest.
+* Write a graph file for the packed commits in your local .git folder,
+* update graph-latest, and delete stale graph files.
 +
 
-$ git commit-graph write --set-latest
+$ git commit-graph write --set-latest --delete-expired
 
 
 * Read basic information from a graph file.
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index bf86172..fd99169 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -8,7 +8,7 @@
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph read [--object-dir ] [--file=]"),
-   N_("git commit-graph write [--object-dir ] [--set-latest]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired]"),
NULL
 };
 
@@ -18,7 +18,7 @@ static const char * const builtin_commit_graph_read_usage[] = 
{
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--object-dir ] [--set-latest]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired]"),
NULL
 };
 
@@ -26,6 +26,7 @@ static struct opts_commit_graph {
const char *obj_dir;
const char *graph_file;
int set_latest;
+   int delete_expired;
 } opts;
 
 static int graph_read(int argc, const char **argv)
@@ -98,9 +99,56 @@ static void set_latest_file(const char *obj_dir, const char 
*graph_file)
commit_lock_file();
 }
 
+/*
+ * To avoid race conditions and deleting graph files that are being
+ * used by other processes, look inside a pack directory for all files
+ * of the form "graph-.graph" that do not match the old or new
+ * graph hashes and delete them.
+ */
+static void do_delete_expired(const char *obj_dir,
+ const char *old_graph_name,
+ const char *new_graph_name)
+{
+   DIR *dir;
+   struct dirent *de;
+   int dirnamelen;
+   struct strbuf path = STRBUF_INIT;
+
+   strbuf_addf(, "%s/info", obj_dir);
+   dir = opendir(path.buf);
+   if (!dir) {
+   if (errno != ENOENT)
+   error_errno("unable to open object pack directory: %s",
+   obj_dir);
+   return;
+   }
+
+   strbuf_addch(, '/');
+   dirnamelen = path.len;
+   while ((de = readdir(dir)) != NULL) {
+   size_t base_len;
+
+   if (is_dot_or_dotdot(de->d_name))
+   continue;
+
+   strbuf_setlen(, dirnamelen);
+   strbuf_addstr(, de->d_name);
+
+   base_len = path.len;
+   if (strip_suffix_mem(path.buf, _len, ".graph") &&
+   strcmp(new_graph_name, de->d_name) &&
+   (!old_graph_name || strcmp(old_graph_name, de->d_name)) &&
+   remove_path(path.buf))
+   die("failed to remove path %s", path.buf);
+   }
+
+   strbuf_release();
+}
+
 static int graph_write(int argc, const char **argv)
 {
char *graph_name;
+   char *old_graph_name;
 
static struct option builtin_commit_graph_write_options[] = {
 

[PATCH v4 06/13] commit-graph: implement git commit-graph read

2018-02-19 Thread Derrick Stolee
Teach git-commit-graph to read commit graph files and summarize their contents.

Use the read subcommand to verify the contents of a commit graph file in the
tests.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt |  15 +
 builtin/commit-graph.c |  63 
 commit-graph.c | 116 +
 commit-graph.h |  21 +++
 t/t5318-commit-graph.sh|  38 ++--
 5 files changed, 249 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index c3f222f..6d26e56 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -9,6 +9,7 @@ git-commit-graph - Write and verify Git commit graphs (.graph 
files)
 SYNOPSIS
 
 [verse]
+'git commit-graph read'  [--object-dir ]
 'git commit-graph write'  [--object-dir ]
 
 
@@ -34,6 +35,14 @@ Write a commit graph file based on the commits found in 
packfiles.
 Includes all commits from the existing commit graph file. Outputs the
 resulting filename.
 
+'read'::
+
+Read a graph file given by the graph-head file and output basic
+details about the graph file.
++
+With `--file=` option, consider the graph stored in the file at
+the path  /info/.
+
 
 EXAMPLES
 
@@ -44,6 +53,12 @@ EXAMPLES
 $ git commit-graph write
 
 
+* Read basic information from a graph file.
++
+
+$ git commit-graph read --file=
+
+
 
 GIT
 ---
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index a51d87b..28cd097 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -7,10 +7,16 @@
 
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
+   N_("git commit-graph read [--object-dir ] [--file=]"),
N_("git commit-graph write [--object-dir ]"),
NULL
 };
 
+static const char * const builtin_commit_graph_read_usage[] = {
+   N_("git commit-graph read [--object-dir ] [--file=]"),
+   NULL
+};
+
 static const char * const builtin_commit_graph_write_usage[] = {
N_("git commit-graph write [--object-dir ]"),
NULL
@@ -18,8 +24,63 @@ static const char * const builtin_commit_graph_write_usage[] 
= {
 
 static struct opts_commit_graph {
const char *obj_dir;
+   const char *graph_file;
 } opts;
 
+static int graph_read(int argc, const char **argv)
+{
+   struct commit_graph *graph = 0;
+   struct strbuf full_path = STRBUF_INIT;
+
+   static struct option builtin_commit_graph_read_options[] = {
+   { OPTION_STRING, 'o', "object-dir", _dir,
+   N_("dir"),
+   N_("The object directory to store the graph") },
+   { OPTION_STRING, 'H', "file", _file,
+   N_("file"),
+   N_("The filename for a specific commit graph file in 
the object directory."),
+   PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
+   OPT_END(),
+   };
+
+   argc = parse_options(argc, argv, NULL,
+builtin_commit_graph_read_options,
+builtin_commit_graph_read_usage, 0);
+
+   if (!opts.obj_dir)
+   opts.obj_dir = get_object_directory();
+
+   if (!opts.graph_file)
+   die("no graph hash specified");
+
+   strbuf_addf(_path, "%s/info/%s", opts.obj_dir, opts.graph_file);
+   graph = load_commit_graph_one(full_path.buf);
+
+   if (!graph)
+   die("graph file %s does not exist", full_path.buf);
+
+   printf("header: %08x %d %d %d %d\n",
+   ntohl(*(uint32_t*)graph->data),
+   *(unsigned char*)(graph->data + 4),
+   *(unsigned char*)(graph->data + 5),
+   *(unsigned char*)(graph->data + 6),
+   *(unsigned char*)(graph->data + 7));
+   printf("num_commits: %u\n", graph->num_commits);
+   printf("chunks:");
+
+   if (graph->chunk_oid_fanout)
+   printf(" oid_fanout");
+   if (graph->chunk_oid_lookup)
+   printf(" oid_lookup");
+   if (graph->chunk_commit_data)
+   printf(" commit_metadata");
+   if (graph->chunk_large_edges)
+   printf(" large_edges");
+   printf("\n");
+
+   return 0;
+}
+
 static int graph_write(int argc, const char **argv)
 {
char *graph_name;
@@ -68,6 +129,8 @@ int cmd_commit_graph(int argc, const char **argv, const char 
*prefix)
 PARSE_OPT_STOP_AT_NON_OPTION);
 
if (argc > 0) {
+   if (!strcmp(argv[0], "read"))
+   return graph_read(argc, argv);
if (!strcmp(argv[0], "write"))
return 

[PATCH v4 00/13] Serialized Git Commit Graph

2018-02-19 Thread Derrick Stolee
Thanks for all of the feedback. I've learned a lot working on this patch.

As discussed [0], this version changes several fundamental structures
and operations, including:

* Graph files are stored in .git/objects/info

* The "graph-head" file is now called "graph-latest" to avoid confusion
  with HEAD. We use "--set-latest" argument instead of "--update-head".

* The graph format no longer stores the hash width, but expects the hash
  version to imply a length. This frees up one byte for future use while
  preserving 4-byte alignment.

* The graph format is more explicit about optional chunks and no longer
  dies when seeing an unknown chunk. The large edges chunk is now optional.

* There is no longer a "clear" subcommand to git-commit-graph.

* Fixed the bug related to "--stdin-commits" and check that the command
  only includes commits reachable from the input OIDs.

* The struct packed_oid_list type is now an array of struct object_id
  instead of an array of pointers. In my testing, I saw no performance
  difference between these two options, so I switched to the simpler
  pattern.

* My patch is based on jt/binsearch-with-fanout (b4e00f730), with the
  newer method prototype since v3.

Thanks,
-Stolee

[0] 
https://public-inbox.org/git/1517348383-112294-1-git-send-email-dsto...@microsoft.com/T/#m22bfdb7cf7b3d6e5f380b8bf0eec957e2cfd2dd7

-- >8 --

As promised [1], this patch contains a way to serialize the commit graph.
The current implementation defines a new file format to store the graph
structure (parent relationships) and basic commit metadata (commit date,
root tree OID) in order to prevent parsing raw commits while performing
basic graph walks. For example, we do not need to parse the full commit
when performing these walks:

* 'git log --topo-order -1000' walks all reachable commits to avoid
  incorrect topological orders, but only needs the commit message for
  the top 1000 commits.

* 'git merge-base  ' may walk many commits to find the correct
  boundary between the commits reachable from A and those reachable
  from B. No commit messages are needed.

* 'git branch -vv' checks ahead/behind status for all local branches
  compared to their upstream remote branches. This is essentially as
  hard as computing merge bases for each.

The current patch speeds up these calculations by injecting a check in
parse_commit_gently() to check if there is a graph file and using that
to provide the required metadata to the struct commit.

The file format has room to store generation numbers, which will be
provided as a patch after this framework is merged. Generation numbers
are referenced by the design document but not implemented in order to
make the current patch focus on the graph construction process. Once
that is stable, it will be easier to add generation numbers and make
graph walks aware of generation numbers one-by-one.

Here are some performance results for a copy of the Linux repository
where 'master' has 704,766 reachable commits and is behind 'origin/master'
by 19,610 commits.

| Command  | Before | After  | Rel % |
|--|||---|
| log --oneline --topo-order -1000 |  5.9s  |  0.7s  | -88%  |
| branch -vv   |  0.42s |  0.27s | -35%  |
| rev-list --all   |  6.4s  |  1.0s  | -84%  |
| rev-list --all --objects | 32.6s  | 27.6s  | -15%  |

To test this yourself, run the following on your repo:

  git config core.commitGraph true
  git show-ref -s | git commit-graph write --set-latest --stdin-commits

The second command writes a commit graph file containing every commit
reachable from your refs. Now, all git commands that walk commits will
check your graph first before consulting the ODB. You can run your own
performance comparisions by toggling the 'core.commitgraph' setting.

[1] 
https://public-inbox.org/git/d154319e-bb9e-b300-7c37-27b1dcd2a...@jeffhostetler.com/
Re: What's cooking in git.git (Jan 2018, #03; Tue, 23)

[2] https://github.com/derrickstolee/git/pull/2
A GitHub pull request containing the latest version of this patch.

Derrick Stolee (13):
  commit-graph: add format document
  graph: add commit graph design document
  commit-graph: create git-commit-graph builtin
  commit-graph: implement write_commit_graph()
  commit-graph: implement 'git-commit-graph write'
  commit-graph: implement git commit-graph read
  commit-graph: implement --set-latest
  commit-graph: implement --delete-expired
  commit-graph: add core.commitGraph setting
  commit-graph: close under reachability
  commit: integrate commit graph with commit parsing
  commit-graph: read only from specific pack-indexes
  commit-graph: build graph from starting commits

 .gitignore  |   1 +
 Documentation/config.txt|   3 +
 Documentation/git-commit-graph.txt  | 105 
 Documentation/technical/commit-graph-format.txt |  90 +++
 

[PATCH v4 03/13] commit-graph: create git-commit-graph builtin

2018-02-19 Thread Derrick Stolee
Teach git the 'commit-graph' builtin that will be used for writing and
reading packed graph files. The current implementation is mostly
empty, except for an '--object-dir' option.

Signed-off-by: Derrick Stolee 
---
 .gitignore |  1 +
 Documentation/git-commit-graph.txt | 11 +++
 Makefile   |  1 +
 builtin.h  |  1 +
 builtin/commit-graph.c | 37 +
 command-list.txt   |  1 +
 git.c  |  1 +
 7 files changed, 53 insertions(+)
 create mode 100644 Documentation/git-commit-graph.txt
 create mode 100644 builtin/commit-graph.c

diff --git a/.gitignore b/.gitignore
index 833ef3b..e82f901 100644
--- a/.gitignore
+++ b/.gitignore
@@ -34,6 +34,7 @@
 /git-clone
 /git-column
 /git-commit
+/git-commit-graph
 /git-commit-tree
 /git-config
 /git-count-objects
diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
new file mode 100644
index 000..e1c3078
--- /dev/null
+++ b/Documentation/git-commit-graph.txt
@@ -0,0 +1,11 @@
+git-commit-graph(1)
+===
+
+NAME
+
+git-commit-graph - Write and verify Git commit graphs (.graph files)
+
+GIT
+---
+Part of the linkgit:git[1] suite
+
diff --git a/Makefile b/Makefile
index ee9d5eb..fc40b81 100644
--- a/Makefile
+++ b/Makefile
@@ -932,6 +932,7 @@ BUILTIN_OBJS += builtin/clone.o
 BUILTIN_OBJS += builtin/column.o
 BUILTIN_OBJS += builtin/commit-tree.o
 BUILTIN_OBJS += builtin/commit.o
+BUILTIN_OBJS += builtin/commit-graph.o
 BUILTIN_OBJS += builtin/config.o
 BUILTIN_OBJS += builtin/count-objects.o
 BUILTIN_OBJS += builtin/credential.o
diff --git a/builtin.h b/builtin.h
index 42378f3..079855b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -149,6 +149,7 @@ extern int cmd_clone(int argc, const char **argv, const 
char *prefix);
 extern int cmd_clean(int argc, const char **argv, const char *prefix);
 extern int cmd_column(int argc, const char **argv, const char *prefix);
 extern int cmd_commit(int argc, const char **argv, const char *prefix);
+extern int cmd_commit_graph(int argc, const char **argv, const char *prefix);
 extern int cmd_commit_tree(int argc, const char **argv, const char *prefix);
 extern int cmd_config(int argc, const char **argv, const char *prefix);
 extern int cmd_count_objects(int argc, const char **argv, const char *prefix);
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
new file mode 100644
index 000..98110bb
--- /dev/null
+++ b/builtin/commit-graph.c
@@ -0,0 +1,37 @@
+#include "builtin.h"
+#include "config.h"
+#include "parse-options.h"
+
+static char const * const builtin_commit_graph_usage[] = {
+   N_("git commit-graph [--object-dir ]"),
+   NULL
+};
+
+static struct opts_commit_graph {
+   const char *obj_dir;
+} opts;
+
+
+int cmd_commit_graph(int argc, const char **argv, const char *prefix)
+{
+   static struct option builtin_commit_graph_options[] = {
+   { OPTION_STRING, 'p', "object-dir", _dir,
+   N_("dir"),
+   N_("The object directory to store the graph") },
+   OPT_END(),
+   };
+
+   if (argc == 2 && !strcmp(argv[1], "-h"))
+   usage_with_options(builtin_commit_graph_usage,
+  builtin_commit_graph_options);
+
+   git_config(git_default_config, NULL);
+   argc = parse_options(argc, argv, prefix,
+builtin_commit_graph_options,
+builtin_commit_graph_usage,
+PARSE_OPT_STOP_AT_NON_OPTION);
+
+   usage_with_options(builtin_commit_graph_usage,
+  builtin_commit_graph_options);
+}
+
diff --git a/command-list.txt b/command-list.txt
index a1fad28..835c589 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -34,6 +34,7 @@ git-clean   mainporcelain
 git-clone   mainporcelain   init
 git-column  purehelpers
 git-commit  mainporcelain   history
+git-commit-graphplumbingmanipulators
 git-commit-tree plumbingmanipulators
 git-config  ancillarymanipulators
 git-count-objects   ancillaryinterrogators
diff --git a/git.c b/git.c
index 9e96dd4..d4832c1 100644
--- a/git.c
+++ b/git.c
@@ -388,6 +388,7 @@ static struct cmd_struct commands[] = {
{ "clone", cmd_clone },
{ "column", cmd_column, RUN_SETUP_GENTLY },
{ "commit", cmd_commit, RUN_SETUP | NEED_WORK_TREE },
+   { "commit-graph", cmd_commit_graph, RUN_SETUP },
{ "commit-tree", cmd_commit_tree, RUN_SETUP },
{ "config", cmd_config, RUN_SETUP_GENTLY },
{ "count-objects", cmd_count_objects, RUN_SETUP },
-- 
2.7.4



[PATCH v4 09/13] commit-graph: add core.commitGraph setting

2018-02-19 Thread Derrick Stolee
The commit graph feature is controlled by the new core.commitGraph config
setting. This defaults to 0, so the feature is opt-in.

The intention of core.commitGraph is that a user can always stop checking
for or parsing commit graph files if core.commitGraph=0.

Signed-off-by: Derrick Stolee 
---
 Documentation/config.txt | 3 +++
 cache.h  | 1 +
 config.c | 5 +
 environment.c| 1 +
 4 files changed, 10 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9593bfa..e90d0d1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -883,6 +883,9 @@ core.notesRef::
 This setting defaults to "refs/notes/commits", and it can be overridden by
 the `GIT_NOTES_REF` environment variable.  See linkgit:git-notes[1].
 
+core.commitGraph::
+   Enable git commit graph feature. Allows reading from .graph files.
+
 core.sparseCheckout::
Enable "sparse checkout" feature. See section "Sparse checkout" in
linkgit:git-read-tree[1] for more information.
diff --git a/cache.h b/cache.h
index 6440e2b..1063873 100644
--- a/cache.h
+++ b/cache.h
@@ -771,6 +771,7 @@ extern char *git_replace_ref_base;
 
 extern int fsync_object_files;
 extern int core_preload_index;
+extern int core_commit_graph;
 extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
diff --git a/config.c b/config.c
index 41862d4..614cf59 100644
--- a/config.c
+++ b/config.c
@@ -1213,6 +1213,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.commitgraph")) {
+   core_commit_graph = git_config_bool(var, value);
+   return 0;
+   }
+
if (!strcmp(var, "core.sparsecheckout")) {
core_apply_sparse_checkout = git_config_bool(var, value);
return 0;
diff --git a/environment.c b/environment.c
index 8289c25..81fed83 100644
--- a/environment.c
+++ b/environment.c
@@ -60,6 +60,7 @@ enum push_default_type push_default = 
PUSH_DEFAULT_UNSPECIFIED;
 enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE;
 char *notes_ref_name;
 int grafts_replace_parents = 1;
+int core_commit_graph;
 int core_apply_sparse_checkout;
 int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
-- 
2.7.4



[PATCH v4 10/13] commit-graph: close under reachability

2018-02-19 Thread Derrick Stolee
Teach write_commit_graph() to walk all parents from the commits
discovered in packfiles. This prevents gaps given by loose objects or
previously-missed packfiles.

Also automatically add commits from the existing graph file, if it
exists.

Signed-off-by: Derrick Stolee 
---
 commit-graph.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index c8fb38f..00bd73a 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -385,6 +385,28 @@ static int if_packed_commit_add_to_list(const struct 
object_id *oid,
return 0;
 }
 
+static void close_reachable(struct packed_oid_list *oids)
+{
+   int i;
+   struct rev_info revs;
+   struct commit *commit;
+   init_revisions(, NULL);
+   for (i = 0; i < oids->nr; i++) {
+   commit = lookup_commit(>list[i]);
+   if (commit && !parse_commit(commit))
+   revs.commits = commit_list_insert(commit, 
);
+   }
+
+   if (prepare_revision_walk())
+   die(_("revision walk setup failed"));
+
+   while ((commit = get_revision()) != NULL) {
+   ALLOC_GROW(oids->list, oids->nr + 1, oids->alloc);
+   oidcpy(>list[oids->nr], &(commit->object.oid));
+   (oids->nr)++;
+   }
+}
+
 char *write_commit_graph(const char *obj_dir)
 {
struct packed_oid_list oids;
@@ -411,6 +433,7 @@ char *write_commit_graph(const char *obj_dir)
ALLOC_ARRAY(oids.list, oids.alloc);
 
for_each_packed_object(if_packed_commit_add_to_list, , 0);
+   close_reachable();
 
QSORT(oids.list, oids.nr, commit_compare);
 
-- 
2.7.4



[PATCH v4 12/13] commit-graph: read only from specific pack-indexes

2018-02-19 Thread Derrick Stolee
Teach git-commit-graph to inspect the objects only in a certain list
of pack-indexes within the given pack directory. This allows updating
the commit graph iteratively, since we add all commits stored in a
previous commit graph.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt | 11 +++
 builtin/commit-graph.c | 32 +---
 commit-graph.c | 26 --
 commit-graph.h |  4 +++-
 packfile.c |  4 ++--
 packfile.h |  2 ++
 t/t5318-commit-graph.sh| 16 
 7 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index b9b4031..93d50d1 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -42,6 +42,10 @@ With the `--delete-expired` option, delete the graph files 
in the pack
 directory that are not referred to by the graph-latest file. To avoid race
 conditions, do not delete the file previously referred to by the
 graph-latest file if it is updated by the `--set-latest` option.
++
+With the `--stdin-packs` option, generate the new commit graph by
+walking objects only in the specified packfiles and any commits in
+the existing graph-head.
 
 'read'::
 
@@ -68,6 +72,13 @@ $ git commit-graph write
 $ git commit-graph write --set-latest --delete-expired
 
 
+* Write a graph file, extending the current graph file using commits
+* in , update graph-latest, and delete stale graph files.
++
+
+$ echo  | git commit-graph write --set-latest --delete-expired 
--stdin-packs
+
+
 * Read basic information from a graph file.
 +
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index fd99169..5f08c40 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -8,7 +8,7 @@
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph read [--object-dir ] [--file=]"),
-   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs]"),
NULL
 };
 
@@ -18,7 +18,7 @@ static const char * const builtin_commit_graph_read_usage[] = 
{
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs]"),
NULL
 };
 
@@ -27,6 +27,7 @@ static struct opts_commit_graph {
const char *graph_file;
int set_latest;
int delete_expired;
+   int stdin_packs;
 } opts;
 
 static int graph_read(int argc, const char **argv)
@@ -149,6 +150,11 @@ static int graph_write(int argc, const char **argv)
 {
char *graph_name;
char *old_graph_name;
+   const char **pack_indexes = NULL;
+   int nr_packs = 0;
+   const char **lines = NULL;
+   int nr_lines = 0;
+   int alloc_lines = 0;
 
static struct option builtin_commit_graph_write_options[] = {
{ OPTION_STRING, 'o', "object-dir", _dir,
@@ -158,6 +164,8 @@ static int graph_write(int argc, const char **argv)
N_("update graph-head to written graph file")),
OPT_BOOL('d', "delete-expired", _expired,
N_("delete expired head graph file")),
+   OPT_BOOL('s', "stdin-packs", _packs,
+   N_("only scan packfiles listed by stdin")),
OPT_END(),
};
 
@@ -170,7 +178,25 @@ static int graph_write(int argc, const char **argv)
 
old_graph_name = get_graph_latest_contents(opts.obj_dir);
 
-   graph_name = write_commit_graph(opts.obj_dir);
+   if (opts.stdin_packs) {
+   struct strbuf buf = STRBUF_INIT;
+   nr_lines = 0;
+   alloc_lines = 128;
+   ALLOC_ARRAY(lines, alloc_lines);
+
+   while (strbuf_getline(, stdin) != EOF) {
+   ALLOC_GROW(lines, nr_lines + 1, alloc_lines);
+   lines[nr_lines++] = buf.buf;
+   strbuf_detach(, NULL);
+   }
+
+   pack_indexes = lines;
+   nr_packs = nr_lines;
+   }
+
+   graph_name = write_commit_graph(opts.obj_dir,
+   pack_indexes,
+   nr_packs);
 
if (graph_name) {
if (opts.set_latest)
diff --git a/commit-graph.c b/commit-graph.c
index ea07b47..943192c 100644
--- 

[PATCH v4 11/13] commit: integrate commit graph with commit parsing

2018-02-19 Thread Derrick Stolee
Teach Git to inspect a commit graph file to supply the contents of a
struct commit when calling parse_commit_gently(). This implementation
satisfies all post-conditions on the struct commit, including loading
parents, the root tree, and the commit date. The only loosely-expected
condition is that the commit buffer is loaded into the cache. This
was checked in log-tree.c:show_log(), but the "return;" on failure
produced unexpected results (i.e. the message line was never terminated).
The new behavior of loading the buffer when needed prevents the
unexpected behavior.

If core.commitGraph is false, then do not check graph files.

In test script t5318-commit-graph.sh, add output-matching conditions on
read-only graph operations.

By loading commits from the graph instead of parsing commit buffers, we
save a lot of time on long commit walks. Here are some performance
results for a copy of the Linux repository where 'master' has 704,766
reachable commits and is behind 'origin/master' by 19,610 commits.

| Command  | Before | After  | Rel % |
|--|||---|
| log --oneline --topo-order -1000 |  5.9s  |  0.7s  | -88%  |
| branch -vv   |  0.42s |  0.27s | -35%  |
| rev-list --all   |  6.4s  |  1.0s  | -84%  |
| rev-list --all --objects | 32.6s  | 27.6s  | -15%  |

Signed-off-by: Derrick Stolee 
---
 alloc.c |   1 +
 commit-graph.c  | 148 
 commit-graph.h  |  18 +-
 commit.c|   3 +
 commit.h|   3 +
 log-tree.c  |   3 +-
 t/t5318-commit-graph.sh |  45 ++-
 7 files changed, 216 insertions(+), 5 deletions(-)

diff --git a/alloc.c b/alloc.c
index 12afadf..cf4f8b6 100644
--- a/alloc.c
+++ b/alloc.c
@@ -93,6 +93,7 @@ void *alloc_commit_node(void)
struct commit *c = alloc_node(_state, sizeof(struct commit));
c->object.type = OBJ_COMMIT;
c->index = alloc_commit_index();
+   c->graph_pos = COMMIT_NOT_FROM_GRAPH;
return c;
 }
 
diff --git a/commit-graph.c b/commit-graph.c
index 00bd73a..ea07b47 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -38,6 +38,9 @@
 #define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
GRAPH_OID_LEN + 8)
 
+/* global storage */
+struct commit_graph *commit_graph = NULL;
+
 char *get_graph_latest_filename(const char *obj_dir)
 {
struct strbuf fname = STRBUF_INIT;
@@ -184,6 +187,150 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
return graph;
 }
 
+static void prepare_commit_graph_one(const char *obj_dir)
+{
+   struct strbuf graph_file = STRBUF_INIT;
+   char *graph_name;
+
+   if (commit_graph)
+   return;
+
+   graph_name = get_graph_latest_contents(obj_dir);
+
+   if (!graph_name)
+   return;
+
+   strbuf_addf(_file, "%s/info/%s", obj_dir, graph_name);
+
+   commit_graph = load_commit_graph_one(graph_file.buf);
+
+   FREE_AND_NULL(graph_name);
+   strbuf_release(_file);
+}
+
+static int prepare_commit_graph_run_once = 0;
+void prepare_commit_graph(void)
+{
+   struct alternate_object_database *alt;
+   char *obj_dir;
+
+   if (prepare_commit_graph_run_once)
+   return;
+   prepare_commit_graph_run_once = 1;
+
+   obj_dir = get_object_directory();
+   prepare_commit_graph_one(obj_dir);
+   prepare_alt_odb();
+   for (alt = alt_odb_list; !commit_graph && alt; alt = alt->next)
+   prepare_commit_graph_one(alt->path);
+}
+
+static void close_commit_graph(void)
+{
+   if (!commit_graph)
+   return;
+
+   if (commit_graph->graph_fd >= 0) {
+   munmap((void *)commit_graph->data, commit_graph->data_len);
+   commit_graph->data = NULL;
+   close(commit_graph->graph_fd);
+   }
+
+   FREE_AND_NULL(commit_graph);
+}
+
+static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
uint32_t *pos)
+{
+   return bsearch_hash(oid->hash, g->chunk_oid_fanout,
+   g->chunk_oid_lookup, g->hash_len, pos);
+}
+
+static struct commit_list **insert_parent_or_die(struct commit_graph *g,
+uint64_t pos,
+struct commit_list **pptr)
+{
+   struct commit *c;
+   struct object_id oid;
+   hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos);
+   c = lookup_commit();
+   if (!c)
+   die("could not find commit %s", oid_to_hex());
+   c->graph_pos = pos;
+   return _list_insert(c, pptr)->next;
+}
+
+static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, 
uint32_t pos)
+{
+   struct object_id oid;
+   uint32_t new_parent_pos;
+   uint32_t 

[PATCH v4 13/13] commit-graph: build graph from starting commits

2018-02-19 Thread Derrick Stolee
Teach git-commit-graph to read commits from stdin when the
--stdin-commits flag is specified. Commits reachable from these
commits are added to the graph. This is a much faster way to construct
the graph than inspecting all packed objects, but is restricted to
known tips.

For the Linux repository, 700,000+ commits were added to the graph
file starting from 'master' in 7-9 seconds, depending on the number
of packfiles in the repo (1, 24, or 120). If a commit graph file
already exists (and core.commitGraph=true), then this operation takes
only 1.8 seconds due to less time spent parsing commits.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt | 15 ++-
 builtin/commit-graph.c | 27 +--
 commit-graph.c | 26 --
 commit-graph.h |  4 +++-
 t/t5318-commit-graph.sh| 19 +++
 5 files changed, 81 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 93d50d1..43ac74b 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -45,7 +45,12 @@ graph-latest file if it is updated by the `--set-latest` 
option.
 +
 With the `--stdin-packs` option, generate the new commit graph by
 walking objects only in the specified packfiles and any commits in
-the existing graph-head.
+the existing graph-head. (Cannot be combined with --stdin-commits.)
++
+With the `--stdin-commits` option, generate the new commit graph by
+walking commits starting at the commits specified in stdin as a list
+of OIDs in hex, one OID per line. (Cannot be combined with
+--stdin-packs.)
 
 'read'::
 
@@ -79,6 +84,14 @@ $ git commit-graph write --set-latest --delete-expired
 $ echo  | git commit-graph write --set-latest --delete-expired 
--stdin-packs
 
 
+* Write a graph file, extending the current graph file using all
+* commits reachable from refs/heads/*, update graph-latest, and delete
+* stale graph files.
++
+
+$ git show-ref -s | git commit-graph write --set-latest --delete-expired 
--stdin-commits
+
+
 * Read basic information from a graph file.
 +
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 5f08c40..9b92549 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -8,7 +8,7 @@
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph read [--object-dir ] [--file=]"),
-   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs|--stdin-commits]"),
NULL
 };
 
@@ -18,7 +18,7 @@ static const char * const builtin_commit_graph_read_usage[] = 
{
 };
 
 static const char * const builtin_commit_graph_write_usage[] = {
-   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs]"),
+   N_("git commit-graph write [--object-dir ] [--set-latest] 
[--delete-expired] [--stdin-packs|--stdin-commits]"),
NULL
 };
 
@@ -28,6 +28,7 @@ static struct opts_commit_graph {
int set_latest;
int delete_expired;
int stdin_packs;
+   int stdin_commits;
 } opts;
 
 static int graph_read(int argc, const char **argv)
@@ -152,6 +153,8 @@ static int graph_write(int argc, const char **argv)
char *old_graph_name;
const char **pack_indexes = NULL;
int nr_packs = 0;
+   const char **commit_hex = NULL;
+   int nr_commits = 0;
const char **lines = NULL;
int nr_lines = 0;
int alloc_lines = 0;
@@ -166,6 +169,8 @@ static int graph_write(int argc, const char **argv)
N_("delete expired head graph file")),
OPT_BOOL('s', "stdin-packs", _packs,
N_("only scan packfiles listed by stdin")),
+   OPT_BOOL('C', "stdin-commits", _commits,
+   N_("start walk at commits listed by stdin")),
OPT_END(),
};
 
@@ -173,12 +178,14 @@ static int graph_write(int argc, const char **argv)
 builtin_commit_graph_write_options,
 builtin_commit_graph_write_usage, 0);
 
+   if (opts.stdin_packs && opts.stdin_commits)
+   die(_("cannot use both --stdin-commits and --stdin-packs"));
if (!opts.obj_dir)
opts.obj_dir = get_object_directory();
 
old_graph_name = get_graph_latest_contents(opts.obj_dir);
 
-   if (opts.stdin_packs) {
+   if (opts.stdin_packs || opts.stdin_commits) {
struct strbuf buf = STRBUF_INIT;
  

[PATCH v4 01/13] commit-graph: add format document

2018-02-19 Thread Derrick Stolee
Add document specifying the binary format for commit graphs. This
format allows for:

* New versions.
* New hash functions and hash lengths.
* Optional extensions.

Basic header information is followed by a binary table of contents
into "chunks" that include:

* An ordered list of commit object IDs.
* A 256-entry fanout into that list of OIDs.
* A list of metadata for the commits.
* A list of "large edges" to enable octopus merges.

The format automatically includes two parent positions for every
commit. This favors speed over space, since using only one position
per commit would cause an extra level of indirection for every merge
commit. (Octopus merges suffer from this indirection, but they are
very rare.)

Signed-off-by: Derrick Stolee 
---
 Documentation/technical/commit-graph-format.txt | 90 +
 1 file changed, 90 insertions(+)
 create mode 100644 Documentation/technical/commit-graph-format.txt

diff --git a/Documentation/technical/commit-graph-format.txt 
b/Documentation/technical/commit-graph-format.txt
new file mode 100644
index 000..11b18b5
--- /dev/null
+++ b/Documentation/technical/commit-graph-format.txt
@@ -0,0 +1,90 @@
+Git commit graph format
+===
+
+The Git commit graph stores a list of commit OIDs and some associated
+metadata, including:
+
+- The generation number of the commit. Commits with no parents have
+  generation number 1; commits with parents have generation number
+  one more than the maximum generation number of its parents. We
+  reserve zero as special, and can be used to mark a generation
+  number invalid or as "not computed".
+
+- The root tree OID.
+
+- The commit date.
+
+- The parents of the commit, stored using positional references within
+  the graph file.
+
+== graph-*.graph files have the following format:
+
+In order to allow extensions that add extra data to the graph, we organize
+the body into "chunks" and provide a binary lookup table at the beginning
+of the body. The header includes certain values, such as number of chunks,
+hash lengths and types.
+
+All 4-byte numbers are in network order.
+
+HEADER:
+
+  4-byte signature:
+  The signature is: {'C', 'G', 'P', 'H'}
+
+  1-byte version number:
+  Currently, the only valid version is 1.
+
+  1-byte Object Id Version (1 = SHA-1)
+
+  1-byte number (C) of "chunks"
+
+  1-byte (reserved for later use)
+
+CHUNK LOOKUP:
+
+  (C + 1) * 12 bytes listing the table of contents for the chunks:
+  First 4 bytes describe chunk id. Value 0 is a terminating label.
+  Other 8 bytes provide offset in current file for chunk to start.
+  (Chunks are ordered contiguously in the file, so you can infer
+  the length using the next chunk position if necessary.)
+
+  The remaining data in the body is described one chunk at a time, and
+  these chunks may be given in any order. Chunks are required unless
+  otherwise specified.
+
+CHUNK DATA:
+
+  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+  The ith entry, F[i], stores the number of OIDs with first
+  byte at most i. Thus F[255] stores the total
+  number of commits (N).
+
+  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+  The OIDs for all commits in the graph, sorted in ascending order.
+
+  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
+* The first H bytes are for the OID of the root tree.
+* The next 8 bytes are for the int-ids of the first two parents
+  of the ith commit. Stores value 0x if no parent in that
+  position. If there are more than two parents, the second value
+  has its most-significant bit on and the other bits store an array
+  position into the Large Edge List chunk.
+* The next 8 bytes store the generation number of the commit and
+  the commit time in seconds since EPOCH. The generation number
+  uses the higher 30 bits of the first 4 bytes, while the commit
+  time uses the 32 bits of the second 4 bytes, along with the lowest
+  2 bits of the lowest byte, storing the 33rd and 34th bit of the
+  commit time.
+
+  Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
+  This list of 4-byte values store the second through nth parents for
+  all octopus merges. The second parent value in the commit data stores
+  an array position within this list along with the most-significant bit
+  on. Starting at that array position, iterate through this list of int-ids
+  for the parents until reaching a value with the most-significant bit on.
+  The other bits correspond to the int-id of the last parent.
+
+TRAILER:
+
+   H-byte HASH-checksum of all of the above.
+
-- 
2.7.4



Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-19 Thread Jeff King
On Wed, Feb 14, 2018 at 10:56:37PM +0100, Martin Ågren wrote:

> Here's what a list of known leaks might look like. It feels a bit
> awkward to post a known-incomplete list (I don't run all tests). Duy
> offered to pick up the ball if I gave up, maybe you could complete and
> post this as your own? :-? Even if I (or others) can't reproduce the
> complete list locally, regressions will be trivial to find, and newly
> leak-free tests fairly easy to notice.

I didn't think about that when I posted my scripts. In general, it's OK
to me if you miss a script when you generate the "leaky" list. But if
you skip it, you cannot say whether it is leaky or not, and should
probably neither add nor remove it from the known-leaky list. So I think
the second shell snippet needs to become a little more clever about
skipped test scripts.

Even that isn't 100% fool-proof, as some individual tests may be skipped
or not skipped on various platforms. But it may be enough in practice
(and eventually we'd have no known-leaky tests, of course ;) ).

Or alternatively, we could just not bother with checking this into the
repository, and it becomes a local thing for people interested in
leak-testing. What's the value in having a shared known-leaky list,
especially if we don't expect most people to run it.

I guess it lets us add a Travis job to do the leak-checking, which might
get more coverage. So maybe if we do have an in-repo known-leaky, it
should match some canonical Travis environment. That won't give us
complete coverage, but at this point we're just trying to notice
low-hanging fruit.

-Peff


Re: cherry-pick '-m' curiosity

2018-02-19 Thread G. Sylvie Davies
On Mon, Feb 5, 2018 at 3:46 AM, Sergey Organov  wrote:
> Hello,
>
> $ git help cherry-pick
>
> -m parent-number, --mainline parent-number
>Usually you cannot cherry-pick a merge because you do not
>know which side of the merge should be considered the
>mainline.
>
> Isn't it always the case that "mainline" is the first parent, as that's
> how "git merge" happens to work?
>

First-parent will be whatever commit you were sitting on when you
typed "git merge".

If you're sitting on your branch and you type "git fetch; git merge
origin/master", then "mainline" will be 2nd parent.

Same happens if you type "git pull".

Further reading here:
https://developer.atlassian.com/blog/2016/04/stop-foxtrots-now/

"git revert -m" also has the same problem.


> Is, say, "-m 2" ever useful?
>
> --
> Sergey


Git should preserve modification times at least on request

2018-02-19 Thread Peter Backes
Hello,

please ensure to CC me if you reply as I am not subscribed to the list.

https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preserving_modification_time_on_files.3F
 
argues that git isn't preserving modification times because it needs to 
ensure that build tools work properly.

I agree that modification times should not be restored by default, 
because of the principle of least astonishment. But should it be 
impossible? The principle of least astonishment does not mandate this; 
it is not a paternalistic principle.

Thus, I do not get at all
- why git doesn't *store* modification times, perhaps by default, but 
at least on request
- why git doesn't restore modification times *on request*

It is pretty annoying that git cannot, even if I know what I am doing, 
and explicitly want it to, preserve the modification time.

One use case: I have lots of file lying around in my build directory 
and for some of them, the modification time in important information to 
me. Those files are not at all used with the build tool. In contrast to 
git pull, git pull --rebase needs those to be stashed. But after the 
pull and unstash, the mtime is gone. Boo.

Please provide options to store and restore modification times. It 
shouldn't be hard to do, given that other metadata such as the mode is 
already stored. It would make live so much easier. And the fact that 
this has made into the FAQ clearly suggests that there are many others 
who think so.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de


Re: Git should preserve modification times at least on request

2018-02-19 Thread Johannes Schindelin
Hi Peter,

On Mon, 19 Feb 2018, Peter Backes wrote:

> please ensure to CC me if you reply as I am not subscribed to the list.
> 
> https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preserving_modification_time_on_files.3F
>  
> argues that git isn't preserving modification times because it needs to 
> ensure that build tools work properly.
> 
> I agree that modification times should not be restored by default, 
> because of the principle of least astonishment. But should it be 
> impossible? The principle of least astonishment does not mandate this; 
> it is not a paternalistic principle.
> 
> Thus, I do not get at all
> - why git doesn't *store* modification times, perhaps by default, but 
> at least on request
> - why git doesn't restore modification times *on request*
> 
> It is pretty annoying that git cannot, even if I know what I am doing, 
> and explicitly want it to, preserve the modification time.
> 
> One use case: I have lots of file lying around in my build directory 
> and for some of them, the modification time in important information to 
> me. Those files are not at all used with the build tool. In contrast to 
> git pull, git pull --rebase needs those to be stashed. But after the 
> pull and unstash, the mtime is gone. Boo.
> 
> Please provide options to store and restore modification times. It 
> shouldn't be hard to do, given that other metadata such as the mode is 
> already stored. It would make live so much easier. And the fact that 
> this has made into the FAQ clearly suggests that there are many others 
> who think so.

Since you already assessed that it shouldn't be hard to do, you probably
want to put your money where your mouth is and come up with a patch, and
then offer it up for discussion on this here mailing list.

Ciao,
Johannes


Re: Git should preserve modification times at least on request

2018-02-19 Thread Peter Backes
Hi Johannes,

On Mon, Feb 19, 2018 at 10:58:12PM +0100, Johannes Schindelin wrote:
> Since you already assessed that it shouldn't be hard to do, you probably
> want to put your money where your mouth is and come up with a patch, and
> then offer it up for discussion on this here mailing list.

Well, it would be good to discuss this a bit beforehand, since my time 
is wasted if there's no chance to get it accepted. Perhaps there is 
some counterargument I don't know about.

Is there some existing code that could be used? I think I read 
somewhere that git once did preserve mtimes, but that this code was 
removed because of the build tool issues. Perhaps that code could 
simply be put back in, and surrounded by conditions.

Best wishes
Peter

PS: Given the opportunity, I want to thank you very much for 
maintaining the git repository for my cvsclone tool.

-- 
Peter Backes, r...@helen.plasma.xg8.de


[GSoC][PATCH] tag: Make "git tag --contains " less chatty if is invalid

2018-02-19 Thread Paul-Sebastian Ungureanu
git tag --contains  prints the whole help text if  is
invalid. It should only show the error message instead.

This bug was a side effect of looking up the commit in option
parser callback. When a error occurs in the option parser, the
full usage is shown. To fix this bug, the part related to
looking up the commit was moved outside of the option parser
to the ref-filter module.

Basically, the option parser only parses strings that represent
commits and the ref-filter performs the commit look-up. If an
error occurs during the option parsing, then it must be an invalid
argument and the user should be informed of usage, but if a error
occurs during ref-filtering, then it is a problem with the
argument.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 builtin/tag.c   | 11 +--
 parse-options.h | 11 +++
 ref-filter.c| 19 +++
 ref-filter.h|  3 +++
 t/t7013-tag-contains.sh | 37 +
 5 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100755 t/t7013-tag-contains.sh

diff --git a/builtin/tag.c b/builtin/tag.c
index a7e6a5b0f..3fce14713 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -396,6 +396,12 @@ int cmd_tag(int argc, const char **argv, const char 
*prefix)
 
OPT_GROUP(N_("Tag listing options")),
OPT_COLUMN(0, "column", , N_("show tag list in 
columns")),
+
+   OPT_CONTAINS_STRS(_commit_strs, N_("print only tags 
that contain the commit")),
+   OPT_NO_CONTAINS_STRS(_commit_strs, N_("print only 
tags that don't contain the commit")),
+   OPT_WITH_STRS(_commit_strs, N_("print only tags 
that contain the commit")),
+   OPT_WITHOUT_STRS(_commit_strs, N_("print only tags 
that don't contain the commit")),
+
OPT_CONTAINS(_commit, N_("print only tags that 
contain the commit")),
OPT_NO_CONTAINS(_commit, N_("print only tags that 
don't contain the commit")),
OPT_WITH(_commit, N_("print only tags that contain 
the commit")),
@@ -437,6 +443,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
cmdmode = 'l';
else if (filter.with_commit || filter.no_commit ||
 filter.points_at.nr || filter.merge_commit ||
+filter.with_commit_strs.nr || filter.no_commit_strs.nr 
||
 filter.lines != -1)
cmdmode = 'l';
}
@@ -473,9 +480,9 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
}
if (filter.lines != -1)
die(_("-n option is only allowed in list mode"));
-   if (filter.with_commit)
+   if (filter.with_commit || filter.with_commit_strs.nr)
die(_("--contains option is only allowed in list mode"));
-   if (filter.no_commit)
+   if (filter.no_commit || filter.no_commit_strs.nr)
die(_("--no-contains option is only allowed in list mode"));
if (filter.points_at.nr)
die(_("--points-at option is only allowed in list mode"));
diff --git a/parse-options.h b/parse-options.h
index af711227a..3aa8ddd46 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -258,9 +258,20 @@ extern int parse_opt_passthru_argv(const struct option *, 
const char *, int);
  PARSE_OPT_LASTARG_DEFAULT | flag, \
  parse_opt_commits, (intptr_t) "HEAD" \
}
+#define _OPT_CONTAINS_OR_WITH_STRS(name, variable, help, flag) \
+   { OPTION_CALLBACK, 0, name, (variable), N_("commit"), (help), \
+ PARSE_OPT_LASTARG_DEFAULT | flag, \
+ parse_opt_string_list, (intptr_t) "HEAD" \
+   }
+
 #define OPT_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("contains", v, h, 
PARSE_OPT_NONEG)
 #define OPT_NO_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("no-contains", v, h, 
PARSE_OPT_NONEG)
 #define OPT_WITH(v, h) _OPT_CONTAINS_OR_WITH("with", v, h, PARSE_OPT_HIDDEN | 
PARSE_OPT_NONEG)
 #define OPT_WITHOUT(v, h) _OPT_CONTAINS_OR_WITH("without", v, h, 
PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
 
+#define OPT_CONTAINS_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("contains", v, h, 
PARSE_OPT_NONEG)
+#define OPT_NO_CONTAINS_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("no-contains", 
v, h, PARSE_OPT_NONEG)
+#define OPT_WITH_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("with", v, h, 
PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
+#define OPT_WITHOUT_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("without", v, h, 
PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
+
 #endif
diff --git a/ref-filter.c b/ref-filter.c
index f9e25aea7..e7bdaa27f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2000,6 +2000,21 @@ static void do_merge_filter(struct ref_filter_cbdata 
*ref_cbdata)
free(to_clear);
 }
 
+int add_str_to_commit_list(struct string_list_item *item, void *commit_list)
+{
+   struct object_id oid;
+   struct commit *commit;
+
+   if (get_oid(item->string, ))
+  

Re: Fetch-hooks

2018-02-19 Thread Jeff King
On Wed, Feb 14, 2018 at 03:02:00AM +0100, Leo Gaspard wrote:

> > So does anybody actually want to be able to adjust the refs as they pass
> > through? It really sounds like you just want to be able to reject or not
> > reject the fetch. And that rejecting would be the uncommon case, so it's
> > OK to just abort the whole operation and expect the user to figure it
> > out.
> 
> This completely fits my use case (modulo the fact that it's more similar
> to the `update` hook than to `pre-receive` I think, as verifying the
> signature requires the objects to already have been downloaded), indeed,
> though I'm not sure it would have fit Joey's (based on my understanding,
> adding a merge was what was originally asked for).
> 
> Actually, I'm wondering if the existing semantics of `update` could not
> be reused for the `pre-fetch`. Would it make sense to just call `update`
> during a fetch in the same way as during a receive-pack? That said it
> likely makes this a breaking change, though it's maybe unlikely that a
> repository is used both for fetching and for receive-pack'ing, it could
> happen.
> 
> So the proposal as I understand it would currently be adding a
> `fetch-update` hook that does exactly the same thing as `update` but for
> `fetch`. This solves my use case in a nice way, though it likely doesn't
> solve Joey's original one (which has been taken care of by a wrapper
> since then).
> 
> What do you all think about it?

I think it should be a separate hook; having an existing hook trigger in
new places is likely to cause confusion and regressions.

If you do go this route, please model it after "pre-receive" rather than
"update". We had "update" originally but found it was too limiting for
hooks to see only one ref at a time. So we introduced pre-receive. The
"update" hook remains for historical reasons, but I don't think we'd
want to reproduce the mistake. :)

-Peff


Is there any way to "interrupt" a rebase?

2018-02-19 Thread Hilco Wijbenga
Hi all,

When maintaining a long running branch, I regularly rebase onto our
active development branch so that my branch stays up-to-date. What
happens fairly often is that during such a rebase, Git will exit
because of rebase/merge conflicts. Nothing unexpected there, of
course, but as it sometimes turns out, the conflict should have been
fixed in an earlier commit. The only way that I know of to fix this,
is to abort the rebase and start over with "git rebase ...
--interactive" then "edit" every commit and go through them
one-by-one. This is often overkill, though. Is there a better way?
Perhaps I could "rewind" the rebase to an earlier commit and restart
from there?

So a scenario like this:

my-branch : X -> A -> B -> C -> D -> E -> F -> G
base-branch : X -> Y

git rebase --onto base-branch HEAD~7
commit A --> conflicts
... lots of work ...
commit B --> conflicts
... lots of work ...
commit C (Git handles conflicts)
commit D (no conflict)
commit E --> conflicts
... er, that should have been fixed in commit C

How do I keep all the work I did for commits A and B? I get the
impression that rerere does not help here because I did not finish the
rebase succesfully (and that makes perfect sense, of course). Is there
a way at this point in the rebase to "go back" to commit C (so without
"git rebase --abort")?

(Surely, it's not as simple as doing a "git reset --hard
sha-of-commit-C" is it?)

Cheers,
Hilco


[PATCH 1/2] t5545: factor out http repository setup

2018-02-19 Thread Jeff King
We repeat many lines of setup code in the two http tests,
and further tests would need to repeat it again.  Let's
factor this out into a function.

Incidentally, this also fixes an unlikely bug: if the httpd
root path contains a double-quote, our test_when_finished
would barf due to improper quoting (we escape the embedded
quotes, but not the $, meaning we expand the variable before
the eval).

Signed-off-by: Jeff King 
---
Arguably this setup could be done once and then reused by several tests,
which would be a bit more efficient.  But the whole script is written in
this "remake repos fresh" style, so I didn't look into switching it.

 t/t5545-push-options.sh | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 463783789c..c64dee2127 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -220,14 +220,20 @@ test_expect_success 'invalid push option in config' '
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-test_expect_success 'push option denied properly by http server' '
+# set up http repository for fetching/pushing, with push options config
+# bool set to $1
+mk_http_pair () {
test_when_finished "rm -rf test_http_clone" &&
-   test_when_finished "rm -rf \"$HTTPD_DOCUMENT_ROOT_PATH\"/upstream.git" 
&&
+   test_when_finished 'rm -rf "$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git' &&
mk_repo_pair &&
-   git -C upstream config receive.advertisePushOptions false &&
+   git -C upstream config receive.advertisePushOptions "$1" &&
git -C upstream config http.receivepack true &&
cp -R upstream/.git "$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git &&
-   git clone "$HTTPD_URL"/smart/upstream test_http_clone &&
+   git clone "$HTTPD_URL"/smart/upstream test_http_clone
+}
+
+test_expect_success 'push option denied properly by http server' '
+   mk_http_pair false &&
test_commit -C test_http_clone one &&
test_must_fail git -C test_http_clone push --push-option=asdf origin 
master 2>actual &&
test_i18ngrep "the receiving end does not support push options" actual 
&&
@@ -235,13 +241,7 @@ test_expect_success 'push option denied properly by http 
server' '
 '
 
 test_expect_success 'push options work properly across http' '
-   test_when_finished "rm -rf test_http_clone" &&
-   test_when_finished "rm -rf \"$HTTPD_DOCUMENT_ROOT_PATH\"/upstream.git" 
&&
-   mk_repo_pair &&
-   git -C upstream config receive.advertisePushOptions true &&
-   git -C upstream config http.receivepack true &&
-   cp -R upstream/.git "$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git &&
-   git clone "$HTTPD_URL"/smart/upstream test_http_clone &&
+   mk_http_pair true &&
 
test_commit -C test_http_clone one &&
git -C test_http_clone push origin master &&
-- 
2.16.2.552.gea2a3cf654



[PATCH 0/2] quoting bug sending push-options over http

2018-02-19 Thread Jeff King
This series fixes a small quoting problem in 511155db51 (remote-curl:
allow push options, 2017-03-22). The interesting one is the second
patch.

  [1/2]: t5545: factor out http repository setup
  [2/2]: remote-curl: unquote incoming push-options

 remote-curl.c   | 11 ++-
 t/t5545-push-options.sh | 40 +---
 2 files changed, 39 insertions(+), 12 deletions(-)

-Peff


[PATCH 2/2] remote-curl: unquote incoming push-options

2018-02-19 Thread Jeff King
The transport-helper protocol c-style quotes the value of
any options passed to the helper via the "option  "
directive. However, remote-curl doesn't actually unquote the
push-option values, meaning that we will send the quoted
version to the other side (whereas git-over-ssh would send
the raw value).

The pack-protocol.txt documentation defines the push-options
as a series of VCHARs, which excludes most characters that
would need quoting. But:

  1. You can still see the bug with a valid push-option that
 starts with a double-quote (since that triggers
 quoting).

  2. We do currently handle any non-NUL characters correctly
 in git-over-ssh. So even though the spec does not say
 that we need to handle most quoted characters, it's
 nice if our behavior is consistent between protocols.

There are two new tests: the "direct" one shows that this
already works in the non-http case, and the http one covers
this bugfix.

Reported-by: Jon Simons 
Signed-off-by: Jeff King 
---
 remote-curl.c   | 11 ++-
 t/t5545-push-options.sh | 18 ++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 6ec5352435..f5b3d22e26 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -13,6 +13,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "quote.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -145,7 +146,15 @@ static int set_option(const char *name, const char *value)
return -1;
return 0;
} else if (!strcmp(name, "push-option")) {
-   string_list_append(_options, value);
+   if (*value != '"')
+   string_list_append(_options, value);
+   else {
+   struct strbuf unquoted = STRBUF_INIT;
+   if (unquote_c_style(, value, NULL) < 0)
+   die("invalid quoting in push-option value");
+   string_list_append_nodup(_options,
+strbuf_detach(, 
NULL));
+   }
return 0;
 
 #if LIBCURL_VERSION_NUM >= 0x070a08
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index c64dee2127..b47a95871c 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -217,6 +217,15 @@ test_expect_success 'invalid push option in config' '
test_refs master HEAD@{1}
 '
 
+test_expect_success 'push options keep quoted characters intact (direct)' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   test_commit -C workbench one &&
+   git -C workbench push --push-option="\"embedded quotes\"" up master &&
+   echo "\"embedded quotes\"" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
@@ -260,6 +269,15 @@ test_expect_success 'push options work properly across 
http' '
test_cmp expect actual
 '
 
+test_expect_success 'push options keep quoted characters intact (http)' '
+   mk_http_pair true &&
+
+   test_commit -C test_http_clone one &&
+   git -C test_http_clone push --push-option="\"embedded quotes\"" origin 
master &&
+   echo "\"embedded quotes\"" >expect &&
+   test_cmp expect 
"$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git/hooks/pre-receive.push_options
+'
+
 stop_httpd
 
 test_done
-- 
2.16.2.552.gea2a3cf654


Re: Is there any way to "interrupt" a rebase?

2018-02-19 Thread brian m. carlson
On Mon, Feb 19, 2018 at 11:35:25AM -0800, Hilco Wijbenga wrote:
> So a scenario like this:
> 
> my-branch : X -> A -> B -> C -> D -> E -> F -> G
> base-branch : X -> Y
> 
> git rebase --onto base-branch HEAD~7
> commit A --> conflicts
> ... lots of work ...
> commit B --> conflicts
> ... lots of work ...
> commit C (Git handles conflicts)
> commit D (no conflict)
> commit E --> conflicts
> ... er, that should have been fixed in commit C
> 
> How do I keep all the work I did for commits A and B? I get the
> impression that rerere does not help here because I did not finish the
> rebase succesfully (and that makes perfect sense, of course). Is there
> a way at this point in the rebase to "go back" to commit C (so without
> "git rebase --abort")?

What I do in this case is I unstage all the changes from the index, make
the change that should have gone into commit C, use git commit --fixup
(or --squash), and then restage the rest of the changes and continue
with the rebase.  I can then use git rebase -i --autosquash afterwards
to insert the commit into the right place.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


RE: Git should preserve modification times at least on request

2018-02-19 Thread Randall S. Becker
On February 19, 2018 4:58 PM Johannes wrote:
> On Mon, 19 Feb 2018, Peter Backes wrote:
> 
> > please ensure to CC me if you reply as I am not subscribed to the list.
> >
> > https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preservi
> > ng_modification_time_on_files.3F argues that git isn't preserving
> > modification times because it needs to ensure that build tools work
> > properly.
> >
> > I agree that modification times should not be restored by default,
> > because of the principle of least astonishment. But should it be
> > impossible? The principle of least astonishment does not mandate this;
> > it is not a paternalistic principle.
> >
> > Thus, I do not get at all
> > - why git doesn't *store* modification times, perhaps by default, but
> > at least on request
> > - why git doesn't restore modification times *on request*
> >
> > It is pretty annoying that git cannot, even if I know what I am doing,
> > and explicitly want it to, preserve the modification time.
> >
> > One use case: I have lots of file lying around in my build directory
> > and for some of them, the modification time in important information
> > to me. Those files are not at all used with the build tool. In
> > contrast to git pull, git pull --rebase needs those to be stashed. But
> > after the pull and unstash, the mtime is gone. Boo.
> >
> > Please provide options to store and restore modification times. It
> > shouldn't be hard to do, given that other metadata such as the mode is
> > already stored. It would make live so much easier. And the fact that
> > this has made into the FAQ clearly suggests that there are many others
> > who think so.
> 
> Since you already assessed that it shouldn't be hard to do, you probably
> want to put your money where your mouth is and come up with a patch, and
> then offer it up for discussion on this here mailing list.

Putting my large-production-user hat on, there are (at least) three
conditions that exist in this space:

1. Build systems - this typically need the file modification time to be set
to the time at which git touches a file (e.g., checkout). This permits build
systems to detect that files are modified (even if an older version is
checked out, make, for example, still needs to see the change to initiate a
build. My understanding is that current git behaviour is modeled on this use
case.

2. Commit linkage - in some environments, files that are checked out are set
to the timestamp of the commit rather than the original file time or the
checkout time. This permits a faster production resolution of when changes
were run through the system as a group. I have implemented this strategy
(somewhat grudgingly) in a few places. It is a possible desire for some
users. I particularly dislike this approach because merge/cherry-pick/rebase
can mess with the preceptive "when" of a change and if you are going to do
this, make sure that your metadata is suitably managed.

3. Original file times - as Peter asked, storing the original file time has
some legacy advantages. This emulates the behaviour of some legacy SCM
systems and makes people feel better about things. From an audit point of
view, this has value for systems other than git. In git, you use the
hash-object to figure out what the file really is, so there is no real audit
need anymore for timestamps, which can be spoofed at whim anyway. The
hash-object comment applies to 2 also. Same comment here for dealing with
non-touching but modifying. For example: what is the timestamp on a
merge-squash? I would contend that it is the time of the merge-squash, not
the original time. It could also be an interim term, when a conflict was
resolved.

Just remember that #2 and #3 break #1, unless you essentially rebuild from
scratch in every build (ant/maven models). With that said, I seen many repo
admins who want all of the above, so making them all available would make
their lives easier.

My $0.02. Cheers,
Randall

-- Brief whoami:
  NonStop developer since approximately NonStop(2112884442)
  UNIX developer since approximately 421664400
-- In my real life, I talk too much.





Re: Fetch-hooks

2018-02-19 Thread Leo Gaspard
On 02/19/2018 10:23 PM, Jeff King wrote:
> [...]
> If you do go this route, please model it after "pre-receive" rather than
> "update". We had "update" originally but found it was too limiting for
> hooks to see only one ref at a time. So we introduced pre-receive. The
> "update" hook remains for historical reasons, but I don't think we'd
> want to reproduce the mistake. :)

Hmm, what bothered me with “pre-receive” was that it was an
all-or-nothing decision, without the ability to allow some references
through and not others.

Is there a way for “pre-receive” to individually filter hooks? I was
under the impression that the only way to do that was to use the
“update” hook, which was the reason I wanted to model it after “update”
rather than “pre-receive” (my use case being a check independent for
each pushed ref)


[PATCH 18/36] sha1_file: convert read_loose_object to use struct object_id

2018-02-19 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 builtin/fsck.c |  2 +-
 cache.h|  4 ++--
 sha1_file.c| 10 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 9981db2263..57509a4eac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -513,7 +513,7 @@ static struct object *parse_loose_object(const struct 
object_id *oid,
unsigned long size;
int eaten;
 
-   if (read_loose_object(path, oid->hash, , , ) < 0)
+   if (read_loose_object(path, oid, , , ) < 0)
return NULL;
 
if (!contents && type != OBJ_BLOB)
diff --git a/cache.h b/cache.h
index 09cd994bcc..8a9055f4e7 100644
--- a/cache.h
+++ b/cache.h
@@ -1241,14 +1241,14 @@ extern int check_sha1_signature(const unsigned char 
*sha1, void *buf, unsigned l
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 /*
- * Open the loose object at path, check its sha1, and return the contents,
+ * Open the loose object at path, check its hash, and return the contents,
  * type, and size. If the object is a blob, then "contents" may return NULL,
  * to allow streaming of large blobs.
  *
  * Returns 0 on success, negative on error (details may be written to stderr).
  */
 int read_loose_object(const char *path,
- const unsigned char *expected_sha1,
+ const struct object_id *expected_oid,
  enum object_type *type,
  unsigned long *size,
  void **contents);
diff --git a/sha1_file.c b/sha1_file.c
index c1d06c6109..69e8d27773 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2176,7 +2176,7 @@ static int check_stream_sha1(git_zstream *stream,
 }
 
 int read_loose_object(const char *path,
- const unsigned char *expected_sha1,
+ const struct object_id *expected_oid,
  enum object_type *type,
  unsigned long *size,
  void **contents)
@@ -2208,19 +2208,19 @@ int read_loose_object(const char *path,
}
 
if (*type == OBJ_BLOB) {
-   if (check_stream_sha1(, hdr, *size, path, expected_sha1) 
< 0)
+   if (check_stream_sha1(, hdr, *size, path, 
expected_oid->hash) < 0)
goto out;
} else {
-   *contents = unpack_sha1_rest(, hdr, *size, 
expected_sha1);
+   *contents = unpack_sha1_rest(, hdr, *size, 
expected_oid->hash);
if (!*contents) {
error("unable to unpack contents of %s", path);
git_inflate_end();
goto out;
}
-   if (check_sha1_signature(expected_sha1, *contents,
+   if (check_sha1_signature(expected_oid->hash, *contents,
 *size, typename(*type))) {
error("sha1 mismatch for %s (expected %s)", path,
- sha1_to_hex(expected_sha1));
+ oid_to_hex(expected_oid));
free(*contents);
goto out;
}


[PATCH 20/36] streaming: convert open_istream to use struct object_id

2018-02-19 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 archive-tar.c  | 2 +-
 archive-zip.c  | 2 +-
 builtin/index-pack.c   | 2 +-
 builtin/pack-objects.c | 2 +-
 sha1_file.c| 2 +-
 streaming.c| 6 +++---
 streaming.h| 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index fd622eacc0..7a0d31d847 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -119,7 +119,7 @@ static int stream_blocked(const struct object_id *oid)
char buf[BLOCKSIZE];
ssize_t readlen;
 
-   st = open_istream(oid->hash, , , NULL);
+   st = open_istream(oid, , , NULL);
if (!st)
return error("cannot stream blob %s", oid_to_hex(oid));
for (;;) {
diff --git a/archive-zip.c b/archive-zip.c
index 5841a6ceb6..18b951b732 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -337,7 +337,7 @@ static int write_zip_entry(struct archiver_args *args,
 
if (S_ISREG(mode) && type == OBJ_BLOB && !args->convert &&
size > big_file_threshold) {
-   stream = open_istream(oid->hash, , , NULL);
+   stream = open_istream(oid, , , NULL);
if (!stream)
return error("cannot stream blob %s",
 oid_to_hex(oid));
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index e0a776f1ac..a0ca525e99 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -771,7 +771,7 @@ static int check_collison(struct object_entry *entry)
 
memset(, 0, sizeof(data));
data.entry = entry;
-   data.st = open_istream(entry->idx.oid.hash, , , NULL);
+   data.st = open_istream(>idx.oid, , , NULL);
if (!data.st)
return -1;
if (size != entry->size || type != entry->type)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..ef63aa5878 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -267,7 +267,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
if (!usable_delta) {
if (entry->type == OBJ_BLOB &&
entry->size > big_file_threshold &&
-   (st = open_istream(entry->idx.oid.hash, , , 
NULL)) != NULL)
+   (st = open_istream(>idx.oid, , , NULL)) != 
NULL)
buf = NULL;
else {
buf = read_sha1_file(entry->idx.oid.hash, ,
diff --git a/sha1_file.c b/sha1_file.c
index 64f0905799..5dc85122c3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -799,7 +799,7 @@ int check_object_signature(const struct object_id *oid, 
void *map,
return oidcmp(oid, _oid) ? -1 : 0;
}
 
-   st = open_istream(oid->hash, _type, , NULL);
+   st = open_istream(oid, _type, , NULL);
if (!st)
return -1;
 
diff --git a/streaming.c b/streaming.c
index 5892b50bd8..be85507922 100644
--- a/streaming.c
+++ b/streaming.c
@@ -130,14 +130,14 @@ static enum input_source istream_source(const unsigned 
char *sha1,
}
 }
 
-struct git_istream *open_istream(const unsigned char *sha1,
+struct git_istream *open_istream(const struct object_id *oid,
 enum object_type *type,
 unsigned long *size,
 struct stream_filter *filter)
 {
struct git_istream *st;
struct object_info oi = OBJECT_INFO_INIT;
-   const unsigned char *real = lookup_replace_object(sha1);
+   const unsigned char *real = lookup_replace_object(oid->hash);
enum input_source src = istream_source(real, type, );
 
if (src < 0)
@@ -507,7 +507,7 @@ int stream_blob_to_fd(int fd, const struct object_id *oid, 
struct stream_filter
ssize_t kept = 0;
int result = -1;
 
-   st = open_istream(oid->hash, , , filter);
+   st = open_istream(oid, , , filter);
if (!st) {
if (filter)
free_stream_filter(filter);
diff --git a/streaming.h b/streaming.h
index 73c1d156b3..32f4626771 100644
--- a/streaming.h
+++ b/streaming.h
@@ -8,7 +8,7 @@
 /* opaque */
 struct git_istream;
 
-extern struct git_istream *open_istream(const unsigned char *, enum 
object_type *, unsigned long *, struct stream_filter *);
+extern struct git_istream *open_istream(const struct object_id *, enum 
object_type *, unsigned long *, struct stream_filter *);
 extern int close_istream(struct git_istream *);
 extern ssize_t read_istream(struct git_istream *, void *, size_t);
 


[PATCH 21/36] builtin/mktree: convert to struct object_id

2018-02-19 Thread brian m. carlson
Convert this file to use struct object_id.  Modify one use of
get_sha1_hex into parse_oid_hex; this is safe since we get the data from
a strbuf.

Signed-off-by: brian m. carlson 
---
 builtin/mktree.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 8dd9f52f77..d6ca19d835 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -10,13 +10,13 @@
 
 static struct treeent {
unsigned mode;
-   unsigned char sha1[20];
+   struct object_id oid;
int len;
char name[FLEX_ARRAY];
 } **entries;
 static int alloc, used;
 
-static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
+static void append_to_tree(unsigned mode, struct object_id *oid, char *path)
 {
struct treeent *ent;
size_t len = strlen(path);
@@ -26,7 +26,7 @@ static void append_to_tree(unsigned mode, unsigned char 
*sha1, char *path)
FLEX_ALLOC_MEM(ent, name, path, len);
ent->mode = mode;
ent->len = len;
-   hashcpy(ent->sha1, sha1);
+   oidcpy(>oid, oid);
 
ALLOC_GROW(entries, used + 1, alloc);
entries[used++] = ent;
@@ -54,7 +54,7 @@ static void write_tree(struct object_id *oid)
for (i = 0; i < used; i++) {
struct treeent *ent = entries[i];
strbuf_addf(, "%o %s%c", ent->mode, ent->name, '\0');
-   strbuf_add(, ent->sha1, 20);
+   strbuf_add(, ent->oid.hash, the_hash_algo->rawsz);
}
 
write_object_file(buf.buf, buf.len, tree_type, oid);
@@ -69,11 +69,12 @@ static const char *mktree_usage[] = {
 static void mktree_line(char *buf, size_t len, int nul_term_line, int 
allow_missing)
 {
char *ptr, *ntr;
+   const char *p;
unsigned mode;
enum object_type mode_type; /* object type derived from mode */
enum object_type obj_type; /* object type derived from sha */
char *path, *to_free = NULL;
-   unsigned char sha1[20];
+   struct object_id oid;
 
ptr = buf;
/*
@@ -85,9 +86,8 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
die("input format error: %s", buf);
ptr = ntr + 1; /* type */
ntr = strchr(ptr, ' ');
-   if (!ntr || buf + len <= ntr + 40 ||
-   ntr[41] != '\t' ||
-   get_sha1_hex(ntr + 1, sha1))
+   if (!ntr || parse_oid_hex(ntr + 1, , ) ||
+   *p != '\t')
die("input format error: %s", buf);
 
/* It is perfectly normal if we do not have a commit from a submodule */
@@ -116,12 +116,12 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
}
 
/* Check the type of object identified by sha1 */
-   obj_type = sha1_object_info(sha1, NULL);
+   obj_type = sha1_object_info(oid.hash, NULL);
if (obj_type < 0) {
if (allow_missing) {
; /* no problem - missing objects are presumed to be of 
the right type */
} else {
-   die("entry '%s' object %s is unavailable", path, 
sha1_to_hex(sha1));
+   die("entry '%s' object %s is unavailable", path, 
oid_to_hex());
}
} else {
if (obj_type != mode_type) {
@@ -131,11 +131,11 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
 * because the new tree entry will never be correct.
 */
die("entry '%s' object %s is a %s but specified type 
was (%s)",
-   path, sha1_to_hex(sha1), typename(obj_type), 
typename(mode_type));
+   path, oid_to_hex(), typename(obj_type), 
typename(mode_type));
}
}
 
-   append_to_tree(mode, sha1, path);
+   append_to_tree(mode, , path);
free(to_free);
 }
 


[PATCH 02/36] builtin/write-tree: convert to struct object_id

2018-02-19 Thread brian m. carlson
This is needed to convert parts of the cache-tree code.

Signed-off-by: brian m. carlson 
---
 builtin/write-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/write-tree.c b/builtin/write-tree.c
index bd0a78aa3c..299a121531 100644
--- a/builtin/write-tree.c
+++ b/builtin/write-tree.c
@@ -19,7 +19,7 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
 {
int flags = 0, ret;
const char *prefix = NULL;
-   unsigned char sha1[20];
+   struct object_id oid;
const char *me = "git-write-tree";
struct option write_tree_options[] = {
OPT_BIT(0, "missing-ok", , N_("allow missing objects"),
@@ -38,10 +38,10 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
argc = parse_options(argc, argv, unused_prefix, write_tree_options,
 write_tree_usage, 0);
 
-   ret = write_cache_as_tree(sha1, flags, prefix);
+   ret = write_cache_as_tree(oid.hash, flags, prefix);
switch (ret) {
case 0:
-   printf("%s\n", sha1_to_hex(sha1));
+   printf("%s\n", oid_to_hex());
break;
case WRITE_TREE_UNREADABLE_INDEX:
die("%s: error reading the index", me);


[PATCH 00/36] object_id part 12

2018-02-19 Thread brian m. carlson
This is the twelfth in a series of patches to convert from unsigned char
[20] to struct object_id.  This series is based on next.

Included in this series are conversions for find_unique_abbrev and
lookup_replace_object, as well as parts of the sha1_file code.

Conflicts with pu are average in number but minor, mostly because of the
type_name conversion.  None of them are tricky, except that the
introduction of get_tree_entry_if_blob requires a conversion of that
function.

brian m. carlson (36):
  bulk-checkin: convert index_bulk_checkin to struct object_id
  builtin/write-tree: convert to struct object_id
  cache-tree: convert write_*_as_tree to object_id
  cache-tree: convert remnants to struct object_id
  resolve-undo: convert struct resolve_undo_info to object_id
  tree: convert read_tree_recursive to struct object_id
  ref-filter: convert grab_objectname to struct object_id
  strbuf: convert strbuf_add_unique_abbrev to use struct object_id
  wt-status: convert struct wt_status_state to object_id
  Convert find_unique_abbrev* to struct object_id
  http-walker: convert struct object_request to use struct object_id
  send-pack: convert remaining functions to struct object_id
  replace_object: convert struct replace_object to object_id
  builtin/mktag: convert to struct object_id
  archive: convert write_archive_entry_fn_t to object_id
  archive: convert sha1_file_to_archive to struct object_id
  builtin/index-pack: convert struct ref_delta_entry to object_id
  sha1_file: convert read_loose_object to use struct object_id
  sha1_file: convert check_sha1_signature to struct object_id
  streaming: convert open_istream to use struct object_id
  builtin/mktree: convert to struct object_id
  sha1_file: convert assert_sha1_type to object_id
  sha1_file: convert retry_bad_packed_offset to struct object_id
  packfile: convert unpack_entry to struct object_id
  Convert remaining callers of sha1_object_info_extended to object_id
  sha1_file: convert sha1_object_info* to object_id
  builtin/fmt-merge-msg: convert remaining code to object_id
  builtin/notes: convert static functions to object_id
  tree-walk: convert get_tree_entry_follow_symlinks internals to
object_id
  streaming: convert istream internals to struct object_id
  tree-walk: convert tree entry functions to object_id
  sha1_file: convert read_object_with_reference to object_id
  sha1_file: convert read_sha1_file to struct object_id
  Convert lookup_replace_object to struct object_id
  sha1_file: introduce a constant for max header length
  convert: convert to struct object_id

 apply.c  |   4 +-
 archive-tar.c|  28 
 archive-zip.c|  18 ++---
 archive.c|  32 -
 archive.h|  10 +--
 bisect.c |   3 +-
 blame.c  |  18 +++--
 builtin/am.c |   8 +--
 builtin/blame.c  |   4 +-
 builtin/branch.c |   2 +-
 builtin/cat-file.c   |  30 +
 builtin/checkout.c   |  12 ++--
 builtin/commit-tree.c|   2 +-
 builtin/describe.c   |   6 +-
 builtin/difftool.c   |   2 +-
 builtin/fast-export.c|   8 +--
 builtin/fetch.c  |  10 +--
 builtin/fmt-merge-msg.c  |   4 +-
 builtin/fsck.c   |   4 +-
 builtin/grep.c   |   6 +-
 builtin/index-pack.c |  43 ++--
 builtin/log.c|   8 +--
 builtin/ls-files.c   |   4 +-
 builtin/ls-tree.c|   8 +--
 builtin/merge-tree.c |   5 +-
 builtin/merge.c  |   8 +--
 builtin/mktag.c  |  20 +++---
 builtin/mktree.c |  24 +++
 builtin/name-rev.c   |   2 +-
 builtin/notes.c  |  14 ++--
 builtin/pack-objects.c   |  27 
 builtin/prune.c  |   2 +-
 builtin/receive-pack.c   |   8 +--
 builtin/reflog.c |   2 +-
 builtin/replace.c|  10 +--
 builtin/reset.c  |   2 +-
 builtin/rev-list.c   |   2 +-
 builtin/rev-parse.c  |   2 +-
 builtin/rm.c |   2 +-
 builtin/show-branch.c|   2 +-
 builtin/show-ref.c   |   4 +-
 builtin/tag.c|  16 +++--
 builtin/unpack-file.c|   2 +-
 builtin/unpack-objects.c |   4 +-
 builtin/update-index.c   |   2 +-
 builtin/verify-commit.c  |   2 +-
 builtin/worktree.c   |   4 +-
 builtin/write-tree.c |   6 +-
 bulk-checkin.c   |  18 ++---
 bulk-checkin.h   |   2 +-
 bundle.c |   2 +-
 cache-tree.c |  36 +-
 cache-tree.h |   4 +-
 cache.h  |  42 ++--
 combine-diff.c   |   6 +-
 commit.c |   8 +--
 config.c |   2 +-
 convert.c|  12 ++--
 convert.h|   2 +-
 diff.c   |   6 +-
 dir.c|   2 +-
 entry.c  |   4 +-
 fast-import.c|  31 -
 fsck.c   |   2 +-
 grep.c   |   2 +-
 http-push.c  |   2 +-
 

[PATCH 01/36] bulk-checkin: convert index_bulk_checkin to struct object_id

2018-02-19 Thread brian m. carlson
Convert the index_bulk_checkin function, and the static functions it
calls, to use pointers to struct object_id.

Signed-off-by: brian m. carlson 
---
 bulk-checkin.c | 18 +-
 bulk-checkin.h |  2 +-
 sha1_file.c|  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8bcd1c8665..6f5ee25bff 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -60,17 +60,17 @@ static void finish_bulk_checkin(struct bulk_checkin_state 
*state)
reprepare_packed_git();
 }
 
-static int already_written(struct bulk_checkin_state *state, unsigned char 
sha1[])
+static int already_written(struct bulk_checkin_state *state, struct object_id 
*oid)
 {
int i;
 
/* The object may already exist in the repository */
-   if (has_sha1_file(sha1))
+   if (has_sha1_file(oid->hash))
return 1;
 
/* Might want to keep the list sorted */
for (i = 0; i < state->nr_written; i++)
-   if (!hashcmp(state->written[i]->oid.hash, sha1))
+   if (!oidcmp(>written[i]->oid, oid))
return 1;
 
/* This is a new object we need to keep */
@@ -186,7 +186,7 @@ static void prepare_to_stream(struct bulk_checkin_state 
*state,
 }
 
 static int deflate_to_pack(struct bulk_checkin_state *state,
-  unsigned char result_sha1[],
+  struct object_id *result_oid,
   int fd, size_t size,
   enum object_type type, const char *path,
   unsigned flags)
@@ -236,17 +236,17 @@ static int deflate_to_pack(struct bulk_checkin_state 
*state,
if (lseek(fd, seekback, SEEK_SET) == (off_t) -1)
return error("cannot seek back");
}
-   the_hash_algo->final_fn(result_sha1, );
+   the_hash_algo->final_fn(result_oid->hash, );
if (!idx)
return 0;
 
idx->crc32 = crc32_end(state->f);
-   if (already_written(state, result_sha1)) {
+   if (already_written(state, result_oid)) {
hashfile_truncate(state->f, );
state->offset = checkpoint.offset;
free(idx);
} else {
-   hashcpy(idx->oid.hash, result_sha1);
+   oidcpy(>oid, result_oid);
ALLOC_GROW(state->written,
   state->nr_written + 1,
   state->alloc_written);
@@ -255,11 +255,11 @@ static int deflate_to_pack(struct bulk_checkin_state 
*state,
return 0;
 }
 
-int index_bulk_checkin(unsigned char *sha1,
+int index_bulk_checkin(struct object_id *oid,
   int fd, size_t size, enum object_type type,
   const char *path, unsigned flags)
 {
-   int status = deflate_to_pack(, sha1, fd, size, type,
+   int status = deflate_to_pack(, oid, fd, size, type,
 path, flags);
if (!state.plugged)
finish_bulk_checkin();
diff --git a/bulk-checkin.h b/bulk-checkin.h
index fbd40fc98c..a85527318b 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -4,7 +4,7 @@
 #ifndef BULK_CHECKIN_H
 #define BULK_CHECKIN_H
 
-extern int index_bulk_checkin(unsigned char sha1[],
+extern int index_bulk_checkin(struct object_id *oid,
  int fd, size_t size, enum object_type type,
  const char *path, unsigned flags);
 
diff --git a/sha1_file.c b/sha1_file.c
index 826d7a0ae3..c1d06c6109 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1892,7 +1892,7 @@ static int index_stream(struct object_id *oid, int fd, 
size_t size,
enum object_type type, const char *path,
unsigned flags)
 {
-   return index_bulk_checkin(oid->hash, fd, size, type, path, flags);
+   return index_bulk_checkin(oid, fd, size, type, path, flags);
 }
 
 int index_fd(struct object_id *oid, int fd, struct stat *st,


[PATCH 03/36] cache-tree: convert write_*_as_tree to object_id

2018-02-19 Thread brian m. carlson
Convert write_index_as_tree and write_cache_as_tree to use struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/am.c |  8 
 builtin/merge.c  |  2 +-
 builtin/write-tree.c |  2 +-
 cache-tree.c | 10 +-
 cache-tree.h |  4 ++--
 sequencer.c  |  4 ++--
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 6661edc162..220c5deed8 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1546,7 +1546,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
discard_cache();
read_cache_from(index_path);
 
-   if (write_index_as_tree(orig_tree.hash, _index, index_path, 0, 
NULL))
+   if (write_index_as_tree(_tree, _index, index_path, 0, NULL))
return error(_("Repository lacks necessary blobs to fall back 
on 3-way merge."));
 
say(state, stdout, _("Using index info to reconstruct a base tree..."));
@@ -1571,7 +1571,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
return error(_("Did you hand edit your patch?\n"
"It does not apply to blobs recorded in its 
index."));
 
-   if (write_index_as_tree(their_tree.hash, _index, index_path, 0, 
NULL))
+   if (write_index_as_tree(_tree, _index, index_path, 0, NULL))
return error("could not write tree");
 
say(state, stdout, _("Falling back to patching base and 3-way 
merge..."));
@@ -1622,7 +1622,7 @@ static void do_commit(const struct am_state *state)
if (run_hook_le(NULL, "pre-applypatch", NULL))
exit(1);
 
-   if (write_cache_as_tree(tree.hash, 0, NULL))
+   if (write_cache_as_tree(, 0, NULL))
die(_("git write-tree failed to write a tree"));
 
if (!get_oid_commit("HEAD", )) {
@@ -2001,7 +2001,7 @@ static int clean_index(const struct object_id *head, 
const struct object_id *rem
if (fast_forward_to(head_tree, head_tree, 1))
return -1;
 
-   if (write_cache_as_tree(index.hash, 0, NULL))
+   if (write_cache_as_tree(, 0, NULL))
return -1;
 
index_tree = parse_tree_indirect();
diff --git a/builtin/merge.c b/builtin/merge.c
index 92ba99a1a5..861b170468 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -638,7 +638,7 @@ static int read_tree_trivial(struct object_id *common, 
struct object_id *head,
 
 static void write_tree_trivial(struct object_id *oid)
 {
-   if (write_cache_as_tree(oid->hash, 0, NULL))
+   if (write_cache_as_tree(oid, 0, NULL))
die(_("git write-tree failed to write a tree"));
 }
 
diff --git a/builtin/write-tree.c b/builtin/write-tree.c
index 299a121531..c9d3c544e7 100644
--- a/builtin/write-tree.c
+++ b/builtin/write-tree.c
@@ -38,7 +38,7 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
argc = parse_options(argc, argv, unused_prefix, write_tree_options,
 write_tree_usage, 0);
 
-   ret = write_cache_as_tree(oid.hash, flags, prefix);
+   ret = write_cache_as_tree(, flags, prefix);
switch (ret) {
case 0:
printf("%s\n", oid_to_hex());
diff --git a/cache-tree.c b/cache-tree.c
index c52e4303df..ba07a8067e 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -599,7 +599,7 @@ static struct cache_tree *cache_tree_find(struct cache_tree 
*it, const char *pat
return it;
 }
 
-int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, 
const char *index_path, int flags, const char *prefix)
+int write_index_as_tree(struct object_id *oid, struct index_state 
*index_state, const char *index_path, int flags, const char *prefix)
 {
int entries, was_valid;
struct lock_file lock_file = LOCK_INIT;
@@ -640,19 +640,19 @@ int write_index_as_tree(unsigned char *sha1, struct 
index_state *index_state, co
ret = WRITE_TREE_PREFIX_ERROR;
goto out;
}
-   hashcpy(sha1, subtree->oid.hash);
+   oidcpy(oid, >oid);
}
else
-   hashcpy(sha1, index_state->cache_tree->oid.hash);
+   oidcpy(oid, _state->cache_tree->oid);
 
 out:
rollback_lock_file(_file);
return ret;
 }
 
-int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix)
+int write_cache_as_tree(struct object_id *oid, int flags, const char *prefix)
 {
-   return write_index_as_tree(sha1, _index, get_index_file(), flags, 
prefix);
+   return write_index_as_tree(oid, _index, get_index_file(), flags, 
prefix);
 }
 
 static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
diff --git a/cache-tree.h b/cache-tree.h
index f7b9cab7ee..cfd5328cc9 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -47,8 +47,8 @@ int update_main_cache_tree(int);
 #define 

[PATCH 13/36] replace_object: convert struct replace_object to object_id

2018-02-19 Thread brian m. carlson
Convert the two members of this struct to be instances of struct
object_id.  Adjust the various functions in this file accordingly.

Signed-off-by: brian m. carlson 
---
 replace_object.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/replace_object.c b/replace_object.c
index 3e49965d05..232e8b8550 100644
--- a/replace_object.c
+++ b/replace_object.c
@@ -8,8 +8,8 @@
  * sha1.
  */
 static struct replace_object {
-   unsigned char original[20];
-   unsigned char replacement[20];
+   struct object_id original;
+   struct object_id replacement;
 } **replace_object;
 
 static int replace_object_alloc, replace_object_nr;
@@ -17,7 +17,7 @@ static int replace_object_alloc, replace_object_nr;
 static const unsigned char *replace_sha1_access(size_t index, void *table)
 {
struct replace_object **replace = table;
-   return replace[index]->original;
+   return replace[index]->original.hash;
 }
 
 static int replace_object_pos(const unsigned char *sha1)
@@ -29,7 +29,7 @@ static int replace_object_pos(const unsigned char *sha1)
 static int register_replace_object(struct replace_object *replace,
   int ignore_dups)
 {
-   int pos = replace_object_pos(replace->original);
+   int pos = replace_object_pos(replace->original.hash);
 
if (0 <= pos) {
if (ignore_dups)
@@ -59,14 +59,14 @@ static int register_replace_ref(const char *refname,
const char *hash = slash ? slash + 1 : refname;
struct replace_object *repl_obj = xmalloc(sizeof(*repl_obj));
 
-   if (strlen(hash) != 40 || get_sha1_hex(hash, repl_obj->original)) {
+   if (get_oid_hex(hash, _obj->original)) {
free(repl_obj);
warning("bad replace ref name: %s", refname);
return 0;
}
 
/* Copy sha1 from the read ref */
-   hashcpy(repl_obj->replacement, oid->hash);
+   oidcpy(_obj->replacement, oid);
 
/* Register new object */
if (register_replace_object(repl_obj, 1))
@@ -113,7 +113,7 @@ const unsigned char *do_lookup_replace_object(const 
unsigned char *sha1)
 
pos = replace_object_pos(cur);
if (0 <= pos)
-   cur = replace_object[pos]->replacement;
+   cur = replace_object[pos]->replacement.hash;
} while (0 <= pos);
 
return cur;


[PATCH 14/36] builtin/mktag: convert to struct object_id

2018-02-19 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 builtin/mktag.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/mktag.c b/builtin/mktag.c
index beb552847b..65bb41e3cd 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -18,13 +18,13 @@
 /*
  * We refuse to tag something we can't verify. Just because.
  */
-static int verify_object(const unsigned char *sha1, const char *expected_type)
+static int verify_object(const struct object_id *oid, const char 
*expected_type)
 {
int ret = -1;
enum object_type type;
unsigned long size;
-   void *buffer = read_sha1_file(sha1, , );
-   const unsigned char *repl = lookup_replace_object(sha1);
+   void *buffer = read_sha1_file(oid->hash, , );
+   const unsigned char *repl = lookup_replace_object(oid->hash);
 
if (buffer) {
if (type == type_from_string(expected_type))
@@ -38,8 +38,8 @@ static int verify_tag(char *buffer, unsigned long size)
 {
int typelen;
char type[20];
-   unsigned char sha1[20];
-   const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb;
+   struct object_id oid;
+   const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb, *p;
size_t len;
 
if (size < 84)
@@ -52,11 +52,11 @@ static int verify_tag(char *buffer, unsigned long size)
if (memcmp(object, "object ", 7))
return error("char%d: does not start with \"object \"", 0);
 
-   if (get_sha1_hex(object + 7, sha1))
+   if (parse_oid_hex(object + 7, , ))
return error("char%d: could not get SHA1 hash", 7);
 
/* Verify type line */
-   type_line = object + 48;
+   type_line = p + 1;
if (memcmp(type_line - 1, "\ntype ", 6))
return error("char%d: could not find \"\\ntype \"", 47);
 
@@ -80,8 +80,8 @@ static int verify_tag(char *buffer, unsigned long size)
type[typelen] = 0;
 
/* Verify that the object matches */
-   if (verify_object(sha1, type))
-   return error("char%d: could not verify object %s", 7, 
sha1_to_hex(sha1));
+   if (verify_object(, type))
+   return error("char%d: could not verify object %s", 7, 
oid_to_hex());
 
/* Verify the tag-name: we don't allow control characters or spaces in 
it */
tag_line += 4;


[PATCH 07/36] ref-filter: convert grab_objectname to struct object_id

2018-02-19 Thread brian m. carlson
This is necessary in order to convert find_unique_abbrev.

Signed-off-by: brian m. carlson 
---
 ref-filter.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f9e25aea7a..9cbd92611c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -737,18 +737,18 @@ static void *get_obj(const struct object_id *oid, struct 
object **obj, unsigned
return buf;
 }
 
-static int grab_objectname(const char *name, const unsigned char *sha1,
+static int grab_objectname(const char *name, const struct object_id *oid,
   struct atom_value *v, struct used_atom *atom)
 {
if (starts_with(name, "objectname")) {
if (atom->u.objectname.option == O_SHORT) {
-   v->s = xstrdup(find_unique_abbrev(sha1, 
DEFAULT_ABBREV));
+   v->s = xstrdup(find_unique_abbrev(oid->hash, 
DEFAULT_ABBREV));
return 1;
} else if (atom->u.objectname.option == O_FULL) {
-   v->s = xstrdup(sha1_to_hex(sha1));
+   v->s = xstrdup(oid_to_hex(oid));
return 1;
} else if (atom->u.objectname.option == O_LENGTH) {
-   v->s = xstrdup(find_unique_abbrev(sha1, 
atom->u.objectname.length));
+   v->s = xstrdup(find_unique_abbrev(oid->hash, 
atom->u.objectname.length));
return 1;
} else
die("BUG: unknown %%(objectname) option");
@@ -775,7 +775,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
v->s = xstrfmt("%lu", sz);
}
else if (deref)
-   grab_objectname(name, obj->oid.hash, v, _atom[i]);
+   grab_objectname(name, >oid, v, _atom[i]);
}
 }
 
@@ -1439,7 +1439,7 @@ static void populate_value(struct ref_array_item *ref)
v->s = xstrdup(buf + 1);
}
continue;
-   } else if (!deref && grab_objectname(name, 
ref->objectname.hash, v, atom)) {
+   } else if (!deref && grab_objectname(name, >objectname, v, 
atom)) {
continue;
} else if (!strcmp(name, "HEAD")) {
if (atom->u.head && !strcmp(ref->refname, atom->u.head))


[PATCH 29/36] tree-walk: convert get_tree_entry_follow_symlinks internals to object_id

2018-02-19 Thread brian m. carlson
Convert the internals of this function to use struct object_id.  This is
one of the last remaining callers of read_sha1_file_extended that has
not been converted yet.

Signed-off-by: brian m. carlson 
---
 tree-walk.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 63a87ed666..521defdaa2 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -583,14 +583,14 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
struct dir_state *parents = NULL;
size_t parents_alloc = 0;
size_t i, parents_nr = 0;
-   unsigned char current_tree_sha1[20];
+   struct object_id current_tree_oid;
struct strbuf namebuf = STRBUF_INIT;
struct tree_desc t;
int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
 
init_tree_desc(, NULL, 0UL);
strbuf_addstr(, name);
-   hashcpy(current_tree_sha1, tree_sha1);
+   hashcpy(current_tree_oid.hash, tree_sha1);
 
while (1) {
int find_result;
@@ -599,22 +599,22 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
if (!t.buffer) {
void *tree;
-   unsigned char root[20];
+   struct object_id root;
unsigned long size;
-   tree = read_object_with_reference(current_tree_sha1,
+   tree = read_object_with_reference(current_tree_oid.hash,
  tree_type, ,
- root);
+ root.hash);
if (!tree)
goto done;
 
ALLOC_GROW(parents, parents_nr + 1, parents_alloc);
parents[parents_nr].tree = tree;
parents[parents_nr].size = size;
-   hashcpy(parents[parents_nr].sha1, root);
+   hashcpy(parents[parents_nr].sha1, root.hash);
parents_nr++;
 
if (namebuf.buf[0] == '\0') {
-   hashcpy(result, root);
+   hashcpy(result, root.hash);
retval = FOUND;
goto done;
}
@@ -671,14 +671,14 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
/* Look up the first (or only) path component in the tree. */
find_result = find_tree_entry(, namebuf.buf,
- current_tree_sha1, mode);
+ current_tree_oid.hash, mode);
if (find_result) {
goto done;
}
 
if (S_ISDIR(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_sha1);
+   hashcpy(result, current_tree_oid.hash);
retval = FOUND;
goto done;
}
@@ -688,7 +688,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
  1 + first_slash - namebuf.buf);
} else if (S_ISREG(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_sha1);
+   hashcpy(result, current_tree_oid.hash);
retval = FOUND;
} else {
retval = NOT_DIR;
@@ -714,7 +714,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 */
retval = DANGLING_SYMLINK;
 
-   contents = read_sha1_file(current_tree_sha1, ,
+   contents = read_sha1_file(current_tree_oid.hash, ,
  _len);
 
if (!contents)


[PATCH 31/36] tree-walk: convert tree entry functions to object_id

2018-02-19 Thread brian m. carlson
Convert get_tree_entry and find_tree_entry to take pointers to struct
object_id.

Signed-off-by: brian m. carlson 
---
 archive.c  |  4 ++--
 blame.c|  6 ++
 builtin/rm.c   |  2 +-
 builtin/update-index.c |  2 +-
 line-log.c |  3 +--
 match-trees.c  |  6 +++---
 merge-recursive.c  | 12 ++--
 notes.c|  2 +-
 sha1_name.c|  7 +++
 tree-walk.c| 20 ++--
 tree-walk.h|  2 +-
 11 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/archive.c b/archive.c
index da62b2f541..1ab62d426e 100644
--- a/archive.c
+++ b/archive.c
@@ -397,8 +397,8 @@ static void parse_treeish_arg(const char **argv,
unsigned int mode;
int err;
 
-   err = get_tree_entry(tree->object.oid.hash, prefix,
-tree_oid.hash, );
+   err = get_tree_entry(>object.oid, prefix, _oid,
+);
if (err || !S_ISDIR(mode))
die("current working directory is untracked");
 
diff --git a/blame.c b/blame.c
index c1df10cdd2..3f9bd29615 100644
--- a/blame.c
+++ b/blame.c
@@ -80,7 +80,7 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
struct object_id blob_oid;
unsigned mode;
 
-   if (!get_tree_entry(commit_oid->hash, path, blob_oid.hash, 
) &&
+   if (!get_tree_entry(commit_oid, path, _oid, ) &&
oid_object_info(_oid, NULL) == OBJ_BLOB)
return;
}
@@ -502,9 +502,7 @@ static int fill_blob_sha1_and_mode(struct blame_origin 
*origin)
 {
if (!is_null_oid(>blob_oid))
return 0;
-   if (get_tree_entry(origin->commit->object.oid.hash,
-  origin->path,
-  origin->blob_oid.hash, >mode))
+   if (get_tree_entry(>commit->object.oid, origin->path, 
>blob_oid, >mode))
goto error_out;
if (oid_object_info(>blob_oid, NULL) != OBJ_BLOB)
goto error_out;
diff --git a/builtin/rm.c b/builtin/rm.c
index 4a2fcca27b..974a7ef5bf 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -178,7 +178,7 @@ static int check_local_mod(struct object_id *head, int 
index_only)
 * way as changed from the HEAD.
 */
if (no_head
-|| get_tree_entry(head->hash, name, oid.hash, )
+|| get_tree_entry(head, name, , )
 || ce->ce_mode != create_ce_mode(mode)
 || oidcmp(>oid, ))
staged_changes = 1;
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d282..9625d1e10a 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -592,7 +592,7 @@ static struct cache_entry *read_one_ent(const char *which,
int size;
struct cache_entry *ce;
 
-   if (get_tree_entry(ent->hash, path, oid.hash, )) {
+   if (get_tree_entry(ent, path, , )) {
if (which)
error("%s: not in %s branch.", path, which);
return NULL;
diff --git a/line-log.c b/line-log.c
index 545ad0f28b..700121eb92 100644
--- a/line-log.c
+++ b/line-log.c
@@ -501,8 +501,7 @@ static void fill_blob_sha1(struct commit *commit, struct 
diff_filespec *spec)
unsigned mode;
struct object_id oid;
 
-   if (get_tree_entry(commit->object.oid.hash, spec->path,
-  oid.hash, ))
+   if (get_tree_entry(>object.oid, spec->path, , ))
die("There is no path %s in the commit", spec->path);
fill_filespec(spec, , 1, mode);
 
diff --git a/match-trees.c b/match-trees.c
index 0ca99d5162..10d6c39d47 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -269,7 +269,7 @@ void shift_tree(const struct object_id *hash1,
if (!*del_prefix)
return;
 
-   if (get_tree_entry(hash2->hash, del_prefix, shifted->hash, 
))
+   if (get_tree_entry(hash2, del_prefix, shifted, ))
die("cannot find path %s in tree %s",
del_prefix, oid_to_hex(hash2));
return;
@@ -296,12 +296,12 @@ void shift_tree_by(const struct object_id *hash1,
unsigned candidate = 0;
 
/* Can hash2 be a tree at shift_prefix in tree hash1? */
-   if (!get_tree_entry(hash1->hash, shift_prefix, sub1.hash, ) &&
+   if (!get_tree_entry(hash1, shift_prefix, , ) &&
S_ISDIR(mode1))
candidate |= 1;
 
/* Can hash1 be a tree at shift_prefix in tree hash2? */
-   if (!get_tree_entry(hash2->hash, shift_prefix, sub2.hash, ) &&
+   if (!get_tree_entry(hash2, shift_prefix, , ) &&
S_ISDIR(mode2))
candidate |= 2;
 
diff --git 

[PATCH 24/36] packfile: convert unpack_entry to struct object_id

2018-02-19 Thread brian m. carlson
Convert unpack_entry and read_object to use struct object_id.
---
 packfile.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/packfile.c b/packfile.c
index bfa6438b4e..ed7b342ebf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1452,7 +1452,7 @@ struct unpack_entry_stack_ent {
unsigned long size;
 };
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
+static void *read_object(const struct object_id *oid, enum object_type *type,
 unsigned long *size)
 {
struct object_info oi = OBJECT_INFO_INIT;
@@ -1461,7 +1461,7 @@ static void *read_object(const unsigned char *sha1, enum 
object_type *type,
oi.sizep = size;
oi.contentp = 
 
-   if (sha1_object_info_extended(sha1, , 0) < 0)
+   if (sha1_object_info_extended(oid->hash, , 0) < 0)
return NULL;
return content;
 }
@@ -1501,11 +1501,11 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
struct revindex_entry *revidx = find_pack_revindex(p, 
obj_offset);
off_t len = revidx[1].offset - obj_offset;
if (check_pack_crc(p, _curs, obj_offset, len, 
revidx->nr)) {
-   const unsigned char *sha1 =
-   nth_packed_object_sha1(p, revidx->nr);
+   struct object_id oid;
+   nth_packed_object_oid(, p, revidx->nr);
error("bad packed object CRC for %s",
- sha1_to_hex(sha1));
-   mark_bad_packed_object(p, sha1);
+ oid_to_hex());
+   mark_bad_packed_object(p, oid.hash);
data = NULL;
goto out;
}
@@ -1588,16 +1588,16 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
 * of a corrupted pack, and is better than failing 
outright.
 */
struct revindex_entry *revidx;
-   const unsigned char *base_sha1;
+   struct object_id base_oid;
revidx = find_pack_revindex(p, obj_offset);
if (revidx) {
-   base_sha1 = nth_packed_object_sha1(p, 
revidx->nr);
+   nth_packed_object_oid(_oid, p, revidx->nr);
error("failed to read delta base object %s"
  " at offset %"PRIuMAX" from %s",
- sha1_to_hex(base_sha1), 
(uintmax_t)obj_offset,
+ oid_to_hex(_oid), 
(uintmax_t)obj_offset,
  p->pack_name);
-   mark_bad_packed_object(p, base_sha1);
-   base = read_object(base_sha1, , 
_size);
+   mark_bad_packed_object(p, base_oid.hash);
+   base = read_object(_oid, , 
_size);
external_base = base;
}
}


[PATCH 22/36] sha1_file: convert assert_sha1_type to object_id

2018-02-19 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it to assert_oid_type.

Signed-off-by: brian m. carlson 
---
 builtin/commit-tree.c | 2 +-
 cache.h   | 2 +-
 commit.c  | 2 +-
 sha1_file.c   | 8 
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
index e5bdf57b1e..ecf42191da 100644
--- a/builtin/commit-tree.c
+++ b/builtin/commit-tree.c
@@ -58,7 +58,7 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
usage(commit_tree_usage);
if (get_oid_commit(argv[i], ))
die("Not a valid object name %s", argv[i]);
-   assert_sha1_type(oid.hash, OBJ_COMMIT);
+   assert_oid_type(, OBJ_COMMIT);
new_parent(lookup_commit(), );
continue;
}
diff --git a/cache.h b/cache.h
index f29ff43bbd..b1a117e404 100644
--- a/cache.h
+++ b/cache.h
@@ -1275,7 +1275,7 @@ extern int has_object_file_with_flags(const struct 
object_id *oid, int flags);
  */
 extern int has_loose_object_nonlocal(const unsigned char *sha1);
 
-extern void assert_sha1_type(const unsigned char *sha1, enum object_type 
expect);
+extern void assert_oid_type(const struct object_id *oid, enum object_type 
expect);
 
 /* Helper to check and "touch" a file */
 extern int check_and_freshen_file(const char *fn, int freshen);
diff --git a/commit.c b/commit.c
index e8a49b9c84..b9f028976f 100644
--- a/commit.c
+++ b/commit.c
@@ -1517,7 +1517,7 @@ int commit_tree_extended(const char *msg, size_t msg_len,
int encoding_is_utf8;
struct strbuf buffer;
 
-   assert_sha1_type(tree->hash, OBJ_TREE);
+   assert_oid_type(tree, OBJ_TREE);
 
if (memchr(msg, '\0', msg_len))
return error("a NUL byte in commit log message not allowed.");
diff --git a/sha1_file.c b/sha1_file.c
index 5dc85122c3..fa69d86309 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1966,13 +1966,13 @@ int read_pack_header(int fd, struct pack_header *header)
return 0;
 }
 
-void assert_sha1_type(const unsigned char *sha1, enum object_type expect)
+void assert_oid_type(const struct object_id *oid, enum object_type expect)
 {
-   enum object_type type = sha1_object_info(sha1, NULL);
+   enum object_type type = sha1_object_info(oid->hash, NULL);
if (type < 0)
-   die("%s is not a valid object", sha1_to_hex(sha1));
+   die("%s is not a valid object", oid_to_hex(oid));
if (type != expect)
-   die("%s is not a valid '%s' object", sha1_to_hex(sha1),
+   die("%s is not a valid '%s' object", oid_to_hex(oid),
typename(expect));
 }
 


[PATCH 33/36] sha1_file: convert read_sha1_file to struct object_id

2018-02-19 Thread brian m. carlson
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file.  Do the same for read_sha1_file_extended.

Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id.  Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(, E2, E3)

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(, E2, E3, E4)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)

Signed-off-by: brian m. carlson 
---
 apply.c  |  4 ++--
 archive.c|  2 +-
 bisect.c |  3 ++-
 blame.c  |  8 
 builtin/cat-file.c   | 14 --
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  4 ++--
 builtin/fmt-merge-msg.c  |  2 +-
 builtin/grep.c   |  2 +-
 builtin/index-pack.c |  5 +++--
 builtin/log.c|  2 +-
 builtin/merge-tree.c |  5 +++--
 builtin/mktag.c  |  2 +-
 builtin/notes.c  |  6 +++---
 builtin/pack-objects.c   | 16 ++--
 builtin/reflog.c |  2 +-
 builtin/tag.c|  4 ++--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c |  2 +-
 builtin/verify-commit.c  |  2 +-
 bundle.c |  2 +-
 cache.h  | 10 +-
 combine-diff.c   |  2 +-
 commit.c |  6 +++---
 config.c |  2 +-
 diff.c   |  2 +-
 dir.c|  2 +-
 entry.c  |  2 +-
 fast-import.c|  6 +++---
 fsck.c   |  2 +-
 grep.c   |  2 +-
 http-push.c  |  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 ++--
 merge-blobs.c|  4 ++--
 merge-recursive.c|  4 ++--
 notes-cache.c|  2 +-
 notes-merge.c|  2 +-
 notes.c  |  8 
 object.c |  2 +-
 read-cache.c |  4 ++--
 ref-filter.c |  2 +-
 remote-testsvn.c |  4 ++--
 rerere.c |  4 ++--
 sha1_file.c  | 20 ++--
 streaming.c  |  2 +-
 submodule-config.c   |  2 +-
 tag.c|  4 ++--
 tree-walk.c  |  4 ++--
 tree.c   |  2 +-
 xdiff-interface.c|  2 +-
 51 files changed, 104 insertions(+), 103 deletions(-)

diff --git a/apply.c b/apply.c
index 40a368b315..dd3cf99b7c 100644
--- a/apply.c
+++ b/apply.c
@@ -3180,7 +3180,7 @@ static int apply_binary(struct apply_state *state,
unsigned long size;
char *result;
 
-   result = read_sha1_file(oid.hash, , );
+   result = read_object_file(, , );
if (!result)
return error(_("the necessary postimage %s for "
   "'%s' cannot be read"),
@@ -3242,7 +3242,7 @@ static int read_blob_object(struct strbuf *buf, const 
struct object_id *oid, uns
unsigned long sz;
char *result;
 
-   result = read_sha1_file(oid->hash, , );
+   result = read_object_file(oid, , );
if (!result)
return -1;
/* XXX read_sha1_file NUL-terminates */
diff --git a/archive.c b/archive.c
index 1ab62d426e..93ab175b0b 100644
--- a/archive.c
+++ b/archive.c
@@ -72,7 +72,7 @@ void *object_file_to_archive(const struct archiver_args *args,
const struct commit *commit = args->convert ? args->commit : NULL;
 
path += args->baselen;
-   buffer = read_sha1_file(oid->hash, type, sizep);
+   buffer = read_object_file(oid, type, sizep);
if (buffer && S_ISREG(mode)) {
struct strbuf buf = STRBUF_INIT;
size_t size = 0;
diff --git a/bisect.c b/bisect.c
index f6d05bd66f..ad395bb2b8 100644
--- a/bisect.c
+++ b/bisect.c
@@ -132,7 +132,8 @@ static void show_list(const char *debug, int counted, int 
nr,
unsigned flags = commit->object.flags;
enum object_type type;
unsigned long size;
-   char *buf = read_sha1_file(commit->object.oid.hash, , 
);
+   char *buf = read_object_file(>object.oid, ,
+);
const char *subject_start;
int subject_len;
 
diff --git a/blame.c b/blame.c
index 3f9bd29615..e5ed80083e 100644
--- a/blame.c
+++ b/blame.c
@@ -297,8 +297,8 @@ static void 

[PATCH 30/36] streaming: convert istream internals to struct object_id

2018-02-19 Thread brian m. carlson
Convert the various open_istream variants to take a pointer to struct
object_id.  Introduce a temporary, which will be removed later, to work
around the fact that lookup_replace_object still returns a pointer to
unsigned char.

Signed-off-by: brian m. carlson 
---
 streaming.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/streaming.c b/streaming.c
index 344678e95f..3f4be1ea2c 100644
--- a/streaming.c
+++ b/streaming.c
@@ -14,7 +14,7 @@ enum input_source {
 
 typedef int (*open_istream_fn)(struct git_istream *,
   struct object_info *,
-  const unsigned char *,
+  const struct object_id *,
   enum object_type *);
 typedef int (*close_istream_fn)(struct git_istream *);
 typedef ssize_t (*read_istream_fn)(struct git_istream *, char *, size_t);
@@ -27,7 +27,7 @@ struct stream_vtbl {
 #define open_method_decl(name) \
int open_istream_ ##name \
(struct git_istream *st, struct object_info *oi, \
-const unsigned char *sha1, \
+const struct object_id *oid, \
 enum object_type *type)
 
 #define close_method_decl(name) \
@@ -142,13 +142,16 @@ struct git_istream *open_istream(const struct object_id 
*oid,
struct object_info oi = OBJECT_INFO_INIT;
const unsigned char *real = lookup_replace_object(oid->hash);
enum input_source src = istream_source(real, type, );
+   struct object_id realoid;
+
+   hashcpy(realoid.hash, real);
 
if (src < 0)
return NULL;
 
st = xmalloc(sizeof(*st));
-   if (open_istream_tbl[src](st, , real, type)) {
-   if (open_istream_incore(st, , real, type)) {
+   if (open_istream_tbl[src](st, , , type)) {
+   if (open_istream_incore(st, , , type)) {
free(st);
return NULL;
}
@@ -338,7 +341,7 @@ static struct stream_vtbl loose_vtbl = {
 
 static open_method_decl(loose)
 {
-   st->u.loose.mapped = map_sha1_file(sha1, >u.loose.mapsize);
+   st->u.loose.mapped = map_sha1_file(oid->hash, >u.loose.mapsize);
if (!st->u.loose.mapped)
return -1;
if ((unpack_sha1_header(>z,
@@ -489,7 +492,7 @@ static struct stream_vtbl incore_vtbl = {
 
 static open_method_decl(incore)
 {
-   st->u.incore.buf = read_sha1_file_extended(sha1, type, >size, 0);
+   st->u.incore.buf = read_sha1_file_extended(oid->hash, type, >size, 
0);
st->u.incore.read_ptr = 0;
st->vtbl = _vtbl;
 


[PATCH 35/36] sha1_file: introduce a constant for max header length

2018-02-19 Thread brian m. carlson
There were several instances of 32 sprinkled throughout this file, all
of which were used for allocating a buffer to store the header of an
object.  Introduce a constant, MAX_HEADER_LEN, for this purpose.

Note that this constant is slightly larger than required; the longest
possible header is 28 (7 for "commit", 1 for a space, 20 for a 63-bit
length in decimal, and 1 for the NUL).  However, the overallocation
should not cause any problems, so leave it as it is.

Signed-off-by: brian m. carlson 
---
 sha1_file.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index c41fbe2f01..4d4f9248c7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -30,6 +30,9 @@
 #include "packfile.h"
 #include "fetch-object.h"
 
+/* The maximum size for an object header. */
+#define MAX_HEADER_LEN 32
+
 const unsigned char null_sha1[GIT_MAX_RAWSZ];
 const struct object_id null_oid;
 const struct object_id empty_tree_oid = {
@@ -791,7 +794,7 @@ int check_object_signature(const struct object_id *oid, 
void *map,
enum object_type obj_type;
struct git_istream *st;
git_hash_ctx c;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen;
 
if (map) {
@@ -1150,7 +1153,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
unsigned long mapsize;
void *map;
git_zstream stream;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
struct strbuf hdrbuf = STRBUF_INIT;
unsigned long size_scratch;
 
@@ -1512,7 +1515,7 @@ static int write_buffer(int fd, const void *buf, size_t 
len)
 int hash_object_file(const void *buf, unsigned long len, const char *type,
 struct object_id *oid)
 {
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
write_object_file_prepare(buf, len, type, oid, hdr, );
return 0;
@@ -1667,7 +1670,7 @@ static int freshen_packed_object(const unsigned char 
*sha1)
 int write_object_file(const void *buf, unsigned long len, const char *type,
  struct object_id *oid)
 {
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
 
/* Normally if we have it in the pack then we do not bother writing
@@ -1687,7 +1690,7 @@ int hash_object_file_literally(const void *buf, unsigned 
long len,
int hdrlen, status = 0;
 
/* type string, SP, %lu of the length plus NUL must fit this */
-   hdrlen = strlen(type) + 32;
+   hdrlen = strlen(type) + MAX_HEADER_LEN;
header = xmalloc(hdrlen);
write_object_file_prepare(buf, len, type, oid, header, );
 
@@ -1707,7 +1710,7 @@ int force_object_loose(const struct object_id *oid, 
time_t mtime)
void *buf;
unsigned long len;
enum object_type type;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen;
int ret;
 
@@ -2189,7 +2192,7 @@ int read_loose_object(const char *path,
void *map = NULL;
unsigned long mapsize;
git_zstream stream;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
 
*contents = NULL;
 


[PATCH 25/36] Convert remaining callers of sha1_object_info_extended to object_id

2018-02-19 Thread brian m. carlson
Convert the remaining caller of sha1_object_info_extended to use struct
object_id.  Introduce temporaries, which will be removed later, since
there is a dependency loop between sha1_object_info_extended and
lookup_replace_object_extended.  This allows us to convert the code in a
piecemeal fashion instead of all at once.

Signed-off-by: brian m. carlson 
---
 sha1_file.c | 14 +++---
 streaming.c |  5 -
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index fa69d86309..19995766b6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1231,6 +1231,9 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
lookup_replace_object(sha1) :
sha1;
int already_retried = 0;
+   struct object_id realoid;
+
+   hashcpy(realoid.hash, real);
 
if (is_null_sha1(real))
return -1;
@@ -1295,7 +1298,7 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
rtype = packed_object_info(e.p, e.offset, oi);
if (rtype < 0) {
mark_bad_packed_object(e.p, real);
-   return sha1_object_info_extended(real, oi, 0);
+   return sha1_object_info_extended(realoid.hash, oi, 0);
} else if (oi->whence == OI_PACKED) {
oi->u.packed.offset = e.offset;
oi->u.packed.pack = e.p;
@@ -1323,13 +1326,16 @@ int sha1_object_info(const unsigned char *sha1, 
unsigned long *sizep)
 static void *read_object(const unsigned char *sha1, enum object_type *type,
 unsigned long *size)
 {
+   struct object_id oid;
struct object_info oi = OBJECT_INFO_INIT;
void *content;
oi.typep = type;
oi.sizep = size;
oi.contentp = 
 
-   if (sha1_object_info_extended(sha1, , 0) < 0)
+   hashcpy(oid.hash, sha1);
+
+   if (sha1_object_info_extended(oid.hash, , 0) < 0)
return NULL;
return content;
 }
@@ -1723,9 +1729,11 @@ int force_object_loose(const struct object_id *oid, 
time_t mtime)
 
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
+   struct object_id oid;
if (!startup_info->have_repository)
return 0;
-   return sha1_object_info_extended(sha1, NULL,
+   hashcpy(oid.hash, sha1);
+   return sha1_object_info_extended(oid.hash, NULL,
 flags | OBJECT_INFO_SKIP_CACHED) >= 0;
 }
 
diff --git a/streaming.c b/streaming.c
index be85507922..042d6082e8 100644
--- a/streaming.c
+++ b/streaming.c
@@ -111,10 +111,13 @@ static enum input_source istream_source(const unsigned 
char *sha1,
 {
unsigned long size;
int status;
+   struct object_id oid;
+
+   hashcpy(oid.hash, sha1);
 
oi->typep = type;
oi->sizep = 
-   status = sha1_object_info_extended(sha1, oi, 0);
+   status = sha1_object_info_extended(oid.hash, oi, 0);
if (status < 0)
return stream_error;
 


[PATCH 27/36] builtin/fmt-merge-msg: convert remaining code to object_id

2018-02-19 Thread brian m. carlson
We were using the util pointer, which is a pointer to void, as an
unsigned char pointer.  The pointer actually points to a struct
origin_data, which has a struct object_id as its first member, which in
turn has an unsigned char array as its first member, so this was valid.
Since we want to convert this to struct object_id, simply change the
pointer we're using.

Signed-off-by: brian m. carlson 
---
 builtin/fmt-merge-msg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 8e8a15ea4a..0bc0dda99c 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -485,10 +485,10 @@ static void fmt_merge_msg_sigs(struct strbuf *out)
struct strbuf tagbuf = STRBUF_INIT;
 
for (i = 0; i < origins.nr; i++) {
-   unsigned char *sha1 = origins.items[i].util;
+   struct object_id *oid = origins.items[i].util;
enum object_type type;
unsigned long size, len;
-   char *buf = read_sha1_file(sha1, , );
+   char *buf = read_sha1_file(oid->hash, , );
struct strbuf sig = STRBUF_INIT;
 
if (!buf || type != OBJ_TAG)


[PATCH 26/36] sha1_file: convert sha1_object_info* to object_id

2018-02-19 Thread brian m. carlson
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names.  Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:

@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(, E2)

@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)

@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(, E2, E3)

@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
---
 archive-tar.c|  2 +-
 archive-zip.c|  2 +-
 blame.c  |  4 ++--
 builtin/blame.c  |  2 +-
 builtin/cat-file.c   | 14 +++---
 builtin/describe.c   |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fetch.c  |  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/ls-tree.c|  2 +-
 builtin/mktree.c |  2 +-
 builtin/pack-objects.c   |  7 +++
 builtin/prune.c  |  2 +-
 builtin/replace.c| 10 +-
 builtin/tag.c|  4 ++--
 builtin/unpack-objects.c |  2 +-
 cache.h  |  6 +++---
 diff.c   |  2 +-
 fast-import.c| 10 +-
 list-objects-filter.c|  2 +-
 object.c |  2 +-
 pack-bitmap-write.c  |  3 +--
 packfile.c   |  4 ++--
 reachable.c  |  2 +-
 refs.c   |  2 +-
 remote.c |  2 +-
 sequencer.c  |  3 ++-
 sha1_file.c  | 22 +++---
 sha1_name.c  | 12 ++--
 streaming.c  |  2 +-
 submodule.c  |  2 +-
 tag.c|  2 +-
 33 files changed, 71 insertions(+), 72 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 7a0d31d847..3563bcb9f2 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -276,7 +276,7 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path, pathlen);
 
if (S_ISREG(mode) && !args->convert &&
-   sha1_object_info(oid->hash, ) == OBJ_BLOB &&
+   oid_object_info(oid, ) == OBJ_BLOB &&
size > big_file_threshold)
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
diff --git a/archive-zip.c b/archive-zip.c
index 18b951b732..6b20bce4d1 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -325,7 +325,7 @@ static int write_zip_entry(struct archiver_args *args,
compressed_size = 0;
buffer = NULL;
} else if (S_ISREG(mode) || S_ISLNK(mode)) {
-   enum object_type type = sha1_object_info(oid->hash, );
+   enum object_type type = oid_object_info(oid, );
 
method = 0;
attr2 = S_ISLNK(mode) ? ((mode | 0777) << 16) :
diff --git a/blame.c b/blame.c
index 1fc22b304b..c1df10cdd2 100644
--- a/blame.c
+++ b/blame.c
@@ -81,7 +81,7 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
unsigned mode;
 
if (!get_tree_entry(commit_oid->hash, path, blob_oid.hash, 
) &&
-   sha1_object_info(blob_oid.hash, NULL) == OBJ_BLOB)
+   oid_object_info(_oid, NULL) == OBJ_BLOB)
return;
}
 
@@ -506,7 +506,7 @@ static int fill_blob_sha1_and_mode(struct blame_origin 
*origin)
   origin->path,
   origin->blob_oid.hash, >mode))
goto error_out;
-   if (sha1_object_info(origin->blob_oid.hash, NULL) != OBJ_BLOB)
+   if (oid_object_info(>blob_oid, NULL) != OBJ_BLOB)
goto error_out;
return 0;
  error_out:
diff --git a/builtin/blame.c b/builtin/blame.c
index b980e8a1dd..f1a2fd6702 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -655,7 +655,7 @@ static int is_a_rev(const char *name)
 
if (get_oid(name, ))
return 0;
-   return OBJ_NONE < sha1_object_info(oid.hash, NULL);
+   return OBJ_NONE < oid_object_info(, NULL);
 }
 
 int cmd_blame(int argc, const char **argv, const char *prefix)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index cf9ea5c796..3264bada39 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -77,7 +77,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
switch (opt) {
case 't':
oi.typename = 
-   if (sha1_object_info_extended(oid.hash, , flags) < 0)
+   if (oid_object_info_extended(, , flags) < 0)
die("git cat-file: could not get object info");
if (sb.len) {
printf("%s\n", sb.buf);
@@ -88,7 +88,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
 
  

[PATCH 23/36] sha1_file: convert retry_bad_packed_offset to struct object_id

2018-02-19 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 packfile.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index 5d07f330c8..bfa6438b4e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1095,13 +1095,13 @@ static int retry_bad_packed_offset(struct packed_git 
*p, off_t obj_offset)
 {
int type;
struct revindex_entry *revidx;
-   const unsigned char *sha1;
+   struct object_id oid;
revidx = find_pack_revindex(p, obj_offset);
if (!revidx)
return OBJ_BAD;
-   sha1 = nth_packed_object_sha1(p, revidx->nr);
-   mark_bad_packed_object(p, sha1);
-   type = sha1_object_info(sha1, NULL);
+   nth_packed_object_oid(, p, revidx->nr);
+   mark_bad_packed_object(p, oid.hash);
+   type = sha1_object_info(oid.hash, NULL);
if (type <= OBJ_NONE)
return OBJ_BAD;
return type;


[PATCH 34/36] Convert lookup_replace_object to struct object_id

2018-02-19 Thread brian m. carlson
Convert both the argument and the return value to be pointers to struct
object_id.  Update the callers and their internals to deal with the new
type.  Remove several temporaries which are no longer needed.

Signed-off-by: brian m. carlson 
---
 builtin/mktag.c  |  7 ++-
 cache.h  |  8 
 object.c | 14 --
 replace_object.c | 14 +++---
 sha1_file.c  | 44 
 streaming.c  | 16 +---
 6 files changed, 42 insertions(+), 61 deletions(-)

diff --git a/builtin/mktag.c b/builtin/mktag.c
index cfb777b3c8..9f5a50a8fd 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -24,14 +24,11 @@ static int verify_object(const struct object_id *oid, const 
char *expected_type)
enum object_type type;
unsigned long size;
void *buffer = read_object_file(oid, , );
-   const unsigned char *repl = lookup_replace_object(oid->hash);
+   const struct object_id *repl = lookup_replace_object(oid);
 
if (buffer) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
if (type == type_from_string(expected_type))
-   ret = check_object_signature(, buffer, size, 
expected_type);
+   ret = check_object_signature(repl, buffer, size, 
expected_type);
free(buffer);
}
return ret;
diff --git a/cache.h b/cache.h
index 970d6edd04..a70c52b22f 100644
--- a/cache.h
+++ b/cache.h
@@ -1197,7 +1197,7 @@ static inline void *read_object_file(const struct 
object_id *oid, enum object_ty
  * This internal function is only declared here for the benefit of
  * lookup_replace_object().  Please do not call it directly.
  */
-extern const unsigned char *do_lookup_replace_object(const unsigned char 
*sha1);
+extern const struct object_id *do_lookup_replace_object(const struct object_id 
*oid);
 
 /*
  * If object sha1 should be replaced, return the replacement object's
@@ -1205,11 +1205,11 @@ extern const unsigned char 
*do_lookup_replace_object(const unsigned char *sha1);
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
  */
-static inline const unsigned char *lookup_replace_object(const unsigned char 
*sha1)
+static inline const struct object_id *lookup_replace_object(const struct 
object_id *oid)
 {
if (!check_replace_refs)
-   return sha1;
-   return do_lookup_replace_object(sha1);
+   return oid;
+   return do_lookup_replace_object(oid);
 }
 
 /* Read and unpack an object file into memory, write memory to an object file 
*/
diff --git a/object.c b/object.c
index ea1a6f18e8..4e1c065d55 100644
--- a/object.c
+++ b/object.c
@@ -244,7 +244,7 @@ struct object *parse_object(const struct object_id *oid)
unsigned long size;
enum object_type type;
int eaten;
-   const unsigned char *repl = lookup_replace_object(oid->hash);
+   const struct object_id *repl = lookup_replace_object(oid);
void *buffer;
struct object *obj;
 
@@ -255,10 +255,7 @@ struct object *parse_object(const struct object_id *oid)
if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
(!obj && has_object_file(oid) &&
 oid_object_info(oid, NULL) == OBJ_BLOB)) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
-   if (check_object_signature(, NULL, 0, NULL) < 0) {
+   if (check_object_signature(repl, NULL, 0, NULL) < 0) {
error("sha1 mismatch %s", oid_to_hex(oid));
return NULL;
}
@@ -268,12 +265,9 @@ struct object *parse_object(const struct object_id *oid)
 
buffer = read_object_file(oid, , );
if (buffer) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
-   if (check_object_signature(, buffer, size, 
typename(type)) < 0) {
+   if (check_object_signature(repl, buffer, size, typename(type)) 
< 0) {
free(buffer);
-   error("sha1 mismatch %s", sha1_to_hex(repl));
+   error("sha1 mismatch %s", oid_to_hex(repl));
return NULL;
}
 
diff --git a/replace_object.c b/replace_object.c
index 232e8b8550..336357394d 100644
--- a/replace_object.c
+++ b/replace_object.c
@@ -92,16 +92,16 @@ static void prepare_replace_object(void)
 #define MAXREPLACEDEPTH 5
 
 /*
- * If a replacement for object sha1 has been set up, return the
+ * If a replacement for object oid has been set up, return the
  * replacement object's name (replaced recursively, if necessary).
- * The return value is either sha1 or a pointer to a
+ * The return value is either oid or a pointer to a
  * permanently-allocated value.  This function always 

[PATCH 36/36] convert: convert to struct object_id

2018-02-19 Thread brian m. carlson
Convert convert.c to struct object_id.  Add a use of the_hash_algo to
replace hard-coded constants and change a strbuf_add to a strbuf_addstr
to avoid another hard-coded constant.

Note that a strict conversion using the hexsz constant would cause
problems in the future if the internal and user-visible hash algorithms
differed, as anticipated by the hash function transition plan.

Signed-off-by: brian m. carlson 
---
 convert.c | 12 ++--
 convert.h |  2 +-
 entry.c   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/convert.c b/convert.c
index cc562f6509..c480097a2a 100644
--- a/convert.c
+++ b/convert.c
@@ -914,7 +914,7 @@ static int ident_to_worktree(const char *path, const char 
*src, size_t len,
to_free = strbuf_detach(buf, NULL);
hash_object_file(src, len, "blob", );
 
-   strbuf_grow(buf, len + cnt * 43);
+   strbuf_grow(buf, len + cnt * (the_hash_algo->hexsz + 3));
for (;;) {
/* step 1: run to the next '$' */
dollar = memchr(src, '$', len);
@@ -1510,7 +1510,7 @@ struct ident_filter {
struct stream_filter filter;
struct strbuf left;
int state;
-   char ident[45]; /* ": x40 $" */
+   char ident[GIT_MAX_HEXSZ + 5]; /* ": x40 $" */
 };
 
 static int is_foreign_ident(const char *str)
@@ -1635,12 +1635,12 @@ static struct stream_filter_vtbl ident_vtbl = {
ident_free_fn,
 };
 
-static struct stream_filter *ident_filter(const unsigned char *sha1)
+static struct stream_filter *ident_filter(const struct object_id *oid)
 {
struct ident_filter *ident = xmalloc(sizeof(*ident));
 
xsnprintf(ident->ident, sizeof(ident->ident),
- ": %s $", sha1_to_hex(sha1));
+ ": %s $", oid_to_hex(oid));
strbuf_init(>left, 0);
ident->filter.vtbl = _vtbl;
ident->state = 0;
@@ -1655,7 +1655,7 @@ static struct stream_filter *ident_filter(const unsigned 
char *sha1)
  * Note that you would be crazy to set CRLF, smuge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const char *path, const unsigned char 
*sha1)
+struct stream_filter *get_stream_filter(const char *path, const struct 
object_id *oid)
 {
struct conv_attrs ca;
struct stream_filter *filter = NULL;
@@ -1668,7 +1668,7 @@ struct stream_filter *get_stream_filter(const char *path, 
const unsigned char *s
return NULL;
 
if (ca.ident)
-   filter = ident_filter(sha1);
+   filter = ident_filter(oid);
 
if (output_eol(ca.crlf_action) == EOL_CRLF)
filter = cascade_filter(filter, lf_to_crlf_filter());
diff --git a/convert.h b/convert.h
index 65ab3e5167..2e9b4f49cc 100644
--- a/convert.h
+++ b/convert.h
@@ -93,7 +93,7 @@ extern int would_convert_to_git_filter_fd(const char *path);
 
 struct stream_filter; /* opaque */
 
-extern struct stream_filter *get_stream_filter(const char *path, const 
unsigned char *);
+extern struct stream_filter *get_stream_filter(const char *path, const struct 
object_id *);
 extern void free_stream_filter(struct stream_filter *);
 extern int is_null_stream_filter(struct stream_filter *);
 
diff --git a/entry.c b/entry.c
index 3f21d2c913..49b83f4a97 100644
--- a/entry.c
+++ b/entry.c
@@ -266,7 +266,7 @@ static int write_entry(struct cache_entry *ce,
 
if (ce_mode_s_ifmt == S_IFREG) {
struct stream_filter *filter = get_stream_filter(ce->name,
-ce->oid.hash);
+>oid);
if (filter &&
!streaming_write_entry(ce, path, filter,
   state, to_tempfile,


[PATCH 32/36] sha1_file: convert read_object_with_reference to object_id

2018-02-19 Thread brian m. carlson
Convert read_object_with_reference to take pointers to struct object_id.
Update the internals of the function accordingly.

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c |  2 +-
 builtin/grep.c |  4 ++--
 builtin/pack-objects.c |  2 +-
 cache.h|  4 ++--
 fast-import.c  | 15 ---
 sha1_file.c| 20 ++--
 tree-walk.c|  9 -
 7 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 3264bada39..17382f2fa4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -159,7 +159,7 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 * fall-back to the usual case.
 */
}
-   buf = read_object_with_reference(oid.hash, exp_type, , 
NULL);
+   buf = read_object_with_reference(, exp_type, , NULL);
break;
 
default:
diff --git a/builtin/grep.c b/builtin/grep.c
index 3ca4ac80d8..7c2843a492 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -445,7 +445,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
object = parse_object_or_die(oid, oid_to_hex(oid));
 
grep_read_lock();
-   data = read_object_with_reference(object->oid.hash, tree_type,
+   data = read_object_with_reference(>oid, tree_type,
  , NULL);
grep_read_unlock();
 
@@ -607,7 +607,7 @@ static int grep_object(struct grep_opt *opt, const struct 
pathspec *pathspec,
int hit, len;
 
grep_read_lock();
-   data = read_object_with_reference(obj->oid.hash, tree_type,
+   data = read_object_with_reference(>oid, tree_type,
  , NULL);
grep_read_unlock();
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f8148eb9d5..16d6069e16 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1351,7 +1351,7 @@ static void add_preferred_base(struct object_id *oid)
if (window <= num_preferred_base++)
return;
 
-   data = read_object_with_reference(oid->hash, tree_type, , 
tree_oid.hash);
+   data = read_object_with_reference(oid, tree_type, , _oid);
if (!data)
return;
 
diff --git a/cache.h b/cache.h
index b965a073f2..c482fb92fe 100644
--- a/cache.h
+++ b/cache.h
@@ -1431,10 +1431,10 @@ extern int df_name_compare(const char *name1, int len1, 
int mode1, const char *n
 extern int name_compare(const char *name1, size_t len1, const char *name2, 
size_t len2);
 extern int cache_name_stage_compare(const char *name1, int len1, int stage1, 
const char *name2, int len2, int stage2);
 
-extern void *read_object_with_reference(const unsigned char *sha1,
+extern void *read_object_with_reference(const struct object_id *oid,
const char *required_type,
unsigned long *size,
-   unsigned char *sha1_ret);
+   struct object_id *oid_ret);
 
 extern struct object *peel_to_type(const char *name, int namelen,
   struct object *o, enum object_type);
diff --git a/fast-import.c b/fast-import.c
index 71b60f9068..004a9c9f99 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2583,8 +2583,9 @@ static void note_change_n(const char *p, struct branch 
*b, unsigned char *old_fa
oidcpy(_oid, _oe->idx.oid);
} else if (!get_oid(p, _oid)) {
unsigned long size;
-   char *buf = read_object_with_reference(commit_oid.hash,
-   commit_type, , commit_oid.hash);
+   char *buf = read_object_with_reference(_oid,
+  commit_type, ,
+  _oid);
if (!buf || size < 46)
die("Not a valid commit: %s", p);
free(buf);
@@ -2653,9 +2654,8 @@ static void parse_from_existing(struct branch *b)
unsigned long size;
char *buf;
 
-   buf = read_object_with_reference(b->oid.hash,
-commit_type, ,
-b->oid.hash);
+   buf = read_object_with_reference(>oid, commit_type, ,
+>oid);
parse_from_commit(b, buf, size);
free(buf);
}
@@ -2732,8 +2732,9 @@ static struct hash_list *parse_merge(unsigned int *count)
oidcpy(>oid, >idx.oid);
} else if (!get_oid(from, >oid)) {

[PATCH 28/36] builtin/notes: convert static functions to object_id

2018-02-19 Thread brian m. carlson
Convert the remaining static functions to take pointers to struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/notes.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 39304ba743..08ab9d7130 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -118,11 +118,11 @@ static int list_each_note(const struct object_id 
*object_oid,
return 0;
 }
 
-static void copy_obj_to_fd(int fd, const unsigned char *sha1)
+static void copy_obj_to_fd(int fd, const struct object_id *oid)
 {
unsigned long size;
enum object_type type;
-   char *buf = read_sha1_file(sha1, , );
+   char *buf = read_sha1_file(oid->hash, , );
if (buf) {
if (size)
write_or_die(fd, buf, size);
@@ -162,7 +162,7 @@ static void write_commented_object(int fd, const struct 
object_id *object)
 }
 
 static void prepare_note_data(const struct object_id *object, struct note_data 
*d,
-   const unsigned char *old_note)
+   const struct object_id *old_note)
 {
if (d->use_editor || !d->given) {
int fd;
@@ -457,7 +457,7 @@ static int add(int argc, const char **argv, const char 
*prefix)
oid_to_hex());
}
 
-   prepare_note_data(, , note ? note->hash : NULL);
+   prepare_note_data(, , note);
if (d.buf.len || allow_empty) {
write_note_data(, _note);
if (add_note(t, , _note, combine_notes_overwrite))
@@ -602,7 +602,7 @@ static int append_edit(int argc, const char **argv, const 
char *prefix)
t = init_notes_check(argv[0], NOTES_INIT_WRITABLE);
note = get_note(t, );
 
-   prepare_note_data(, , edit && note ? note->hash : NULL);
+   prepare_note_data(, , edit && note ? note : NULL);
 
if (note && !edit) {
/* Append buf to previous note contents */


[PATCH 08/36] strbuf: convert strbuf_add_unique_abbrev to use struct object_id

2018-02-19 Thread brian m. carlson
Convert the declaration and definition of strbuf_add_unique_abbrev to
make it take a pointer to struct object_id.  Predeclare the struct in
strbuf.h, as cache.h includes strbuf.h before it declares the struct,
and otherwise the struct declaration would have the wrong scope.

Apply the following semantic patch, along with the standard object_id
transforms, to adjust the callers:

@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2.hash, E3);
+ strbuf_add_unique_abbrev(E1, , E3);

@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2->hash, E3);
+ strbuf_add_unique_abbrev(E1, E2, E3);

Signed-off-by: brian m. carlson 
---
 builtin/checkout.c | 2 +-
 builtin/fetch.c| 8 
 builtin/tag.c  | 2 +-
 merge-recursive.c  | 2 +-
 pretty.c   | 8 
 strbuf.c   | 4 ++--
 strbuf.h   | 8 +++-
 submodule.c| 4 ++--
 transport.c| 4 ++--
 wt-status.c| 6 +++---
 10 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 87182aa018..bc68d423e5 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -720,7 +720,7 @@ static int add_pending_uninteresting_ref(const char 
*refname,
 static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
 {
strbuf_addstr(sb, "  ");
-   strbuf_add_unique_abbrev(sb, commit->object.oid.hash, DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(sb, >object.oid, DEFAULT_ABBREV);
strbuf_addch(sb, ' ');
if (!parse_commit(commit))
pp_commit_easy(CMIT_FMT_ONELINE, commit, sb);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8ee998ea2e..3b8988e51d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -697,9 +697,9 @@ static int update_local_ref(struct ref *ref,
if (in_merge_bases(current, updated)) {
struct strbuf quickref = STRBUF_INIT;
int r;
-   strbuf_add_unique_abbrev(, current->object.oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "..");
-   strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(>new_oid);
@@ -712,9 +712,9 @@ static int update_local_ref(struct ref *ref,
} else if (force || ref->force) {
struct strbuf quickref = STRBUF_INIT;
int r;
-   strbuf_add_unique_abbrev(, current->object.oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "...");
-   strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(>new_oid);
diff --git a/builtin/tag.c b/builtin/tag.c
index 8885e21ddc..99dba2d907 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -289,7 +289,7 @@ static void create_reflog_msg(const struct object_id *oid, 
struct strbuf *sb)
strbuf_addstr(sb, rla);
} else {
strbuf_addstr(sb, "tag: tagging ");
-   strbuf_add_unique_abbrev(sb, oid->hash, DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(sb, oid, DEFAULT_ABBREV);
}
 
strbuf_addstr(sb, " (");
diff --git a/merge-recursive.c b/merge-recursive.c
index c886f2e9cd..f58f13957e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -228,7 +228,7 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
strbuf_addf(>obuf, "virtual %s\n",
merge_remote_util(commit)->name);
else {
-   strbuf_add_unique_abbrev(>obuf, commit->object.oid.hash,
+   strbuf_add_unique_abbrev(>obuf, >object.oid,
 DEFAULT_ABBREV);
strbuf_addch(>obuf, ' ');
if (parse_commit(commit) != 0)
diff --git a/pretty.c b/pretty.c
index f7ce490230..34fe891fc0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -549,7 +549,7 @@ static void add_merge_info(const struct 
pretty_print_context *pp,
struct object_id *oidp = >item->object.oid;
strbuf_addch(sb, ' ');
if (pp->abbrev)
-   strbuf_add_unique_abbrev(sb, oidp->hash, pp->abbrev);
+   strbuf_add_unique_abbrev(sb, oidp, pp->abbrev);
else
strbuf_addstr(sb, oid_to_hex(oidp));
parent = parent->next;
@@ 

[PATCH 04/36] cache-tree: convert remnants to struct object_id

2018-02-19 Thread brian m. carlson
Convert the remaining portions of cache-tree.c to use struct object_id.
Convert several instances of 20 to use the_hash_algo instead.

Signed-off-by: brian m. carlson 
---
 cache-tree.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index ba07a8067e..601dd1bb1f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -320,7 +320,7 @@ static int update_one(struct cache_tree *it,
struct cache_tree_sub *sub = NULL;
const char *path, *slash;
int pathlen, entlen;
-   const unsigned char *sha1;
+   const struct object_id *oid;
unsigned mode;
int expected_missing = 0;
int contains_ita = 0;
@@ -338,7 +338,7 @@ static int update_one(struct cache_tree *it,
die("cache-tree.c: '%.*s' in '%s' not found",
entlen, path + baselen, path);
i += sub->count;
-   sha1 = sub->cache_tree->oid.hash;
+   oid = >cache_tree->oid;
mode = S_IFDIR;
contains_ita = sub->cache_tree->entry_count < 0;
if (contains_ita) {
@@ -347,19 +347,19 @@ static int update_one(struct cache_tree *it,
}
}
else {
-   sha1 = ce->oid.hash;
+   oid = >oid;
mode = ce->ce_mode;
entlen = pathlen - baselen;
i++;
}
 
-   if (is_null_sha1(sha1) ||
-   (mode != S_IFGITLINK && !missing_ok && 
!has_sha1_file(sha1))) {
+   if (is_null_oid(oid) ||
+   (mode != S_IFGITLINK && !missing_ok && 
!has_object_file(oid))) {
strbuf_release();
if (expected_missing)
return -1;
return error("invalid object %06o %s for '%.*s'",
-   mode, sha1_to_hex(sha1), entlen+baselen, path);
+   mode, oid_to_hex(oid), entlen+baselen, path);
}
 
/*
@@ -385,12 +385,12 @@ static int update_one(struct cache_tree *it,
/*
 * "sub" can be an empty tree if all subentries are i-t-a.
 */
-   if (contains_ita && !hashcmp(sha1, EMPTY_TREE_SHA1_BIN))
+   if (contains_ita && !oidcmp(oid, _tree_oid))
continue;
 
strbuf_grow(, entlen + 100);
strbuf_addf(, "%o %.*s%c", mode, entlen, path + baselen, 
'\0');
-   strbuf_add(, sha1, 20);
+   strbuf_add(, oid->hash, the_hash_algo->rawsz);
 
 #if DEBUG
fprintf(stderr, "cache-tree update-one %o %.*s\n",
@@ -401,7 +401,7 @@ static int update_one(struct cache_tree *it,
if (repair) {
struct object_id oid;
hash_object_file(buffer.buf, buffer.len, tree_type, );
-   if (has_sha1_file(oid.hash))
+   if (has_object_file())
oidcpy(>oid, );
else
to_invalidate = 1;
@@ -465,7 +465,7 @@ static void write_one(struct strbuf *buffer, struct 
cache_tree *it,
 #endif
 
if (0 <= it->entry_count) {
-   strbuf_add(buffer, it->oid.hash, 20);
+   strbuf_add(buffer, it->oid.hash, the_hash_algo->rawsz);
}
for (i = 0; i < it->subtree_nr; i++) {
struct cache_tree_sub *down = it->down[i];
@@ -520,11 +520,11 @@ static struct cache_tree *read_one(const char **buffer, 
unsigned long *size_p)
goto free_return;
buf++; size--;
if (0 <= it->entry_count) {
-   if (size < 20)
+   if (size < the_hash_algo->rawsz)
goto free_return;
hashcpy(it->oid.hash, (const unsigned char*)buf);
-   buf += 20;
-   size -= 20;
+   buf += the_hash_algo->rawsz;
+   size -= the_hash_algo->rawsz;
}
 
 #if DEBUG


[PATCH 10/36] Convert find_unique_abbrev* to struct object_id

2018-02-19 Thread brian m. carlson
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/blame.c|  2 +-
 builtin/branch.c   |  2 +-
 builtin/checkout.c |  6 +++---
 builtin/describe.c |  4 ++--
 builtin/log.c  |  4 ++--
 builtin/ls-files.c |  4 ++--
 builtin/ls-tree.c  |  4 ++--
 builtin/merge.c|  6 +++---
 builtin/name-rev.c |  2 +-
 builtin/receive-pack.c |  8 
 builtin/reset.c|  2 +-
 builtin/rev-list.c |  2 +-
 builtin/rev-parse.c|  2 +-
 builtin/show-branch.c  |  2 +-
 builtin/show-ref.c |  4 ++--
 builtin/tag.c  |  6 --
 builtin/worktree.c |  4 ++--
 cache.h|  6 +++---
 combine-diff.c |  4 ++--
 diff.c |  2 +-
 log-tree.c | 12 ++--
 ref-filter.c   |  4 ++--
 sequencer.c|  2 +-
 sha1_name.c| 12 ++--
 strbuf.c   |  2 +-
 tag.c  |  4 ++--
 transport.c|  2 +-
 wt-status.c|  6 +++---
 28 files changed, 61 insertions(+), 59 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b90..b980e8a1dd 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -499,7 +499,7 @@ static int read_ancestry(const char *graft_file)
 
 static int update_auto_abbrev(int auto_abbrev, struct blame_origin *suspect)
 {
-   const char *uniq = find_unique_abbrev(suspect->commit->object.oid.hash,
+   const char *uniq = find_unique_abbrev(>commit->object.oid,
  auto_abbrev);
int len = strlen(uniq);
if (auto_abbrev < len)
diff --git a/builtin/branch.c b/builtin/branch.c
index 8dcc2ed058..659deb36f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -273,7 +273,7 @@ static int delete_branches(int argc, const char **argv, int 
force, int kinds,
   bname.buf,
   (flags & REF_ISBROKEN) ? "broken"
   : (flags & REF_ISSYMREF) ? target
-  : find_unique_abbrev(oid.hash, DEFAULT_ABBREV));
+  : find_unique_abbrev(, DEFAULT_ABBREV));
}
delete_branch_config(bname.buf);
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index bc68d423e5..e6ee955671 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -405,10 +405,10 @@ static void describe_detached_head(const char *msg, 
struct commit *commit)
pp_commit_easy(CMIT_FMT_ONELINE, commit, );
if (print_sha1_ellipsis()) {
fprintf(stderr, "%s %s... %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV), sb.buf);
} else {
fprintf(stderr, "%s %s %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV), sb.buf);
}
strbuf_release();
 }
@@ -778,7 +778,7 @@ static void suggest_reattach(struct commit *commit, struct 
rev_info *revs)
" git branch  %s\n\n",
/* Give ngettext() the count */
lost),
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV));
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV));
 }
 
 /*
diff --git a/builtin/describe.c b/builtin/describe.c
index e4869df7b4..7e6535a8bd 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -285,7 +285,7 @@ static void append_name(struct commit_name *n, struct 
strbuf *dst)
 
 static void append_suffix(int depth, const struct object_id *oid, struct 
strbuf *dst)
 {
-   strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, 
abbrev));
+   strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid, abbrev));
 }
 
 static void describe_commit(struct object_id *oid, struct strbuf *dst)
@@ -383,7 +383,7 @@ static void describe_commit(struct object_id *oid, struct 
strbuf *dst)
if (!match_cnt) {
struct object_id *cmit_oid = >object.oid;
if (always) {
-   strbuf_add_unique_abbrev(dst, cmit_oid->hash, abbrev);
+   strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
if (suffix)
strbuf_addstr(dst, suffix);
return;
diff --git a/builtin/log.c b/builtin/log.c
index 32a744bfd2..411339c1ed 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1873,12 +1873,12 @@ static void print_commit(char sign, struct commit 
*commit, int verbose,
 {
if (!verbose) {
fprintf(file, "%c %s\n", sign,
-  

[PATCH 12/36] send-pack: convert remaining functions to struct object_id

2018-02-19 Thread brian m. carlson
Convert the remaining function, feed_object, to use struct object_id.

Signed-off-by: brian m. carlson 
---
 send-pack.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 8d9190f5e7..19025a7aca 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -37,14 +37,14 @@ int option_parse_push_signed(const struct option *opt,
die("bad %s argument: %s", opt->long_name, arg);
 }
 
-static void feed_object(const unsigned char *sha1, FILE *fh, int negative)
+static void feed_object(const struct object_id *oid, FILE *fh, int negative)
 {
-   if (negative && !has_sha1_file(sha1))
+   if (negative && !has_sha1_file(oid->hash))
return;
 
if (negative)
putc('^', fh);
-   fputs(sha1_to_hex(sha1), fh);
+   fputs(oid_to_hex(oid), fh);
putc('\n', fh);
 }
 
@@ -89,13 +89,13 @@ static int pack_objects(int fd, struct ref *refs, struct 
oid_array *extra, struc
 */
po_in = xfdopen(po.in, "w");
for (i = 0; i < extra->nr; i++)
-   feed_object(extra->oid[i].hash, po_in, 1);
+   feed_object(>oid[i], po_in, 1);
 
while (refs) {
if (!is_null_oid(>old_oid))
-   feed_object(refs->old_oid.hash, po_in, 1);
+   feed_object(>old_oid, po_in, 1);
if (!is_null_oid(>new_oid))
-   feed_object(refs->new_oid.hash, po_in, 0);
+   feed_object(>new_oid, po_in, 0);
refs = refs->next;
}
 


[PATCH 05/36] resolve-undo: convert struct resolve_undo_info to object_id

2018-02-19 Thread brian m. carlson
Convert the sha1 member of this struct to be an array of struct
object_id instead.  This change is needed to convert find_unique_abbrev.

Signed-off-by: brian m. carlson 
---
 builtin/ls-files.c | 2 +-
 resolve-undo.c | 8 
 resolve-undo.h | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 2fc836e330..9df66ba307 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -271,7 +271,7 @@ static void show_ru_info(const struct index_state *istate)
if (!ui->mode[i])
continue;
printf("%s%06o %s %d\t", tag_resolve_undo, ui->mode[i],
-  find_unique_abbrev(ui->sha1[i], abbrev),
+  find_unique_abbrev(ui->oid[i].hash, abbrev),
   i + 1);
write_name(path);
}
diff --git a/resolve-undo.c b/resolve-undo.c
index b40f3173d3..109c658a85 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -24,7 +24,7 @@ void record_resolve_undo(struct index_state *istate, struct 
cache_entry *ce)
if (!lost->util)
lost->util = xcalloc(1, sizeof(*ui));
ui = lost->util;
-   hashcpy(ui->sha1[stage - 1], ce->oid.hash);
+   oidcpy(>oid[stage - 1], >oid);
ui->mode[stage - 1] = ce->ce_mode;
 }
 
@@ -44,7 +44,7 @@ void resolve_undo_write(struct strbuf *sb, struct string_list 
*resolve_undo)
for (i = 0; i < 3; i++) {
if (!ui->mode[i])
continue;
-   strbuf_add(sb, ui->sha1[i], 20);
+   strbuf_add(sb, ui->oid[i].hash, the_hash_algo->rawsz);
}
}
 }
@@ -89,7 +89,7 @@ struct string_list *resolve_undo_read(const char *data, 
unsigned long size)
continue;
if (size < 20)
goto error;
-   hashcpy(ui->sha1[i], (const unsigned char *)data);
+   hashcpy(ui->oid[i].hash, (const unsigned char *)data);
size -= 20;
data += 20;
}
@@ -145,7 +145,7 @@ int unmerge_index_entry_at(struct index_state *istate, int 
pos)
struct cache_entry *nce;
if (!ru->mode[i])
continue;
-   nce = make_cache_entry(ru->mode[i], ru->sha1[i],
+   nce = make_cache_entry(ru->mode[i], ru->oid[i].hash,
   name, i + 1, 0);
if (matched)
nce->ce_flags |= CE_MATCHED;
diff --git a/resolve-undo.h b/resolve-undo.h
index 46306455ed..87291904bd 100644
--- a/resolve-undo.h
+++ b/resolve-undo.h
@@ -3,7 +3,7 @@
 
 struct resolve_undo_info {
unsigned int mode[3];
-   unsigned char sha1[3][20];
+   struct object_id oid[3];
 };
 
 extern void record_resolve_undo(struct index_state *, struct cache_entry *);


[PATCH 16/36] archive: convert sha1_file_to_archive to struct object_id

2018-02-19 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it object_file_to_archive.

Signed-off-by: brian m. carlson 
---
 archive-tar.c |  2 +-
 archive-zip.c |  4 ++--
 archive.c | 10 +-
 archive.h |  8 
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 24b1ccef3a..fd622eacc0 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -281,7 +281,7 @@ static int write_tar_entry(struct archiver_args *args,
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
enum object_type type;
-   buffer = sha1_file_to_archive(args, path, oid->hash, old_mode, 
, );
+   buffer = object_file_to_archive(args, path, oid, old_mode, 
, );
if (!buffer)
return error("cannot read %s", oid_to_hex(oid));
} else {
diff --git a/archive-zip.c b/archive-zip.c
index e2e5513c03..5841a6ceb6 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -344,8 +344,8 @@ static int write_zip_entry(struct archiver_args *args,
flags |= ZIP_STREAM;
out = buffer = NULL;
} else {
-   buffer = sha1_file_to_archive(args, path, oid->hash, 
mode,
- , );
+   buffer = object_file_to_archive(args, path, oid, mode,
+   , );
if (!buffer)
return error("cannot read %s",
 oid_to_hex(oid));
diff --git a/archive.c b/archive.c
index 4942b5632b..da62b2f541 100644
--- a/archive.c
+++ b/archive.c
@@ -63,16 +63,16 @@ static void format_subst(const struct commit *commit,
free(to_free);
 }
 
-void *sha1_file_to_archive(const struct archiver_args *args,
-  const char *path, const unsigned char *sha1,
-  unsigned int mode, enum object_type *type,
-  unsigned long *sizep)
+void *object_file_to_archive(const struct archiver_args *args,
+const char *path, const struct object_id *oid,
+unsigned int mode, enum object_type *type,
+unsigned long *sizep)
 {
void *buffer;
const struct commit *commit = args->convert ? args->commit : NULL;
 
path += args->baselen;
-   buffer = read_sha1_file(sha1, type, sizep);
+   buffer = read_sha1_file(oid->hash, type, sizep);
if (buffer && S_ISREG(mode)) {
struct strbuf buf = STRBUF_INIT;
size_t size = 0;
diff --git a/archive.h b/archive.h
index 741991bfb6..1f9954f7cd 100644
--- a/archive.h
+++ b/archive.h
@@ -39,9 +39,9 @@ extern int write_archive_entries(struct archiver_args *args, 
write_archive_entry
 extern int write_archive(int argc, const char **argv, const char *prefix, 
const char *name_hint, int remote);
 
 const char *archive_format_from_filename(const char *filename);
-extern void *sha1_file_to_archive(const struct archiver_args *args,
- const char *path, const unsigned char *sha1,
- unsigned int mode, enum object_type *type,
- unsigned long *sizep);
+extern void *object_file_to_archive(const struct archiver_args *args,
+   const char *path, const struct object_id 
*oid,
+   unsigned int mode, enum object_type *type,
+   unsigned long *sizep);
 
 #endif /* ARCHIVE_H */


[PATCH 09/36] wt-status: convert struct wt_status_state to object_id

2018-02-19 Thread brian m. carlson
Convert the various *_sha1 members to use struct object_id instead.

Signed-off-by: brian m. carlson 
---
 wt-status.c | 12 ++--
 wt-status.h |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 98e558a70b..698fa1d42a 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1349,7 +1349,7 @@ static void show_cherry_pick_in_progress(struct wt_status 
*s,
const char *color)
 {
status_printf_ln(s, color, _("You are currently cherry-picking commit 
%s."),
-   find_unique_abbrev(state->cherry_pick_head_sha1, 
DEFAULT_ABBREV));
+   find_unique_abbrev(state->cherry_pick_head_oid.hash, 
DEFAULT_ABBREV));
if (s->hints) {
if (has_unmerged(s))
status_printf_ln(s, color,
@@ -1368,7 +1368,7 @@ static void show_revert_in_progress(struct wt_status *s,
const char *color)
 {
status_printf_ln(s, color, _("You are currently reverting commit %s."),
-find_unique_abbrev(state->revert_head_sha1, 
DEFAULT_ABBREV));
+find_unique_abbrev(state->revert_head_oid.hash, 
DEFAULT_ABBREV));
if (s->hints) {
if (has_unmerged(s))
status_printf_ln(s, color,
@@ -1489,9 +1489,9 @@ static void wt_status_get_detached_from(struct 
wt_status_state *state)
} else
state->detached_from =
xstrdup(find_unique_abbrev(cb.noid.hash, 
DEFAULT_ABBREV));
-   hashcpy(state->detached_sha1, cb.noid.hash);
+   oidcpy(>detached_oid, );
state->detached_at = !get_oid("HEAD", ) &&
-!hashcmp(oid.hash, state->detached_sha1);
+!oidcmp(, >detached_oid);
 
free(ref);
strbuf_release();
@@ -1550,13 +1550,13 @@ void wt_status_get_state(struct wt_status_state *state,
} else if (!stat(git_path_cherry_pick_head(), ) &&
!get_oid("CHERRY_PICK_HEAD", )) {
state->cherry_pick_in_progress = 1;
-   hashcpy(state->cherry_pick_head_sha1, oid.hash);
+   oidcpy(>cherry_pick_head_oid, );
}
wt_status_check_bisect(NULL, state);
if (!stat(git_path_revert_head(), ) &&
!get_oid("REVERT_HEAD", )) {
state->revert_in_progress = 1;
-   hashcpy(state->revert_head_sha1, oid.hash);
+   oidcpy(>revert_head_oid, );
}
 
if (get_detached_from)
diff --git a/wt-status.h b/wt-status.h
index 3f84d5c29f..c7ef173284 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -116,9 +116,9 @@ struct wt_status_state {
char *branch;
char *onto;
char *detached_from;
-   unsigned char detached_sha1[20];
-   unsigned char revert_head_sha1[20];
-   unsigned char cherry_pick_head_sha1[20];
+   struct object_id detached_oid;
+   struct object_id revert_head_oid;
+   struct object_id cherry_pick_head_oid;
 };
 
 size_t wt_status_locate_end(const char *s, size_t len);


[PATCH 15/36] archive: convert write_archive_entry_fn_t to object_id

2018-02-19 Thread brian m. carlson
Convert the write_archive_entry_fn_t type to use a pointer to struct
object_id.  Convert various static functions in the tar and zip
archivers also.

Signed-off-by: brian m. carlson 
---
 archive-tar.c | 28 ++--
 archive-zip.c | 16 
 archive.c | 12 ++--
 archive.h |  2 +-
 4 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index c6ed96ee74..24b1ccef3a 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -111,7 +111,7 @@ static void write_trailer(void)
  * queues up writes, so that all our write(2) calls write exactly one
  * full block; pads writes to RECORDSIZE
  */
-static int stream_blocked(const unsigned char *sha1)
+static int stream_blocked(const struct object_id *oid)
 {
struct git_istream *st;
enum object_type type;
@@ -119,9 +119,9 @@ static int stream_blocked(const unsigned char *sha1)
char buf[BLOCKSIZE];
ssize_t readlen;
 
-   st = open_istream(sha1, , , NULL);
+   st = open_istream(oid->hash, , , NULL);
if (!st)
-   return error("cannot stream blob %s", sha1_to_hex(sha1));
+   return error("cannot stream blob %s", oid_to_hex(oid));
for (;;) {
readlen = read_istream(st, buf, sizeof(buf));
if (readlen <= 0)
@@ -218,7 +218,7 @@ static void prepare_header(struct archiver_args *args,
 }
 
 static void write_extended_header(struct archiver_args *args,
- const unsigned char *sha1,
+ const struct object_id *oid,
  const void *buffer, unsigned long size)
 {
struct ustar_header header;
@@ -226,14 +226,14 @@ static void write_extended_header(struct archiver_args 
*args,
memset(, 0, sizeof(header));
*header.typeflag = TYPEFLAG_EXT_HEADER;
mode = 0100666;
-   xsnprintf(header.name, sizeof(header.name), "%s.paxheader", 
sha1_to_hex(sha1));
+   xsnprintf(header.name, sizeof(header.name), "%s.paxheader", 
oid_to_hex(oid));
prepare_header(args, , mode, size);
write_blocked(, sizeof(header));
write_blocked(buffer, size);
 }
 
 static int write_tar_entry(struct archiver_args *args,
-  const unsigned char *sha1,
+  const struct object_id *oid,
   const char *path, size_t pathlen,
   unsigned int mode)
 {
@@ -257,7 +257,7 @@ static int write_tar_entry(struct archiver_args *args,
mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
} else {
return error("unsupported file mode: 0%o (SHA1: %s)",
-mode, sha1_to_hex(sha1));
+mode, oid_to_hex(oid));
}
if (pathlen > sizeof(header.name)) {
size_t plen = get_path_prefix(path, pathlen,
@@ -268,7 +268,7 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path + plen + 1, rest);
} else {
xsnprintf(header.name, sizeof(header.name), "%s.data",
- sha1_to_hex(sha1));
+ oid_to_hex(oid));
strbuf_append_ext_header(_header, "path",
 path, pathlen);
}
@@ -276,14 +276,14 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path, pathlen);
 
if (S_ISREG(mode) && !args->convert &&
-   sha1_object_info(sha1, ) == OBJ_BLOB &&
+   sha1_object_info(oid->hash, ) == OBJ_BLOB &&
size > big_file_threshold)
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
enum object_type type;
-   buffer = sha1_file_to_archive(args, path, sha1, old_mode, 
, );
+   buffer = sha1_file_to_archive(args, path, oid->hash, old_mode, 
, );
if (!buffer)
-   return error("cannot read %s", sha1_to_hex(sha1));
+   return error("cannot read %s", oid_to_hex(oid));
} else {
buffer = NULL;
size = 0;
@@ -292,7 +292,7 @@ static int write_tar_entry(struct archiver_args *args,
if (S_ISLNK(mode)) {
if (size > sizeof(header.linkname)) {
xsnprintf(header.linkname, sizeof(header.linkname),
- "see %s.paxheader", sha1_to_hex(sha1));
+ "see %s.paxheader", oid_to_hex(oid));
strbuf_append_ext_header(_header, "linkpath",
 buffer, size);
} else
@@ -308,7 +308,7 @@ static int write_tar_entry(struct archiver_args *args,
prepare_header(args, , 

[PATCH 06/36] tree: convert read_tree_recursive to struct object_id

2018-02-19 Thread brian m. carlson
Convert the callback functions for read_tree_recursive to take a pointer
to struct object_id.

Signed-off-by: brian m. carlson 
---
 archive.c  |  8 
 builtin/checkout.c |  4 ++--
 builtin/log.c  |  2 +-
 builtin/ls-tree.c  |  8 
 merge-recursive.c  |  2 +-
 tree.c | 14 +++---
 tree.h |  2 +-
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/archive.c b/archive.c
index 0b7b62af0c..e664cdb624 100644
--- a/archive.c
+++ b/archive.c
@@ -198,7 +198,7 @@ static int write_directory(struct archiver_context *c)
return ret ? -1 : 0;
 }
 
-static int queue_or_write_archive_entry(const unsigned char *sha1,
+static int queue_or_write_archive_entry(const struct object_id *oid,
struct strbuf *base, const char *filename,
unsigned mode, int stage, void *context)
 {
@@ -224,14 +224,14 @@ static int queue_or_write_archive_entry(const unsigned 
char *sha1,
 
if (check_attr_export_ignore(check))
return 0;
-   queue_directory(sha1, base, filename,
+   queue_directory(oid->hash, base, filename,
mode, stage, c);
return READ_TREE_RECURSIVE;
}
 
if (write_directory(c))
return -1;
-   return write_archive_entry(sha1, base->buf, base->len, filename, mode,
+   return write_archive_entry(oid->hash, base->buf, base->len, filename, 
mode,
   stage, context);
 }
 
@@ -303,7 +303,7 @@ static const struct archiver *lookup_archiver(const char 
*name)
return NULL;
 }
 
-static int reject_entry(const unsigned char *sha1, struct strbuf *base,
+static int reject_entry(const struct object_id *oid, struct strbuf *base,
const char *filename, unsigned mode,
int stage, void *context)
 {
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 191b96c49c..87182aa018 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -66,7 +66,7 @@ static int post_checkout_hook(struct commit *old, struct 
commit *new,
 
 }
 
-static int update_some(const unsigned char *sha1, struct strbuf *base,
+static int update_some(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
int len;
@@ -78,7 +78,7 @@ static int update_some(const unsigned char *sha1, struct 
strbuf *base,
 
len = base->len + strlen(pathname);
ce = xcalloc(1, cache_entry_size(len));
-   hashcpy(ce->oid.hash, sha1);
+   oidcpy(>oid, oid);
memcpy(ce->name, base->buf, base->len);
memcpy(ce->name + base->len, pathname, len - base->len);
ce->ce_flags = create_ce_flags(0) | CE_UPDATE;
diff --git a/builtin/log.c b/builtin/log.c
index 94ee177d56..32a744bfd2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -541,7 +541,7 @@ static int show_tag_object(const struct object_id *oid, 
struct rev_info *rev)
return 0;
 }
 
-static int show_tree_object(const unsigned char *sha1,
+static int show_tree_object(const struct object_id *oid,
struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ef965408e8..c613dd7b82 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -60,7 +60,7 @@ static int show_recursive(const char *base, int baselen, 
const char *pathname)
return 0;
 }
 
-static int show_tree(const unsigned char *sha1, struct strbuf *base,
+static int show_tree(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
int retval = 0;
@@ -94,7 +94,7 @@ static int show_tree(const unsigned char *sha1, struct strbuf 
*base,
char size_text[24];
if (!strcmp(type, blob_type)) {
unsigned long size;
-   if (sha1_object_info(sha1, ) == OBJ_BAD)
+   if (sha1_object_info(oid->hash, ) == 
OBJ_BAD)
xsnprintf(size_text, sizeof(size_text),
  "BAD");
else
@@ -103,11 +103,11 @@ static int show_tree(const unsigned char *sha1, struct 
strbuf *base,
} else
xsnprintf(size_text, sizeof(size_text), "-");
printf("%06o %s %s %7s\t", mode, type,
-  find_unique_abbrev(sha1, abbrev),
+  find_unique_abbrev(oid->hash, abbrev),
   size_text);
} else
printf("%06o %s %s\t", mode, type,
-  find_unique_abbrev(sha1, abbrev));
+   

[PATCH 19/36] sha1_file: convert check_sha1_signature to struct object_id

2018-02-19 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it check_object_signature.  Introduce temporaries to convert the return
values of lookup_replace_object and lookup_replace_object_extended into
struct object_id.

The temporaries are needed because in order to convert
lookup_replace_object, open_istream needs to be converted, and
open_istream needs check_sha1_signature to be converted, causing a loop
of dependencies.  The temporaries will be removed in a future patch.

Signed-off-by: brian m. carlson 
---
 builtin/fast-export.c |  2 +-
 builtin/index-pack.c  |  2 +-
 builtin/mktag.c   |  5 -
 cache.h   |  2 +-
 object.c  | 10 --
 pack-check.c  |  4 ++--
 sha1_file.c   | 12 ++--
 7 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 796d0cd66c..293a6615fa 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -240,7 +240,7 @@ static void export_blob(const struct object_id *oid)
buf = read_sha1_file(oid->hash, , );
if (!buf)
die ("Could not read blob %s", oid_to_hex(oid));
-   if (check_sha1_signature(oid->hash, buf, size, typename(type)) 
< 0)
+   if (check_object_signature(oid, buf, size, typename(type)) < 0)
die("sha1 mismatch in blob %s", oid_to_hex(oid));
object = parse_object_buffer(oid, type, size, buf, );
}
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 5c7ab47c36..e0a776f1ac 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1377,7 +1377,7 @@ static void fix_unresolved_deltas(struct hashfile *f)
if (!base_obj->data)
continue;
 
-   if (check_sha1_signature(d->oid.hash, base_obj->data,
+   if (check_object_signature(>oid, base_obj->data,
base_obj->size, typename(type)))
die(_("local object %s is corrupt"), 
oid_to_hex(>oid));
base_obj->obj = append_obj_to_pack(f, d->oid.hash,
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 65bb41e3cd..810b24bef3 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -27,8 +27,11 @@ static int verify_object(const struct object_id *oid, const 
char *expected_type)
const unsigned char *repl = lookup_replace_object(oid->hash);
 
if (buffer) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
if (type == type_from_string(expected_type))
-   ret = check_sha1_signature(repl, buffer, size, 
expected_type);
+   ret = check_object_signature(, buffer, size, 
expected_type);
free(buffer);
}
return ret;
diff --git a/cache.h b/cache.h
index 8a9055f4e7..f29ff43bbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1236,7 +1236,7 @@ extern void *map_sha1_file(const unsigned char *sha1, 
unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
-extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned 
long size, const char *type);
+extern int check_object_signature(const struct object_id *oid, void *buf, 
unsigned long size, const char *type);
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
diff --git a/object.c b/object.c
index 9e6f9ff20b..c63f02a40f 100644
--- a/object.c
+++ b/object.c
@@ -255,7 +255,10 @@ struct object *parse_object(const struct object_id *oid)
if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
(!obj && has_object_file(oid) &&
 sha1_object_info(oid->hash, NULL) == OBJ_BLOB)) {
-   if (check_sha1_signature(repl, NULL, 0, NULL) < 0) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
+   if (check_object_signature(, NULL, 0, NULL) < 0) {
error("sha1 mismatch %s", oid_to_hex(oid));
return NULL;
}
@@ -265,7 +268,10 @@ struct object *parse_object(const struct object_id *oid)
 
buffer = read_sha1_file(oid->hash, , );
if (buffer) {
-   if (check_sha1_signature(repl, buffer, size, typename(type)) < 
0) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
+   if (check_object_signature(, buffer, size, 
typename(type)) < 0) {
free(buffer);
error("sha1 mismatch %s", sha1_to_hex(repl));
return NULL;
diff --git a/pack-check.c b/pack-check.c
index 403a572567..80ef7c1c63 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -126,7 +126,7 @@ 

[PATCH 11/36] http-walker: convert struct object_request to use struct object_id

2018-02-19 Thread brian m. carlson
Convert struct object_request to use struct object_id by updating the
definition and applying the following semantic patch, plus the standard
object_id transforms:

@@
struct object_request E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct object_request *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson 
---
 http-walker.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 07c2b1af82..f506f394ac 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -22,7 +22,7 @@ enum object_request_state {
 
 struct object_request {
struct walker *walker;
-   unsigned char sha1[20];
+   struct object_id oid;
struct alt_base *repo;
enum object_request_state state;
struct http_object_request *req;
@@ -56,7 +56,7 @@ static void start_object_request(struct walker *walker,
struct active_request_slot *slot;
struct http_object_request *req;
 
-   req = new_http_object_request(obj_req->repo->base, obj_req->sha1);
+   req = new_http_object_request(obj_req->repo->base, obj_req->oid.hash);
if (req == NULL) {
obj_req->state = ABORTED;
return;
@@ -82,7 +82,7 @@ static void finish_object_request(struct object_request 
*obj_req)
return;
 
if (obj_req->req->rename == 0)
-   walker_say(obj_req->walker, "got %s\n", 
sha1_to_hex(obj_req->sha1));
+   walker_say(obj_req->walker, "got %s\n", 
oid_to_hex(_req->oid));
 }
 
 static void process_object_response(void *callback_data)
@@ -129,7 +129,7 @@ static int fill_active_slot(struct walker *walker)
list_for_each_safe(pos, tmp, head) {
obj_req = list_entry(pos, struct object_request, node);
if (obj_req->state == WAITING) {
-   if (has_sha1_file(obj_req->sha1))
+   if (has_sha1_file(obj_req->oid.hash))
obj_req->state = COMPLETE;
else {
start_object_request(walker, obj_req);
@@ -148,7 +148,7 @@ static void prefetch(struct walker *walker, unsigned char 
*sha1)
 
newreq = xmalloc(sizeof(*newreq));
newreq->walker = walker;
-   hashcpy(newreq->sha1, sha1);
+   hashcpy(newreq->oid.hash, sha1);
newreq->repo = data->alt;
newreq->state = WAITING;
newreq->req = NULL;
@@ -481,13 +481,13 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
 
list_for_each(pos, head) {
obj_req = list_entry(pos, struct object_request, node);
-   if (!hashcmp(obj_req->sha1, sha1))
+   if (!hashcmp(obj_req->oid.hash, sha1))
break;
}
if (obj_req == NULL)
return error("Couldn't find request for %s in the queue", hex);
 
-   if (has_sha1_file(obj_req->sha1)) {
+   if (has_sha1_file(obj_req->oid.hash)) {
if (obj_req->req != NULL)
abort_http_object_request(obj_req->req);
abort_object_request(obj_req);
@@ -541,7 +541,7 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
} else if (req->zret != Z_STREAM_END) {
walker->corrupt_object_found++;
ret = error("File %s (%s) corrupt", hex, req->url);
-   } else if (hashcmp(obj_req->sha1, req->real_sha1)) {
+   } else if (hashcmp(obj_req->oid.hash, req->real_sha1)) {
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;


[PATCH 17/36] builtin/index-pack: convert struct ref_delta_entry to object_id

2018-02-19 Thread brian m. carlson
Convert this struct to use a member of type object_id.  Convert various
static functions as well.

Signed-off-by: brian m. carlson 
---
 builtin/index-pack.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..5c7ab47c36 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -58,7 +58,7 @@ struct ofs_delta_entry {
 };
 
 struct ref_delta_entry {
-   unsigned char sha1[20];
+   struct object_id oid;
int obj_no;
 };
 
@@ -671,18 +671,18 @@ static void find_ofs_delta_children(off_t offset,
*last_index = last;
 }
 
-static int compare_ref_delta_bases(const unsigned char *sha1,
-  const unsigned char *sha2,
+static int compare_ref_delta_bases(const struct object_id *oid1,
+  const struct object_id *oid2,
   enum object_type type1,
   enum object_type type2)
 {
int cmp = type1 - type2;
if (cmp)
return cmp;
-   return hashcmp(sha1, sha2);
+   return oidcmp(oid1, oid2);
 }
 
-static int find_ref_delta(const unsigned char *sha1, enum object_type type)
+static int find_ref_delta(const struct object_id *oid, enum object_type type)
 {
int first = 0, last = nr_ref_deltas;
 
@@ -691,7 +691,7 @@ static int find_ref_delta(const unsigned char *sha1, enum 
object_type type)
struct ref_delta_entry *delta = _deltas[next];
int cmp;
 
-   cmp = compare_ref_delta_bases(sha1, delta->sha1,
+   cmp = compare_ref_delta_bases(oid, >oid,
  type, 
objects[delta->obj_no].type);
if (!cmp)
return next;
@@ -704,11 +704,11 @@ static int find_ref_delta(const unsigned char *sha1, enum 
object_type type)
return -first-1;
 }
 
-static void find_ref_delta_children(const unsigned char *sha1,
+static void find_ref_delta_children(const struct object_id *oid,
int *first_index, int *last_index,
enum object_type type)
 {
-   int first = find_ref_delta(sha1, type);
+   int first = find_ref_delta(oid, type);
int last = first;
int end = nr_ref_deltas - 1;
 
@@ -717,9 +717,9 @@ static void find_ref_delta_children(const unsigned char 
*sha1,
*last_index = -1;
return;
}
-   while (first > 0 && !hashcmp(ref_deltas[first - 1].sha1, sha1))
+   while (first > 0 && !oidcmp(_deltas[first - 1].oid, oid))
--first;
-   while (last < end && !hashcmp(ref_deltas[last + 1].sha1, sha1))
+   while (last < end && !oidcmp(_deltas[last + 1].oid, oid))
++last;
*first_index = first;
*last_index = last;
@@ -991,7 +991,7 @@ static struct base_data *find_unresolved_deltas_1(struct 
base_data *base,
  struct base_data *prev_base)
 {
if (base->ref_last == -1 && base->ofs_last == -1) {
-   find_ref_delta_children(base->obj->idx.oid.hash,
+   find_ref_delta_children(>obj->idx.oid,
>ref_first, >ref_last,
OBJ_REF_DELTA);
 
@@ -1075,7 +1075,7 @@ static int compare_ref_delta_entry(const void *a, const 
void *b)
const struct ref_delta_entry *delta_a = a;
const struct ref_delta_entry *delta_b = b;
 
-   return hashcmp(delta_a->sha1, delta_b->sha1);
+   return oidcmp(_a->oid, _b->oid);
 }
 
 static void resolve_base(struct object_entry *obj)
@@ -1141,7 +1141,7 @@ static void parse_pack_objects(unsigned char *hash)
ofs_delta++;
} else if (obj->type == OBJ_REF_DELTA) {
ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, 
ref_deltas_alloc);
-   hashcpy(ref_deltas[nr_ref_deltas].sha1, 
ref_delta_oid.hash);
+   oidcpy(_deltas[nr_ref_deltas].oid, _delta_oid);
ref_deltas[nr_ref_deltas].obj_no = i;
nr_ref_deltas++;
} else if (!data) {
@@ -1373,14 +1373,14 @@ static void fix_unresolved_deltas(struct hashfile *f)
 
if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
continue;
-   base_obj->data = read_sha1_file(d->sha1, , 
_obj->size);
+   base_obj->data = read_sha1_file(d->oid.hash, , 
_obj->size);
if (!base_obj->data)
continue;
 
-   if (check_sha1_signature(d->sha1, base_obj->data,
+   if (check_sha1_signature(d->oid.hash, base_obj->data,
base_obj->size, typename(type)))
-   die(_("local object %s