Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Martin Ågren
On 11 May 2018 at 19:23, Derrick Stolee  wrote:

> Martin's initial test cases are wonderful. I've adapted them to test the
> other conditions in the verify_commit_graph() method and caught some
> interesting behavior in the process. I'm preparing v2 so we can investigate
> the direction of the tests.

Cool, I'm glad you found them useful. One thought I had was that you
could possibly write the tests such that you introduce errors from the
back of the file. That might enable you to do less of the "backup
commit-graph file and restore it"-dance. Just a thought.

Martin


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee

On 5/10/2018 3:22 PM, Stefan Beller wrote:

On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:

On 10 May 2018 at 19:34, Derrick Stolee  wrote:


Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
are ready for full, rigorous review.

I don't know about "full" and "rigorous", but I tried to offer my thoughts.

A lingering feeling I have is that users could possibly benefit from
seeing "the commit-graph has a bad foo" a bit more than just "the
commit-graph is bad". But adding "the bar is baz, should have been
frotz" might not bring that much. Maybe you could keep the translatable
string somewhat simple, then, if the more technical data could be useful
to Git developers, dump it in a non-translated format. (I guess it could
be hidden behind a debug switch, but let's take one step at a time..)
This is nothing I feel strongly about.


  t/t5318-commit-graph.sh  |  25 +

I wonder about tests. Some tests seem to use `dd` to corrupt files and
check that it gets caught. Doing this in a a hash-agnostic way could be
tricky, but maybe not impossible. I guess we could do something
probabilistic, like "set the first two bytes of the very last OID to
zero -- surely all OIDs can't start with 16 zero bits". Hmm, that might
still require knowing the size of the OIDs...

I hope to find time to do some more hands-on testing of this to see that
errors actually do get caught.

Given that the commit graph is secondary data, the users work around
to quickly get back to a well working repository is most likely to remove
the file and regenerate it.
As a developer who wants to fix the bug, a stacktrace/datadump and the
history of git commands might be most valuable, but I agree we should
hide that behind a debug flag.

Packfiles and loose objects are primary data, which means that those
need a more advanced way to diagnose and repair them, so I would imagine
the commit graph fsck is closer to bitmaps fsck, which I would have suspected
to be found in t5310, but a quick read doesn't reveal many tests that are
checking for integrity. So I guess the test coverage here is ok, (although we
should always ask for more)


My main goal is to help developers figure out what is wrong with a file, 
and then we can use other diagnostic tools to discover how it got into 
that state.


Martin's initial test cases are wonderful. I've adapted them to test the 
other conditions in the verify_commit_graph() method and caught some 
interesting behavior in the process. I'm preparing v2 so we can 
investigate the direction of the tests.


Thanks,
-Stolee


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee

On 5/10/2018 3:17 PM, Ævar Arnfjörð Bjarmason wrote:

On Thu, May 10 2018, Derrick Stolee wrote:


The behavior in this patch series does the following:

1. Near the end of 'git gc', run 'git commit-graph write'. The location
of this code assumes that a 'git gc --auto' has not terminated early
due to not meeting the auto threshold.

2. At the end of 'git fetch', run 'git commit-graph write'. This means
that every reachable commit will be in the commit-graph after a
a successful fetch, which seems a reasonable frequency. Then, the
only times we would be missing a reachable commit is after creating
one locally. There is a problem with the current patch, though: every
'git fetch' call runs 'git commit-graph write', even if there were no
ref updates or objects downloaded. Is there a simple way to detect if
the fetch was non-trivial?

One obvious problem with this approach: if we compute this during 'gc'
AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
two commit-graph writes. If I were to abandon one of these patches, it
would be the 'fetch' integration. A 'git gc' really wants to delete all
references to unreachable commits, and without updating the commit-graph
we may still have commit data in the commit-graph file that is not in
the object database. In fact, deleting commits from the object database
but not from the commit-graph will cause 'git commit-graph verify' to
fail!

I welcome discussion on these ideas, as we are venturing out of the
"pure data structure" world and into the "user experience" world. I am
less confident in my skills in this world, but the feature is worthless
if it does not improve the user experience.

I really like #1 here, but I wonder why #2 is necessary.

I.e. is it critical for the performance of the commit graph feature that
it be kept really up-to-date, moreso than other things that rely on gc
--auto (e.g. the optional bitmap index)?


It is not critical. The feature has been designed to have recent commits 
not in the file. For simplicity, it is probably best to limit ourselves 
to writing after a non-trivial 'gc'.



Even if that's the case, I think something that does this via gc --auto
is a much better option. I.e. now we have gc.auto & gc.autoPackLimit, if
the answer to my question above is "yes" this could also be accomplished
by introducing a new graph-specific gc.* setting, and --auto would just
update the graph more often, but leave the rest.


This is an excellent idea for a follow-up series, if we find we want the 
commit-graph written more frequently. For now, I'm satisfied with one 
place where it is automatically computed.


I'll drop the fetch integration in my v2 series.

Thanks,
-Stolee


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-11 Thread Derrick Stolee

On 5/10/2018 4:45 PM, Martin Ågren wrote:

On 10 May 2018 at 21:22, Stefan Beller  wrote:

On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:

I hope to find time to do some more hands-on testing of this to see that
errors actually do get caught.

Packfiles and loose objects are primary data, which means that those
need a more advanced way to diagnose and repair them, so I would imagine
the commit graph fsck is closer to bitmaps fsck, which I would have suspected
to be found in t5310, but a quick read doesn't reveal many tests that are
checking for integrity. So I guess the test coverage here is ok, (although we
should always ask for more)

Since I'm wrapping up for today, I'm posting some simple tests that I
assembled. The last of these showcases one or two problems with the
current error-reporting. Depending on the error, there can be *lots* of
errors reported and there are no new-lines, so the result on stdout can
be a wall of not-very-legible text.

Some of these might not make sense. I just started going through the
documentation on the format, causing some sort of corruption in each
field. Maybe this can be helpful somehow.

Martin

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 82f95eb11f..a7e48db2de 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -255,4 +255,49 @@ test_expect_success 'git fsck (checks commit-graph)' '
git fsck
  '
  
+# usage: corrupt_data   []

+corrupt_data() {
+   file=$1
+   pos=$2
+   data="${3:-\0}"
+   printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc
+}
+
+test_expect_success 'detect bad signature' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 0 "\0" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph signature" err
+'
+
+test_expect_success 'detect bad version number' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 4 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph version" err
+'
+
+test_expect_success 'detect bad hash version' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 5 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "hash version" err
+'
+
+test_expect_success 'detect too small chunk-count' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 6 "\01" &&
+   test_must_fail git commit-graph verify 2>err &&
+   cat err
+'
+
+
  test_done



Martin: thank you so much for these test examples, and for running them 
to find out about the newline issue. This is really helpful.


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 21:22, Stefan Beller  wrote:
> On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:

>> I hope to find time to do some more hands-on testing of this to see that
>> errors actually do get caught.

> Packfiles and loose objects are primary data, which means that those
> need a more advanced way to diagnose and repair them, so I would imagine
> the commit graph fsck is closer to bitmaps fsck, which I would have suspected
> to be found in t5310, but a quick read doesn't reveal many tests that are
> checking for integrity. So I guess the test coverage here is ok, (although we
> should always ask for more)

Since I'm wrapping up for today, I'm posting some simple tests that I
assembled. The last of these showcases one or two problems with the
current error-reporting. Depending on the error, there can be *lots* of
errors reported and there are no new-lines, so the result on stdout can
be a wall of not-very-legible text.

Some of these might not make sense. I just started going through the
documentation on the format, causing some sort of corruption in each
field. Maybe this can be helpful somehow.

Martin

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 82f95eb11f..a7e48db2de 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -255,4 +255,49 @@ test_expect_success 'git fsck (checks commit-graph)' '
git fsck
 '
 
+# usage: corrupt_data   []
+corrupt_data() {
+   file=$1
+   pos=$2
+   data="${3:-\0}"
+   printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc
+}
+
+test_expect_success 'detect bad signature' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 0 "\0" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph signature" err
+'
+
+test_expect_success 'detect bad version number' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 4 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "graph version" err
+'
+
+test_expect_success 'detect bad hash version' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 5 "\02" &&
+   test_must_fail git commit-graph verify 2>err &&
+   grep "hash version" err
+'
+
+test_expect_success 'detect too small chunk-count' '
+   cd "$TRASH_DIRECTORY/full" &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   corrupt_data $objdir/info/commit-graph 6 "\01" &&
+   test_must_fail git commit-graph verify 2>err &&
+   cat err
+'
+
+
 test_done
-- 
2.17.0



Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Stefan Beller
On Thu, May 10, 2018 at 12:05 PM, Martin Ågren  wrote:
> On 10 May 2018 at 19:34, Derrick Stolee  wrote:
>
>> Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
>> are ready for full, rigorous review.
>
> I don't know about "full" and "rigorous", but I tried to offer my thoughts.
>
> A lingering feeling I have is that users could possibly benefit from
> seeing "the commit-graph has a bad foo" a bit more than just "the
> commit-graph is bad". But adding "the bar is baz, should have been
> frotz" might not bring that much. Maybe you could keep the translatable
> string somewhat simple, then, if the more technical data could be useful
> to Git developers, dump it in a non-translated format. (I guess it could
> be hidden behind a debug switch, but let's take one step at a time..)
> This is nothing I feel strongly about.
>
>>  t/t5318-commit-graph.sh  |  25 +
>
> I wonder about tests. Some tests seem to use `dd` to corrupt files and
> check that it gets caught. Doing this in a a hash-agnostic way could be
> tricky, but maybe not impossible. I guess we could do something
> probabilistic, like "set the first two bytes of the very last OID to
> zero -- surely all OIDs can't start with 16 zero bits". Hmm, that might
> still require knowing the size of the OIDs...
>
> I hope to find time to do some more hands-on testing of this to see that
> errors actually do get caught.

Given that the commit graph is secondary data, the users work around
to quickly get back to a well working repository is most likely to remove
the file and regenerate it.
As a developer who wants to fix the bug, a stacktrace/datadump and the
history of git commands might be most valuable, but I agree we should
hide that behind a debug flag.

Packfiles and loose objects are primary data, which means that those
need a more advanced way to diagnose and repair them, so I would imagine
the commit graph fsck is closer to bitmaps fsck, which I would have suspected
to be found in t5310, but a quick read doesn't reveal many tests that are
checking for integrity. So I guess the test coverage here is ok, (although we
should always ask for more)

Thanks,
Stefan


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Ævar Arnfjörð Bjarmason

On Thu, May 10 2018, Derrick Stolee wrote:

> The behavior in this patch series does the following:
>
> 1. Near the end of 'git gc', run 'git commit-graph write'. The location
>of this code assumes that a 'git gc --auto' has not terminated early
>due to not meeting the auto threshold.
>
> 2. At the end of 'git fetch', run 'git commit-graph write'. This means
>that every reachable commit will be in the commit-graph after a
>a successful fetch, which seems a reasonable frequency. Then, the
>only times we would be missing a reachable commit is after creating
>one locally. There is a problem with the current patch, though: every
>'git fetch' call runs 'git commit-graph write', even if there were no
>ref updates or objects downloaded. Is there a simple way to detect if
>the fetch was non-trivial?
>
> One obvious problem with this approach: if we compute this during 'gc'
> AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
> two commit-graph writes. If I were to abandon one of these patches, it
> would be the 'fetch' integration. A 'git gc' really wants to delete all
> references to unreachable commits, and without updating the commit-graph
> we may still have commit data in the commit-graph file that is not in
> the object database. In fact, deleting commits from the object database
> but not from the commit-graph will cause 'git commit-graph verify' to
> fail!
>
> I welcome discussion on these ideas, as we are venturing out of the
> "pure data structure" world and into the "user experience" world. I am
> less confident in my skills in this world, but the feature is worthless
> if it does not improve the user experience.

I really like #1 here, but I wonder why #2 is necessary.

I.e. is it critical for the performance of the commit graph feature that
it be kept really up-to-date, moreso than other things that rely on gc
--auto (e.g. the optional bitmap index)?

Even if that's the case, I think something that does this via gc --auto
is a much better option. I.e. now we have gc.auto & gc.autoPackLimit, if
the answer to my question above is "yes" this could also be accomplished
by introducing a new graph-specific gc.* setting, and --auto would just
update the graph more often, but leave the rest.


Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

2018-05-10 Thread Martin Ågren
On 10 May 2018 at 19:34, Derrick Stolee  wrote:

> Commits 01-07 focus on the 'git commit-graph verify' subcommand. These
> are ready for full, rigorous review.

I don't know about "full" and "rigorous", but I tried to offer my thoughts.

A lingering feeling I have is that users could possibly benefit from
seeing "the commit-graph has a bad foo" a bit more than just "the
commit-graph is bad". But adding "the bar is baz, should have been
frotz" might not bring that much. Maybe you could keep the translatable
string somewhat simple, then, if the more technical data could be useful
to Git developers, dump it in a non-translated format. (I guess it could
be hidden behind a debug switch, but let's take one step at a time..)
This is nothing I feel strongly about.

>  t/t5318-commit-graph.sh  |  25 +

I wonder about tests. Some tests seem to use `dd` to corrupt files and
check that it gets caught. Doing this in a a hash-agnostic way could be
tricky, but maybe not impossible. I guess we could do something
probabilistic, like "set the first two bytes of the very last OID to
zero -- surely all OIDs can't start with 16 zero bits". Hmm, that might
still require knowing the size of the OIDs...

I hope to find time to do some more hands-on testing of this to see that
errors actually do get caught.

I saw you redirect stdout to a file "output", and anticipated later
commits to actually look into it. I never saw that though. (I did not
apply the patches, so I could have missed something.)

Martin