Re: --color-moved feedback, was Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-26 Thread Stefan Beller
On Fri, Jun 23, 2017 at 6:10 PM, Jeff King  wrote:
> On Fri, Jun 23, 2017 at 01:23:52PM -0700, Stefan Beller wrote:
>
>> > In the end, I just did --color-moved=plain,  ...
>> > "yep, this is all a giant moved chunk, so I don't have to look carefully
>> > at it".
>>
>> This is dangerous, as "plain" does not care about permutations.
>> See the 7f5af90798 (diff.c: color moved lines differently, 2017-06-19)
>> for details. You would want at least "zebra", which would show you
>> permutations.
>
> Ah, I see. I think what I'd really want is some way of correlating two
> particular hunks. That's hard to do with color, though. I guess that's
> the "you would need a ton of colors" discussion I saw going on before?

Yes. And the real question is "how many". ;)
and if you want to optimize for the fewest number of colors, or rather
"best user experience as defined by me", or something else entirely
different.

>
> It would depend on how many hunks there are, and how close together they
> are. For instance, your 6cd5757c8 seems like a good candidate, but I
> have to admit with --color-moved=zebra I have no idea what's going on.
> Some things are definitely colored, but I'm not sure what corresponds to
> what.

That is a good one indeed. My take on that:
If you use "zebra" alone, you do not care if it is cyan or yellow.
*Both* indicate a moved new line, and adjacent lines of the same color
indicate that they were adjacent at the source as well.

for reference http://i.imgur.com/hQXNlga.png

Look at the two lines of the first cyan->yellow transient
(closing the function prepare_submodule_repo_env_no_git_dir)

What it tells you is this:

  There is no matching paragraph in the deleted lines, that also
  end with two braces.

The yellow->cyan transit (prepare_submodule_repo_env) tells us:

  We did not have a empty line and then that function signature
  in the deleted lines, so we flipflop to the other color to tell the user.


>
>> > That feels more dangerous to me, just because the heuristics seem to
>> > find a lot of "moves" of repeated code. Imagine a simple patch like
>> > "switch return 0 to return -1". If the code you're touching went away,
>> > there's a very good chance that another "return 0" popped up somewhere
>> > else in the code base. A conflict is what you want there; silently
>> > changing some other site would be not only wrong, but quite subtle and
>> > hard to find.
>>
>> I agree, that is the danger, but the scale of danger is the size of the moved
>> chunk. A file is usually a very large chunk, such that it is obviously a good
>> idea to fix that. Maybe we'd introduce a threshold, that the fix must not be 
>> in
>> range of the block boundaries of say 3 lines.
>> (Then the moved block must be at least 7 lines of code such that a one liner
>> fix in the middle would kick in)
>
> Yes, I'd agree it's really a continuum from "one line" to "whole file".
> I think right now the --color-moved stuff is too close to the former to
> be safe, but pushing it farther towards the middle would remedy that.

Yup, I agree on that. I just wanted to state the principle that this would be
"move detection done right", because "file boundaries" are rather arbitrary
w.r.t to the very valuable content that we deal with. ;)

Thanks,
Stefan

>
> -Peff


Re: --color-moved feedback, was Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Jeff King
On Fri, Jun 23, 2017 at 01:23:52PM -0700, Stefan Beller wrote:

> > In the end, I just did --color-moved=plain,  ...
> > "yep, this is all a giant moved chunk, so I don't have to look carefully
> > at it".
> 
> This is dangerous, as "plain" does not care about permutations.
> See the 7f5af90798 (diff.c: color moved lines differently, 2017-06-19)
> for details. You would want at least "zebra", which would show you
> permutations.

Ah, I see. I think what I'd really want is some way of correlating two
particular hunks. That's hard to do with color, though. I guess that's
the "you would need a ton of colors" discussion I saw going on before?

It would depend on how many hunks there are, and how close together they
are. For instance, your 6cd5757c8 seems like a good candidate, but I
have to admit with --color-moved=zebra I have no idea what's going on.
Some things are definitely colored, but I'm not sure what corresponds to
what.

> > That feels more dangerous to me, just because the heuristics seem to
> > find a lot of "moves" of repeated code. Imagine a simple patch like
> > "switch return 0 to return -1". If the code you're touching went away,
> > there's a very good chance that another "return 0" popped up somewhere
> > else in the code base. A conflict is what you want there; silently
> > changing some other site would be not only wrong, but quite subtle and
> > hard to find.
> 
> I agree, that is the danger, but the scale of danger is the size of the moved
> chunk. A file is usually a very large chunk, such that it is obviously a good
> idea to fix that. Maybe we'd introduce a threshold, that the fix must not be 
> in
> range of the block boundaries of say 3 lines.
> (Then the moved block must be at least 7 lines of code such that a one liner
> fix in the middle would kick in)

Yes, I'd agree it's really a continuum from "one line" to "whole file".
I think right now the --color-moved stuff is too close to the former to
be safe, but pushing it farther towards the middle would remedy that.

-Peff


Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Jun 23, 2017 at 09:01:37AM +0200, Michael Haggerty wrote:
>
>> Now that the interface between `files_ref_store` and
>> `packed_ref_store` is relatively narrow, move the latter into a new
>> module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
>> doesn't quite implement the `ref_store` interface, but it will soon.
>> 
>> This commit moves code around and adjusts its visibility, but doesn't
>> change anything.
>
> Looks good. Stefan will be happy to know that I used --color-moved to
> look at it. ;)

This patch shows OK, but when applied to other changes in the
series, e.g. "packed_ref_store: make class into a subclass of
`ref_store`", it was somewhat irritating that all new blank lines
are shown as a copy-move of a single deleted blank line (also a
single "return refs" in an entirely new function being a copy/move
looked confusing).






Re: --color-moved feedback, was Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Stefan Beller
On Fri, Jun 23, 2017 at 1:10 PM, Jeff King  wrote:
> [I culled the cc list, as it was big and they don't all necessarily care
> about this feature]

Yeah I was on the verge to do that, but did not pull through.


>
> In the end, I just did --color-moved=plain,  ...
> "yep, this is all a giant moved chunk, so I don't have to look carefully
> at it".

This is dangerous, as "plain" does not care about permutations.
See the 7f5af90798 (diff.c: color moved lines differently, 2017-06-19)
for details. You would want at least "zebra", which would show you
permutations.

> I'm not sure what the default mode is. When I first used it it seemed to
> italicize a bunch of text. Which I didn't even notice at first, so I
> thought it was broken.

ok, italics does not seem to be popular.
I liked it as I could take the same color for these uninteresting lines.

> I didn't try the "dim" feature; I don't think urxvt supports that
> attribute (though maybe it was what was doing the italics?).

That is on by default (with italics), maybe the default should rather
be "zebra" as that has fewer colors but still contains all the information
needed to be sure about
"yep, this is all a giant moved chunk, so I don't have to look
carefully at it".

>
> One minor quibble is that I used with "git log", so I was looking at all
> the commits with that coloring. But the ones that didn't have movement
> had a lot of noisy "moved" sections. E.g., one line moving here or
> there, or even blank lines. It might be worth having some heuristics to
> identify a chunk (I think I saw some discussion on that earlier, and
> maybe it's even there and I didn't see it).

It is not there, yet. It was Michaels suggestions.
I'll give that a try, too.

>> Let's take this patch as an example, if someone were to find a bug
>> in one of the moved functions, they would send a fix based on the
>> function in refs/files-backend.c, such that it can easily be merged down
>> to maint, but when merging it forward with this, it may clash.
>
> That feels more dangerous to me, just because the heuristics seem to
> find a lot of "moves" of repeated code. Imagine a simple patch like
> "switch return 0 to return -1". If the code you're touching went away,
> there's a very good chance that another "return 0" popped up somewhere
> else in the code base. A conflict is what you want there; silently
> changing some other site would be not only wrong, but quite subtle and
> hard to find.

I agree, that is the danger, but the scale of danger is the size of the moved
chunk. A file is usually a very large chunk, such that it is obviously a good
idea to fix that. Maybe we'd introduce a threshold, that the fix must not be in
range of the block boundaries of say 3 lines.
(Then the moved block must be at least 7 lines of code such that a one liner
fix in the middle would kick in)

Thanks,
Stefan


--color-moved feedback, was Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Jeff King
[I culled the cc list, as it was big and they don't all necessarily care
about this feature]

On Fri, Jun 23, 2017 at 12:46:47PM -0700, Stefan Beller wrote:

> > Looks good. Stefan will be happy to know that I used --color-moved to
> > look at it. ;)
> 
> Hah!
> 
> As a follow up on that, let's perform a user survey :-P
> * How was your review experience?
> * Were you annoyed by the default colors or mode?
>   (That is best expressed as a patch, as it seems like
>bikeshedding to me, but I am far from being a UX expert ;)

I'm not sure what the default mode is. When I first used it it seemed to
italicize a bunch of text. Which I didn't even notice at first, so I
thought it was broken.

In the end, I just did --color-moved=plain, which seemed to color with
magenta/cyan. I found that a bit loud and switched to "red 80" and
"green 80" (i.e., a dark grey background, as I usually have a black
one). I probably would have picked a darker grey, but I use urxvt and it
doesn't seem to do 24-bit color codes.

I didn't try the "dim" feature; I don't think urxvt supports that
attribute (though maybe it was what was doing the italics?).

Anyway, it worked fine after that. My main use for it was just seeing
"yep, this is all a giant moved chunk, so I don't have to look carefully
at it".

One minor quibble is that I used with "git log", so I was looking at all
the commits with that coloring. But the ones that didn't have movement
had a lot of noisy "moved" sections. E.g., one line moving here or
there, or even blank lines. It might be worth having some heuristics to
identify a chunk (I think I saw some discussion on that earlier, and
maybe it's even there and I didn't see it).

> Just today I thought about that further:
> While reviewing is one thing (which I do a lot), how can we make this
> work with merging changes?
> 
> I think the file rename detection is acknowledged by the merge code,
> such that a plain file rename and a patch to said file is easy on the
> maintainer, but we would want that for smaller code movements, too.
> 
> Let's take this patch as an example, if someone were to find a bug
> in one of the moved functions, they would send a fix based on the
> function in refs/files-backend.c, such that it can easily be merged down
> to maint, but when merging it forward with this, it may clash.

That feels more dangerous to me, just because the heuristics seem to
find a lot of "moves" of repeated code. Imagine a simple patch like
"switch return 0 to return -1". If the code you're touching went away,
there's a very good chance that another "return 0" popped up somewhere
else in the code base. A conflict is what you want there; silently
changing some other site would be not only wrong, but quite subtle and
hard to find.

-Peff


Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Stefan Beller
On Fri, Jun 23, 2017 at 12:35 PM, Jeff King  wrote:
> On Fri, Jun 23, 2017 at 09:01:37AM +0200, Michael Haggerty wrote:
>
>> Now that the interface between `files_ref_store` and
>> `packed_ref_store` is relatively narrow, move the latter into a new
>> module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
>> doesn't quite implement the `ref_store` interface, but it will soon.
>>
>> This commit moves code around and adjusts its visibility, but doesn't
>> change anything.
>
> Looks good. Stefan will be happy to know that I used --color-moved to
> look at it. ;)

Hah!

As a follow up on that, let's perform a user survey :-P
* How was your review experience?
* Were you annoyed by the default colors or mode?
  (That is best expressed as a patch, as it seems like
   bikeshedding to me, but I am far from being a UX expert ;)

Just today I thought about that further:
While reviewing is one thing (which I do a lot), how can we make this
work with merging changes?

I think the file rename detection is acknowledged by the merge code,
such that a plain file rename and a patch to said file is easy on the
maintainer, but we would want that for smaller code movements, too.

Let's take this patch as an example, if someone were to find a bug
in one of the moved functions, they would send a fix based on the
function in refs/files-backend.c, such that it can easily be merged down
to maint, but when merging it forward with this, it may clash.


Re: [PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Jeff King
On Fri, Jun 23, 2017 at 09:01:37AM +0200, Michael Haggerty wrote:

> Now that the interface between `files_ref_store` and
> `packed_ref_store` is relatively narrow, move the latter into a new
> module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
> doesn't quite implement the `ref_store` interface, but it will soon.
> 
> This commit moves code around and adjusts its visibility, but doesn't
> change anything.

Looks good. Stefan will be happy to know that I used --color-moved to
look at it. ;)

-Peff


[PATCH v2 19/29] packed-backend: new module for handling packed references

2017-06-23 Thread Michael Haggerty
Now that the interface between `files_ref_store` and
`packed_ref_store` is relatively narrow, move the latter into a new
module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
doesn't quite implement the `ref_store` interface, but it will soon.

This commit moves code around and adjusts its visibility, but doesn't
change anything.

Signed-off-by: Michael Haggerty 
---
 Makefile  |   1 +
 refs.c|  18 ++
 refs/files-backend.c  | 640 +-
 refs/packed-backend.c | 624 
 refs/packed-backend.h |  33 +++
 refs/refs-internal.h  |   9 +
 6 files changed, 686 insertions(+), 639 deletions(-)
 create mode 100644 refs/packed-backend.c
 create mode 100644 refs/packed-backend.h

diff --git a/Makefile b/Makefile
index e35542e631..aff9c6d7e2 100644
--- a/Makefile
+++ b/Makefile
@@ -817,6 +817,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/files-backend.o
 LIB_OBJS += refs/iterator.o
+LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
 LIB_OBJS += ref-filter.o
 LIB_OBJS += remote.o
diff --git a/refs.c b/refs.c
index 32177969f0..5880c12372 100644
--- a/refs.c
+++ b/refs.c
@@ -173,6 +173,24 @@ int refname_is_safe(const char *refname)
return 1;
 }
 
+/*
+ * Return true if refname, which has the specified oid and flags, can
+ * be resolved to an object in the database. If the referred-to object
+ * does not exist, emit a warning and return false.
+ */
+int ref_resolves_to_object(const char *refname,
+  const struct object_id *oid,
+  unsigned int flags)
+{
+   if (flags & REF_ISBROKEN)
+   return 0;
+   if (!has_sha1_file(oid->hash)) {
+   error("%s does not point to a valid object!", refname);
+   return 0;
+   }
+   return 1;
+}
+
 char *refs_resolve_refdup(struct ref_store *refs,
  const char *refname, int resolve_flags,
  unsigned char *sha1, int *flags)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 346794cf7c..7df9747798 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2,6 +2,7 @@
 #include "../refs.h"
 #include "refs-internal.h"
 #include "ref-cache.h"
+#include "packed-backend.h"
 #include "../iterator.h"
 #include "../dir-iterator.h"
 #include "../lockfile.h"
@@ -14,85 +15,6 @@ struct ref_lock {
struct object_id old_oid;
 };
 
-/*
- * Return true if refname, which has the specified oid and flags, can
- * be resolved to an object in the database. If the referred-to object
- * does not exist, emit a warning and return false.
- */
-static int ref_resolves_to_object(const char *refname,
- const struct object_id *oid,
- unsigned int flags)
-{
-   if (flags & REF_ISBROKEN)
-   return 0;
-   if (!has_sha1_file(oid->hash)) {
-   error("%s does not point to a valid object!", refname);
-   return 0;
-   }
-   return 1;
-}
-
-struct packed_ref_cache {
-   struct ref_cache *cache;
-
-   /*
-* Count of references to the data structure in this instance,
-* including the pointer from files_ref_store::packed if any.
-* The data will not be freed as long as the reference count
-* is nonzero.
-*/
-   unsigned int referrers;
-
-   /* The metadata from when this packed-refs cache was read */
-   struct stat_validity validity;
-};
-
-/*
- * A container for `packed-refs`-related data. It is not (yet) a
- * `ref_store`.
- */
-struct packed_ref_store {
-   unsigned int store_flags;
-
-   /* The path of the "packed-refs" file: */
-   char *path;
-
-   /*
-* A cache of the values read from the `packed-refs` file, if
-* it might still be current; otherwise, NULL.
-*/
-   struct packed_ref_cache *cache;
-
-   /*
-* Lock used for the "packed-refs" file. Note that this (and
-* thus the enclosing `packed_ref_store`) must not be freed.
-*/
-   struct lock_file lock;
-};
-
-static struct packed_ref_store *packed_ref_store_create(
-   const char *path, unsigned int store_flags)
-{
-   struct packed_ref_store *refs = xcalloc(1, sizeof(*refs));
-
-   refs->store_flags = store_flags;
-   refs->path = xstrdup(path);
-   return refs;
-}
-
-/*
- * Die if refs is not the main ref store. caller is used in any
- * necessary error messages.
- */
-static void packed_assert_main_repository(struct packed_ref_store *refs,
- const char *caller)
-{
-   if (refs->store_flags & REF_STORE_MAIN)
-   return;
-
-   die("BUG: operation %s only allowed for main ref store", caller);
-}
-
 /*
  * Future: need to be in "struct repository"
  * when doing a full libification.
@@ -109,42