Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Wed, Jun 14, 2017 at 2:54 AM, Junio C Hamano wrote: > Stefan Beller writes: > >> The color experts agreed that (3) might be the best solution >> as this gives most flexibility: >> >> "I would be happy as I can configure the bounds highlighting >> to not exist, it would degenerate to a pure Zebra, which is >> very simple to understand. Junio seemed to like (2) a lot, so >> he would configure both dim colors to be 'context', but configure >> the highlight colors to be attention drawing. So everybody would >> be happy. It is also not too many colors, we are good at for loops." > > Another thing I found a bit confusing in the description of choices > in the documentation was that description for some began with "Based > on X.", and as a plain reader, I couldn't tell if that is saying > "the implementation happens to be similar or shares code with X" > (which is not all that interesting to the end user) or "the meaning > this mode tries to convey is the same as X but the presentation is a > bit different" (in which case the end user is hinted that it is > benefitial to understand what informacion the mode X shows and how). it is the latter, as I wanted to save typing. Maybe it was savings at the wrong place. > > For example, I view what I prefer (i.e. (2)) as a variant of Zebra > (i.e. (1)). Ok, so in a reroll I'll add Zebra as the base (It conveys all information that we have, any further modifications is just giving a better finish to the painting.) and then make commits on top of it adding the taping and water-dunking method on top. By organizing it in multiple commits the algorithm may become clearer. > Conceptually, you paint the diff output using water > soluble paint into a Zebra pattern, then apply thin strips of > protective tape to places where two Zebra colors are adjacent to > each other (i.e. do not cover the boundary between a block of a > Zebra colored moved lines and a block of context lines), dunk the > whole thing in water and then remove the strips of tape. Regions > covered by the strips of tape will retain the Zebra colors, while > the remainder of the Zebra colored part are colored in a much subdued > way. Understanding how Zebra mode marks the moved lines would help > understanding its output, but your implementation may not share much > code with the actual implementation of Zebra-painting. > > Thanks. > Thanks, Stefan
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
Stefan Beller writes: > The color experts agreed that (3) might be the best solution > as this gives most flexibility: > > "I would be happy as I can configure the bounds highlighting > to not exist, it would degenerate to a pure Zebra, which is > very simple to understand. Junio seemed to like (2) a lot, so > he would configure both dim colors to be 'context', but configure > the highlight colors to be attention drawing. So everybody would > be happy. It is also not too many colors, we are good at for loops." Another thing I found a bit confusing in the description of choices in the documentation was that description for some began with "Based on X.", and as a plain reader, I couldn't tell if that is saying "the implementation happens to be similar or shares code with X" (which is not all that interesting to the end user) or "the meaning this mode tries to convey is the same as X but the presentation is a bit different" (in which case the end user is hinted that it is benefitial to understand what informacion the mode X shows and how). For example, I view what I prefer (i.e. (2)) as a variant of Zebra (i.e. (1)). Conceptually, you paint the diff output using water soluble paint into a Zebra pattern, then apply thin strips of protective tape to places where two Zebra colors are adjacent to each other (i.e. do not cover the boundary between a block of a Zebra colored moved lines and a block of context lines), dunk the whole thing in water and then remove the strips of tape. Regions covered by the strips of tape will retain the Zebra colors, while the remainder of the Zebra colored part are colored in a much subdued way. Understanding how Zebra mode marks the moved lines would help understanding its output, but your implementation may not share much code with the actual implementation of Zebra-painting. Thanks.
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Wed, Jun 7, 2017 at 10:41 PM, Jacob Keller wrote: > On Mon, Jun 5, 2017 at 11:52 PM, Jacob Keller wrote: >> >> I will try to find some time tomorrow to go over it in detail. >> https://public-inbox.org/git/20170613023151.9688-1-sbel...@google.com/ restarted the discussion on this feature, so I assembled a team of color experts (my office mates), and we came to this conclusion more or less: There are different ways to implement a move detection. Different people prefer different things, the main contenders are: 1)"Zebra mode" (in the last patch known as "alternate" mode. This alternates between 2 colors for blocks of moved code. This may or may not reset to the first color, but the main features are: a) the whole block is in one color b) adjacent blocks have different colors. For this implementation we need 4 new colors (2 colors for +, 2 for -) 2)"Highlight bounds + dimming in between" A block is generally in a dim color. Depending on further configuration the boundaries are highlighted. There are different modes for that: a) all boundaries are highlighted b) bounds adjacent to other moved blocks of same sign are highlighted c) no bounds highlighted, moved code is only dimmed. (sucks for seeing blocks) d) [NEW] only the first (last) border line. This is consistent with the thread that started the discussion as this is done in the RFC for blame. The configured colors "oldMoved" and "oldMovedAlternative" are assumed that one is highlighting and the other is the dim color. Depending on the exact implementation we'd need up to 6 new colors (dim, highlight, highlight alternative for each + and -) The user may configure these to be the same though, e.g. both dims could be 'context'. 3) Combine 1) and 2) The algorithm would look like this: a) Find moved blocks b) Take one option from 2) to highlight the bounds. (all bounds or adjacent bounds are the only serious contenders IMHO) This needs 8 new colors configured: for each of (new, old): for each of (Zebra black, Zebra white): for each of (dim, highlight): define_color() The color experts agreed that (3) might be the best solution as this gives most flexibility: "I would be happy as I can configure the bounds highlighting to not exist, it would degenerate to a pure Zebra, which is very simple to understand. Junio seemed to like (2) a lot, so he would configure both dim colors to be 'context', but configure the highlight colors to be attention drawing. So everybody would be happy. It is also not too many colors, we are good at for loops." I hope to have summarized this adequately. Thanks, Stefan
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Mon, Jun 5, 2017 at 11:52 PM, Jacob Keller wrote: > > I will try to find some time tomorrow to go over it in detail. > > Thanks, > Jake For the record, I got pulled into a project at work, so I won't have spare time to look into this for the near future. :( Probably not until next week. Thanks, Jake
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Wed, Jun 7, 2017 at 2:58 PM, Ævar Arnfjörð Bjarmason wrote: > On Wed, Jun 7, 2017 at 8:28 PM, Stefan Beller wrote: >> On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller wrote: >>> On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty >>> wrote: On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller wrote: > > > [...] > > "git diff" has been taught to optionally paint new lines that are > > the same as deleted lines elsewhere differently from genuinely new > > lines. > > > > Are we happy with these changes? I've been studiously ignoring this patch series due to lack of bandwidth. > [...] > Things to come, but not in this series as they are more advanced: > > Discuss if a block/line needs a minimum requirement. > > When doing reviews with this series, a couple of lines such > as "\t\t}" were marked as a moved, which is not wrong as they > really occurred in the text with opposing sign. > But it was annoying as it drew my attention to just closing > braces, which IMO is not the point of code review. > > To solve this issue I had the idea of a "minimum requirement", e.g. > * at least 3 consecutive lines or > * at least one line with at least 3 non-ws characters or > * compute the entropy of a given moved block and if it is too low, do > not mark it up. Shooting from the hip here... It seems obvious that for a line to be marked as moved, a minimum requirement is that 1. The line appears as both "+" and "-". That doesn't seem strong enough evidence though, and if that is the only criterion, I would expect a lot of boilerplate lines like "\t\t}" to be marked as moved. It seems like a lot of noise could be eliminated by *also* requiring that 2a. The line doesn't appear elsewhere in the file(s) concerned. >> >> 'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' >> context)? >> >> This rule opens up the discussion on multi-copies, which I imagine >> happens a lot in configuration files. So say you have a prod and staging >> environment, then you might be tempted to make patches titled as: >> "1. preparation: duplicate common code into prod and staging" >> "2. Make an actual change to staging" >> >> For 1. you still want to see that there is faithful copy, but we'd have >> 2 postimages having these lines. >> >> Also what about de-duplication? >> I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test >> helper script pieces., 2005-05-31) for unrelated reasons, >> but the move coloring of the same content multiple times >> helped me there to focus on the relevant part. >> Rule (2a) would probably get rid of most boilerplate lines without having to try to measure entropy. >> >> But it would also get rid of good use cases when not being very careful. >> I intentionally left out the (2a) as I am not yet sure how the move >> detection for multiple occurrences in post and preimage should >> work in the desired case. The suppression of little-entropy closing braces >> might be a side effect of just this. Or it can be treated separately. >> Maybe you are already using both criteria? I didn't see it in a quick perusal of the code. OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved *only* because they appear elsewhere in the file(s). If you did so, you would have gaps of supposedly non-moved lines in the middle of moved blocks. This suggests marking as moved lines matching (1) and (2a) but also lines matching (1) and the following: 2b. The line is adjacent to to another line that is thought to have moved from the same old location to the same new location. >> >> This is what we do, a "block detection" by comparing "line runs" against >> the current lines. Based on these line runs we detect one block and >> color up adjacent blocks. >> Rule (2b) would be applied recursively, with the net effect being that any line satisfying (1) and (2a) is allowed to carry along any neighboring lines within the same "+"/"-" block even if they are not unique. >> >> So you are saying each block has to have at least one unique line? >> That doesn't go well with (de-)duplication IMHO. >> >> Thanks for your shot from the hip. I'll think about these rules more to see >> if I can make sense of them for duplication still. > > I've just been skimming this topic so far, but a question, what variant of: > > git diff ... | grep ... > > Can I use to see whether the diff that's being emitted has hunks > marked as moved? Presumably this needs -c ui.color=always & grepping > for the color codes. > > The use-case being to say add that diff | grep -q to a for-loop to > find all diffs in a repo that have hunks marked as moved. Early on in one of the first iterations, I had different signs instead of +,- to mark the moved parts
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Wed, Jun 7, 2017 at 8:28 PM, Stefan Beller wrote: > On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller wrote: >> On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty >> wrote: >>> On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller wrote: > [...] > "git diff" has been taught to optionally paint new lines that are > the same as deleted lines elsewhere differently from genuinely new > lines. > > Are we happy with these changes? >>> >>> >>> I've been studiously ignoring this patch series due to lack of bandwidth. >>> [...] Things to come, but not in this series as they are more advanced: Discuss if a block/line needs a minimum requirement. When doing reviews with this series, a couple of lines such as "\t\t}" were marked as a moved, which is not wrong as they really occurred in the text with opposing sign. But it was annoying as it drew my attention to just closing braces, which IMO is not the point of code review. To solve this issue I had the idea of a "minimum requirement", e.g. * at least 3 consecutive lines or * at least one line with at least 3 non-ws characters or * compute the entropy of a given moved block and if it is too low, do not mark it up. >>> >>> Shooting from the hip here... >>> >>> It seems obvious that for a line to be marked as moved, a minimum >>> requirement is that >>> >>> 1. The line appears as both "+" and "-". >>> >>> That doesn't seem strong enough evidence though, and if that is the >>> only criterion, I would expect a lot of boilerplate lines like "\t\t}" >>> to be marked as moved. It seems like a lot of noise could be >>> eliminated by *also* requiring that >>> >>> 2a. The line doesn't appear elsewhere in the file(s) concerned. > > 'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' > context)? > > This rule opens up the discussion on multi-copies, which I imagine > happens a lot in configuration files. So say you have a prod and staging > environment, then you might be tempted to make patches titled as: > "1. preparation: duplicate common code into prod and staging" > "2. Make an actual change to staging" > > For 1. you still want to see that there is faithful copy, but we'd have > 2 postimages having these lines. > > Also what about de-duplication? > I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test > helper script pieces., 2005-05-31) for unrelated reasons, > but the move coloring of the same content multiple times > helped me there to focus on the relevant part. > >>> >>> Rule (2a) would probably get rid of most boilerplate lines without >>> having to try to measure entropy. > > But it would also get rid of good use cases when not being very careful. > I intentionally left out the (2a) as I am not yet sure how the move > detection for multiple occurrences in post and preimage should > work in the desired case. The suppression of little-entropy closing braces > might be a side effect of just this. Or it can be treated separately. > >>> >>> Maybe you are already using both criteria? I didn't see it in a quick >>> perusal of the code. >>> >>> OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved >>> *only* because they appear elsewhere in the file(s). If you did so, >>> you would have gaps of supposedly non-moved lines in the middle of >>> moved blocks. This suggests marking as moved lines matching (1) and >>> (2a) but also lines matching (1) and the following: >>> >>> 2b. The line is adjacent to to another line that is thought to have >>> moved from the same old location to the same new location. > > This is what we do, a "block detection" by comparing "line runs" against > the current lines. Based on these line runs we detect one block and > color up adjacent blocks. > >>> >>> Rule (2b) would be applied recursively, with the net effect being that >>> any line satisfying (1) and (2a) is allowed to carry along any >>> neighboring lines within the same "+"/"-" block even if they are not >>> unique. > > So you are saying each block has to have at least one unique line? > That doesn't go well with (de-)duplication IMHO. > > Thanks for your shot from the hip. I'll think about these rules more to see > if I can make sense of them for duplication still. I've just been skimming this topic so far, but a question, what variant of: git diff ... | grep ... Can I use to see whether the diff that's being emitted has hunks marked as moved? Presumably this needs -c ui.color=always & grepping for the color codes. The use-case being to say add that diff | grep -q to a for-loop to find all diffs in a repo that have hunks marked as moved.
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller wrote: > On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty wrote: >> On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller wrote: >>> >>> > [...] >>> > "git diff" has been taught to optionally paint new lines that are >>> > the same as deleted lines elsewhere differently from genuinely new >>> > lines. >>> > >>> > Are we happy with these changes? >> >> >> I've been studiously ignoring this patch series due to lack of bandwidth. >> >>> [...] >>> Things to come, but not in this series as they are more advanced: >>> >>> Discuss if a block/line needs a minimum requirement. >>> >>> When doing reviews with this series, a couple of lines such >>> as "\t\t}" were marked as a moved, which is not wrong as they >>> really occurred in the text with opposing sign. >>> But it was annoying as it drew my attention to just closing >>> braces, which IMO is not the point of code review. >>> >>> To solve this issue I had the idea of a "minimum requirement", e.g. >>> * at least 3 consecutive lines or >>> * at least one line with at least 3 non-ws characters or >>> * compute the entropy of a given moved block and if it is too low, do >>> not mark it up. >> >> Shooting from the hip here... >> >> It seems obvious that for a line to be marked as moved, a minimum >> requirement is that >> >> 1. The line appears as both "+" and "-". >> >> That doesn't seem strong enough evidence though, and if that is the >> only criterion, I would expect a lot of boilerplate lines like "\t\t}" >> to be marked as moved. It seems like a lot of noise could be >> eliminated by *also* requiring that >> >> 2a. The line doesn't appear elsewhere in the file(s) concerned. 'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' context)? This rule opens up the discussion on multi-copies, which I imagine happens a lot in configuration files. So say you have a prod and staging environment, then you might be tempted to make patches titled as: "1. preparation: duplicate common code into prod and staging" "2. Make an actual change to staging" For 1. you still want to see that there is faithful copy, but we'd have 2 postimages having these lines. Also what about de-duplication? I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test helper script pieces., 2005-05-31) for unrelated reasons, but the move coloring of the same content multiple times helped me there to focus on the relevant part. >> >> Rule (2a) would probably get rid of most boilerplate lines without >> having to try to measure entropy. But it would also get rid of good use cases when not being very careful. I intentionally left out the (2a) as I am not yet sure how the move detection for multiple occurrences in post and preimage should work in the desired case. The suppression of little-entropy closing braces might be a side effect of just this. Or it can be treated separately. >> >> Maybe you are already using both criteria? I didn't see it in a quick >> perusal of the code. >> >> OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved >> *only* because they appear elsewhere in the file(s). If you did so, >> you would have gaps of supposedly non-moved lines in the middle of >> moved blocks. This suggests marking as moved lines matching (1) and >> (2a) but also lines matching (1) and the following: >> >> 2b. The line is adjacent to to another line that is thought to have >> moved from the same old location to the same new location. This is what we do, a "block detection" by comparing "line runs" against the current lines. Based on these line runs we detect one block and color up adjacent blocks. >> >> Rule (2b) would be applied recursively, with the net effect being that >> any line satisfying (1) and (2a) is allowed to carry along any >> neighboring lines within the same "+"/"-" block even if they are not >> unique. So you are saying each block has to have at least one unique line? That doesn't go well with (de-)duplication IMHO. Thanks for your shot from the hip. I'll think about these rules more to see if I can make sense of them for duplication still. Thanks, Stefan
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty wrote: > On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller wrote: >> >> > [...] >> > "git diff" has been taught to optionally paint new lines that are >> > the same as deleted lines elsewhere differently from genuinely new >> > lines. >> > >> > Are we happy with these changes? > > > I've been studiously ignoring this patch series due to lack of bandwidth. > >> [...] >> Things to come, but not in this series as they are more advanced: >> >> Discuss if a block/line needs a minimum requirement. >> >> When doing reviews with this series, a couple of lines such >> as "\t\t}" were marked as a moved, which is not wrong as they >> really occurred in the text with opposing sign. >> But it was annoying as it drew my attention to just closing >> braces, which IMO is not the point of code review. >> >> To solve this issue I had the idea of a "minimum requirement", e.g. >> * at least 3 consecutive lines or >> * at least one line with at least 3 non-ws characters or >> * compute the entropy of a given moved block and if it is too low, do >> not mark it up. > > Shooting from the hip here... > > It seems obvious that for a line to be marked as moved, a minimum > requirement is that > > 1. The line appears as both "+" and "-". > > That doesn't seem strong enough evidence though, and if that is the > only criterion, I would expect a lot of boilerplate lines like "\t\t}" > to be marked as moved. It seems like a lot of noise could be > eliminated by *also* requiring that > > 2a. The line doesn't appear elsewhere in the file(s) concerned. > > Rule (2a) would probably get rid of most boilerplate lines without > having to try to measure entropy. > > Maybe you are already using both criteria? I didn't see it in a quick > perusal of the code. > > OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved > *only* because they appear elsewhere in the file(s). If you did so, > you would have gaps of supposedly non-moved lines in the middle of > moved blocks. This suggests marking as moved lines matching (1) and > (2a) but also lines matching (1) and the following: > > 2b. The line is adjacent to to another line that is thought to have > moved from the same old location to the same new location. > > Rule (2b) would be applied recursively, with the net effect being that > any line satisfying (1) and (2a) is allowed to carry along any > neighboring lines within the same "+"/"-" block even if they are not > unique. > > Michael This sounds reasonable to me, though I'm not sure how easy it is to implement. Thanks, Jake
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller wrote: > > > [...] > > "git diff" has been taught to optionally paint new lines that are > > the same as deleted lines elsewhere differently from genuinely new > > lines. > > > > Are we happy with these changes? I've been studiously ignoring this patch series due to lack of bandwidth. > [...] > Things to come, but not in this series as they are more advanced: > > Discuss if a block/line needs a minimum requirement. > > When doing reviews with this series, a couple of lines such > as "\t\t}" were marked as a moved, which is not wrong as they > really occurred in the text with opposing sign. > But it was annoying as it drew my attention to just closing > braces, which IMO is not the point of code review. > > To solve this issue I had the idea of a "minimum requirement", e.g. > * at least 3 consecutive lines or > * at least one line with at least 3 non-ws characters or > * compute the entropy of a given moved block and if it is too low, do > not mark it up. Shooting from the hip here... It seems obvious that for a line to be marked as moved, a minimum requirement is that 1. The line appears as both "+" and "-". That doesn't seem strong enough evidence though, and if that is the only criterion, I would expect a lot of boilerplate lines like "\t\t}" to be marked as moved. It seems like a lot of noise could be eliminated by *also* requiring that 2a. The line doesn't appear elsewhere in the file(s) concerned. Rule (2a) would probably get rid of most boilerplate lines without having to try to measure entropy. Maybe you are already using both criteria? I didn't see it in a quick perusal of the code. OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved *only* because they appear elsewhere in the file(s). If you did so, you would have gaps of supposedly non-moved lines in the middle of moved blocks. This suggests marking as moved lines matching (1) and (2a) but also lines matching (1) and the following: 2b. The line is adjacent to to another line that is thought to have moved from the same old location to the same new location. Rule (2b) would be applied recursively, with the net effect being that any line satisfying (1) and (2a) is allowed to carry along any neighboring lines within the same "+"/"-" block even if they are not unique. Michael
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Mon, Jun 5, 2017 at 6:10 PM, Junio C Hamano wrote: > Stefan Beller writes: > >>> "git diff" has been taught to optionally paint new lines that are >>> the same as deleted lines elsewhere differently from genuinely new >>> lines. >>> >>> Are we happy with these changes? >> >> I advertised this series e.g. for reviewing Brandons >> repo object refactoring series and used it myself to inspect >> some patches there[1]. I am certainly happy (but biased) with >> what we have available there. >> >> Jacob intended to use this series >> for review as well, but has given no opinion yet. >> >> You seemed to have used it for js/blame-lib? >> >> -- >> Those patches had a wide reviewer audience cc'd, >> so I would think people are aware of this series. > > I tried to, yes. I haven't had a chance to see how well the current > iteration fares "does the externally-visible goal make sense?" test. > > I do not think I saw a negative "an approach to show this kind of > output would not be useful" reaction, so I assume at least people > would want an alternative output format that would help reviewing a > change that moves blocks of lines around. > > In any case, that is not a review. A patch series wanting to do a > good thing, and people agreeing that the externally visible effect > it produces matches that good thing, is one thing. A review that > makes sure the code achieves the externally visible effect well > (e.g. without overly inefficient algorithm, without buffer overflows > or underflows, off-by-ones, etc.) is another thing, and I haven't > seen anybody going with fine toothed comb to do that kind of review, > hence my "are we happy?" inquiry. > I will try to find some time tomorrow to go over it in detail. Thanks, Jake
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
On Mon, Jun 5, 2017 at 11:23 AM, Stefan Beller wrote: >> * sb/diff-color-move (2017-06-01) 17 commits >> - diff.c: color moved lines differently >> - diff: buffer all output if asked to >> - diff.c: emit_line includes whitespace highlighting >> - diff.c: convert diff_summary to use emit_line_* >> - diff.c: convert diff_flush to use emit_line_* >> - diff.c: convert word diffing to use emit_line_* >> - diff.c: convert show_stats to use emit_line_* >> - diff.c: convert emit_binary_diff_body to use emit_line_* >> - submodule.c: convert show_submodule_summary to use emit_line_fmt >> - diff.c: convert emit_rewrite_lines to use emit_line_* >> - diff.c: convert emit_rewrite_diff to use emit_line_* >> - diff.c: convert builtin_diff to use emit_line_* >> - diff.c: convert fn_out_consume to use emit_line >> - diff: introduce more flexible emit function >> - diff.c: factor out diff_flush_patch_all_file_pairs >> - diff: move line ending check into emit_hunk_header >> - diff: readability fix >> >> "git diff" has been taught to optionally paint new lines that are >> the same as deleted lines elsewhere differently from genuinely new >> lines. >> >> Are we happy with these changes? > > I advertised this series e.g. for reviewing Brandons > repo object refactoring series and used it myself to inspect > some patches there[1]. I am certainly happy (but biased) with > what we have available there. > > Jacob intended to use this series > for review as well, but has given no opinion yet. I haven't had any problems thus far. Been using it for the past few days at $DAYJOB. Haven't said anything yet because I haven't really had anything to add. I like it. Thanks, Jake > > You seemed to have used it for js/blame-lib? > > -- > Those patches had a wide reviewer audience cc'd, > so I would think people are aware of this series. > > -- > Things to come, but not in this series as they are more advanced: > > Discuss if a block/line needs a minimum requirement. > > When doing reviews with this series, a couple of lines such > as "\t\t}" were marked as a moved, which is not wrong as they > really occurred in the text with opposing sign. > But it was annoying as it drew my attention to just closing > braces, which IMO is not the point of code review. > > To solve this issue I had the idea of a "minimum requirement", e.g. > * at least 3 consecutive lines or > * at least one line with at least 3 non-ws characters or > * compute the entropy of a given moved block and if it is too low, do > not mark it up. > > I am not sure if such a "minimum requirement" is the right approach > at all. The nature of this discussion comes close to the diff heuristics > at which Michael did present a wonderful solution, hence I had him cc'd > on the series as he may have some good insights on how to improve > the diffs. :) > > -- > In conclusion: > > We are happy to move to next as it seems technically sound. > > But we want more exposure on usage to point out UX bugs. > (e.g. is the default mode for just giving --color-moved good for the > majority of people/use cases? Are there subtle annoyances such > as the closing braces?) > > So maybe merge to next with the strong option to evict it > when finding more fundamentally wrong things? > > Thanks, > Stefan > > [1] > https://public-inbox.org/git/cagz79kzjf9idsvgyi-hskb6n8w0uhvcu4w-r89f0erjpxe_...@mail.gmail.com/
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
Stefan Beller writes: >> "git diff" has been taught to optionally paint new lines that are >> the same as deleted lines elsewhere differently from genuinely new >> lines. >> >> Are we happy with these changes? > > I advertised this series e.g. for reviewing Brandons > repo object refactoring series and used it myself to inspect > some patches there[1]. I am certainly happy (but biased) with > what we have available there. > > Jacob intended to use this series > for review as well, but has given no opinion yet. > > You seemed to have used it for js/blame-lib? > > -- > Those patches had a wide reviewer audience cc'd, > so I would think people are aware of this series. I tried to, yes. I haven't had a chance to see how well the current iteration fares "does the externally-visible goal make sense?" test. I do not think I saw a negative "an approach to show this kind of output would not be useful" reaction, so I assume at least people would want an alternative output format that would help reviewing a change that moves blocks of lines around. In any case, that is not a review. A patch series wanting to do a good thing, and people agreeing that the externally visible effect it produces matches that good thing, is one thing. A review that makes sure the code achieves the externally visible effect well (e.g. without overly inefficient algorithm, without buffer overflows or underflows, off-by-ones, etc.) is another thing, and I haven't seen anybody going with fine toothed comb to do that kind of review, hence my "are we happy?" inquiry.
Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
> * sb/diff-color-move (2017-06-01) 17 commits > - diff.c: color moved lines differently > - diff: buffer all output if asked to > - diff.c: emit_line includes whitespace highlighting > - diff.c: convert diff_summary to use emit_line_* > - diff.c: convert diff_flush to use emit_line_* > - diff.c: convert word diffing to use emit_line_* > - diff.c: convert show_stats to use emit_line_* > - diff.c: convert emit_binary_diff_body to use emit_line_* > - submodule.c: convert show_submodule_summary to use emit_line_fmt > - diff.c: convert emit_rewrite_lines to use emit_line_* > - diff.c: convert emit_rewrite_diff to use emit_line_* > - diff.c: convert builtin_diff to use emit_line_* > - diff.c: convert fn_out_consume to use emit_line > - diff: introduce more flexible emit function > - diff.c: factor out diff_flush_patch_all_file_pairs > - diff: move line ending check into emit_hunk_header > - diff: readability fix > > "git diff" has been taught to optionally paint new lines that are > the same as deleted lines elsewhere differently from genuinely new > lines. > > Are we happy with these changes? I advertised this series e.g. for reviewing Brandons repo object refactoring series and used it myself to inspect some patches there[1]. I am certainly happy (but biased) with what we have available there. Jacob intended to use this series for review as well, but has given no opinion yet. You seemed to have used it for js/blame-lib? -- Those patches had a wide reviewer audience cc'd, so I would think people are aware of this series. -- Things to come, but not in this series as they are more advanced: Discuss if a block/line needs a minimum requirement. When doing reviews with this series, a couple of lines such as "\t\t}" were marked as a moved, which is not wrong as they really occurred in the text with opposing sign. But it was annoying as it drew my attention to just closing braces, which IMO is not the point of code review. To solve this issue I had the idea of a "minimum requirement", e.g. * at least 3 consecutive lines or * at least one line with at least 3 non-ws characters or * compute the entropy of a given moved block and if it is too low, do not mark it up. I am not sure if such a "minimum requirement" is the right approach at all. The nature of this discussion comes close to the diff heuristics at which Michael did present a wonderful solution, hence I had him cc'd on the series as he may have some good insights on how to improve the diffs. :) -- In conclusion: We are happy to move to next as it seems technically sound. But we want more exposure on usage to point out UX bugs. (e.g. is the default mode for just giving --color-moved good for the majority of people/use cases? Are there subtle annoyances such as the closing braces?) So maybe merge to next with the strong option to evict it when finding more fundamentally wrong things? Thanks, Stefan [1] https://public-inbox.org/git/cagz79kzjf9idsvgyi-hskb6n8w0uhvcu4w-r89f0erjpxe_...@mail.gmail.com/