Re: [PATCH v5] diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible.

Yoshioka Tsuneo Wed, 16 Oct 2013 02:55:03 -0700

Hello Junio, Keshav

Thank you very much for your detailed reviewing.
I just tried to update the patch and posted as "[PATCH v6]".


---
Tsuneo Yoshioka (吉岡 恒夫)
[email protected]




On Oct 16, 2013, at 1:54 AM, Junio C Hamano <[email protected]> wrote:

> Yoshioka Tsuneo <[email protected]> writes:
> 
>> "git diff -M --stat" can detect rename and show renamed file name like
>> "foofoofoo => barbarbar". But if destination filename is long, the line
>> is shortened like "...barbarbar" so there is no way to know whether the
>> file is renamed or existed in the source commit.
> 
> Is "destination" filename more special than the source filename?
> Perhaps "s/if destination filename is/if filenames are/"?
> 
>       Note: I do not want you to reroll using the suggested
>       wording without explanation; it may be possible that I am
>       missing something obvious and do not understand why you
>       singled out destination, in which case I'd rather see it
>       explained better in the log message than the potentially
>       suboptimal suggestion I made in the review without
>       understanding the issue. Of course, it is possible that you
>       want to do the same when source is overlong, in which case
>       you can just say "Yeah, you're right; will reroll".
> 
>        The above applies to all the other comments in this message.
> 
> Also "s/source commit/original/".  You may not be comparing two
> commits after all.
> 
>> Make sure there is always an arrow, like "...foo => ...bar".
>> The output can contains curly braces('{','}') for grouping.
> 
> s/contains/contain/;
> 
>> So, in general, the outpu format is "<pfx>{<mid_a> => <mid_b>}<sfx>"
> 
> s/outpu/&t/;
> 
>> To keep arrow("=>"), try to omit <pfx> as long as possible at first
>> because later part or changing part will be the more important part.
>> If it is not enough, shorten <mid_a>, <mid_b>, and <sfx> trying to
>> have the maximum length the same because those will be equaly important.
> 
> A sound reasoning.
> 
>> Signed-off-by: Tsuneo Yoshioka <[email protected]>
>> Test-added-by: Thomas Rast <[email protected]>
>> ---
>> diff.c                 | 187 
>> +++++++++++++++++++++++++++++++++++++++++++------
>> t/t4001-diff-rename.sh |  12 ++++
>> 2 files changed, 177 insertions(+), 22 deletions(-)
>> 
>> diff --git a/diff.c b/diff.c
>> index a04a34d..cf50807 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1258,11 +1258,12 @@ static void fn_out_consume(void *priv, char *line, 
>> unsigned long len)
>>      }
>> }
>> 
>> -static char *pprint_rename(const char *a, const char *b)
>> +static void pprint_rename_find_common_prefix_suffix(const char *a, const 
>> char *b
>> +                                                                            
>>                         , struct strbuf *pfx, struct strbuf *a_mid
>> +                                                                            
>>                         , struct strbuf *b_mid, struct strbuf *sfx)
> 
> What kind of line splitting is this?
> 
> I think the real issue is that the function name is overly long, but
> aside from that,
> 
> - comma comes at the end of the line, not at the beginning of the
>   next line;
> 
> - the second and subsequent lines are indented, but not more than
>   the usual line width (align with the first letter inside the
>   opening parenthesis of the first line);
> 
> - a_mid and b_mid are more "alike" than pfx and a_mid.
> 
> so I would expect to see it more like:
> 
> static void abbrev_rename(const char *a, const char *b,
>                         struct strbuf *pfx,
>                         struct strbuf *a_mid, struct strbuf *b_mid,
>                         struct strbuf *sfx)
> 
> Note that the suggested name does not say "pprint", because in your
> version of this file, the code around here is no longer doing any
> printing.  The caller does so after using this function to decide
> how to abbreviate renames, so naming the helper function after what
> it does (e.g. abbreviate renames) is more appropriate.
> 
>> {
>>      const char *old = a;
>>      const char *new = b;
>> -    struct strbuf name = STRBUF_INIT;
>>      int pfx_length, sfx_length;
>>      int pfx_adjust_for_slash;
>>      int len_a = strlen(a);
>> @@ -1272,10 +1273,9 @@ static char *pprint_rename(const char *a, const char 
>> *b)
>>      int qlen_b = quote_c_style(b, NULL, NULL, 0);
>> 
>>      if (qlen_a || qlen_b) {
>> -            quote_c_style(a, &name, NULL, 0);
>> -            strbuf_addstr(&name, " => ");
>> -            quote_c_style(b, &name, NULL, 0);
>> -            return strbuf_detach(&name, NULL);
>> +            quote_c_style(a, a_mid, NULL, 0);
>> +            quote_c_style(b, b_mid, NULL, 0);
>> +            return;
>>      }
>> 
>>      /* Find common prefix */
>> @@ -1321,18 +1321,151 @@ static char *pprint_rename(const char *a, const 
>> char *b)
>>              a_midlen = 0;
>>      if (b_midlen < 0)
>>              b_midlen = 0;
>> +    
> 
> Trailing whitespace (there are many others you added to this file; I
> won't bother to point out all of them).
> 
>> +    strbuf_add(pfx, a, pfx_length);
>> +    strbuf_add(a_mid, a + pfx_length, a_midlen);
>> +    strbuf_add(b_mid, b + pfx_length, b_midlen);
>> +    strbuf_add(sfx, a + len_a - sfx_length, sfx_length);
>> +}
>> +
>> +/*
>> + * Omit each parts to fix in name_width.
>> + * Formatted string is "<pfx>{<a_mid> => <b_mid>}<sfx>".
>> + * At first, omit <pfx> as long as possible.
>> + * If it is not enough, omit <a_mid>, <b_mid>, <sfx> by tring to set the 
>> length of
>> + * those 3 parts(including "...") to the same.
>> + * Ex:
>> + * "foofoofoo => barbarbar"
>> + *   will be like
>> + * "...foo => ...bar".
>> + * "long_parent{foofoofoo => barbarbar}longfilename"
>> + *   will be like
>> + * "...parent{...foofoo => ...barbar}...lename"
>> + */
>> +static void pprint_rename_omit(struct strbuf *pfx, struct strbuf *a_mid, 
>> struct strbuf *b_mid
>> +                                                       , struct strbuf 
>> *sfx, int name_width)
> 
> Bad line splitting.
> 
>> +{
>> +
>> +#define ARROW " => "
>> +#define ELLIPSIS "..."
> 
> Ugly and leaks these symbols after the function is done using them
> to the remainder of this file.  Write them like this instead, perhaps?
> 
>       static const char arrow[] = " => ";
>        static const char dots[] = "...";
> 
>> +#define swap(a, b) myswap((a), (b), sizeof(a))
>> +    
>> +#define myswap(a, b, size) do {             \
>> +unsigned char mytmp[size];  \
>> +memcpy(mytmp, &a, size);            \
>> +memcpy(&a, &b, size);               \
>> +memcpy(&b, mytmp, size);            \
>> +} while (0)
> 
> These are totally unneeded, I suspect (see below).
> 
>> +
>> +    int use_curly_braces = (pfx->len > 0) || (sfx->len > 0);
>> +    size_t name_len;
>> +    size_t len;
>> +    size_t part_lengths[4];
> 
> Do not name an array in plural, i.e. elements[], unless there is a
> compelling reason to do so.  By using singular, e.g. element[], the
> third element can be spelled as element[3], which is more logical
> than having to call it elements[3].
> 
>       Side note. a notable exception is an array that is used as a
>       hash-table and frequently passed around as an argument; you
>       are usually not interested in iterating over it in ascending
>       order, and being able to call such a collection of things
>       "things" in plural, e.g. struct object objects[], is more
>       important.
> 
>> +    size_t max_part_len = 0;
>> +    size_t remainder_part_len = 0;
>> +    int i, j;
>> +
>> +    name_len = pfx->len + a_mid->len + b_mid->len + sfx->len + strlen(ARROW)
>> +            + (use_curly_braces ? 2 : 0);
>> +    
>> +    if (name_len <= name_width) {
>> +            /* Everthing fits in name_width */
>> +            return;
>> +    }
>> +    
>> +    if (use_curly_braces) {
>> +            if (strlen(ELLIPSIS) + (name_len - pfx->len) <= name_width) {
>> +                    /*
>> +                     Just omitting left of '{' is enough
>> +                     Ex: ...aaa{foofoofoo => bar}file
>> +                     */
> 
>       /*
>         * We format our multi-line
>         * comments like
>         * this.
>         */
> 
>> +                    strbuf_splice(pfx, name_len - pfx->len, name_width - 
>> (name_len - pfx->len), ELLIPSIS, strlen(ELLIPSIS));
> 
> Overlong line.
> 
> Is the math for the second and third arguments correct?  If you are
> making "abcdefghij" into "...hij", you would splice at position 0
> for length up to 'g', so it felt strange to see any arithmetic as
> the second argument, but I didn't look at this code very closely.
> 
>> +                    return;
>> +            } else {
>> +                    if (pfx->len > strlen(ELLIPSIS)) {
>> +                            /*
>> +                             Just omitting left of '{' is not enough
>> +                             name will be "...{SOMETHING}SOMETHING"
>> +                             */
>> +                            strbuf_reset(pfx);
>> +                            strbuf_addstr(pfx, ELLIPSIS);
>> +                    }
>> +            }
>> +    }
>> 
>> -    strbuf_grow(&name, pfx_length + a_midlen + b_midlen + sfx_length + 7);
>> -    if (pfx_length + sfx_length) {
>> -            strbuf_add(&name, a, pfx_length);
>> +    /* available length for a_mid, b_mid and sfx */
>> +    len = name_width - strlen(ARROW) - (use_curly_braces ? 2 : 0);
>> +    
>> +    /* a_mid, b_mid, sfx will be have the same max, including 
>> ellipsis("..."). */
>> +    part_lengths[0] = (int)a_mid->len;
>> +    part_lengths[1] = (int)b_mid->len;
>> +    part_lengths[2] = (int)sfx->len;
> 
> What are these casts about?  strbuf.len is of size_t which is
> already the correct type for part_length[].
> 
>> +    
>> +    /* bubble sort of part_lengths, descending order */
> 
> Do not bubble sort.  Unless there is a compelling reason not to
> (liek you are in a performance critical section and want to use a
> custom sort algorithm), just let the platform-supplied qsort(3) do
> the job by writing a small comparison function.
> 
>> +    for (i=0; i<3; i++) {
>> +            for (j=i+1; j<3; j++) {
>> +                    if (part_lengths[j] > part_lengths[i]) {
>> +                            swap(part_lengths[i], part_lengths[j]);
>> +                    }
>> +            }
>> +    }
>> +    
>> +    if (part_lengths[1] + part_lengths[1] + part_lengths[2] <= len) {
>> +            /*
>> +             * "{...foofoo => barbar}file"
>> +             * There is only one omitting part.
> 
> s/omitting/omitted/;
> 
>> +             */
>> +            max_part_len = len - part_lengths[1] - part_lengths[2];
>> +    } else if (part_lengths[2] + part_lengths[2] + part_lengths[2] <= len) {
>> +            /*
>> +             * "{...foofoo => ...barbar}file"
>> +             * There are 2 omitting part.
> 
> s/omitting part/omitted parts/;
> 
>> +             */
>> +            max_part_len = (len - part_lengths[2]) / 2;
>> +            remainder_part_len = (len - part_lengths[2]) - max_part_len * 2;
>> +    } else {
>> +            /*
>> +             * "{...ofoo => ...rbar}...file"
>> +             * There are 3 omitting part.
> 
> Likewise.
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5] diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible.

Reply via email to