Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-18 Thread Yoshioka Tsuneo
Hello Junio

 In the [PATCH v7], I changed to keep filename part of suffix to handle
 above case, but not always keep directory part because I feel totally
 keeping all part of long suffix including directory name may cause output 
 like:
…{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved 
 And, above may be worse than:
   ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved
 I think.
 
 I am not sure if I agree.
 
 Losing LongPath2 part may be more significant data loss than losing
 a single bit that says the change is a rename, as the latter may not
 quite tell us what these two directories were anyway.
I'm not sure which is the better in general.
But anyway, I don't have strong opinion about this.
So, I just changed to keep the all of the sfx part(lator than '}').
I just sent the updated patch as [PATCH v8].

Thanks !

---
Tsuneo Yoshioka (吉岡 恒夫)
yoshiokatsu...@gmail.com




On Oct 18, 2013, at 1:38 AM, Junio C Hamano gits...@pobox.com wrote:

 Yoshioka Tsuneo yoshiokatsu...@gmail.com writes:
 
 In the [PATCH v7], I changed to keep filename part of suffix to handle
 above case, but not always keep directory part because I feel totally
 keeping all part of long suffix including directory name may cause output 
 like:
…{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved 
 And, above may be worse than:
   ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved
 I think.
 
 I am not sure if I agree.
 
 Losing LongPath2 part may be more significant data loss than losing
 a single bit that says the change is a rename, as the latter may not
 quite tell us what these two directories were anyway.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-17 Thread Junio C Hamano
Yoshioka Tsuneo yoshiokatsu...@gmail.com writes:

 git diff -M --stat can detect rename and show renamed file name like
 foofoofoo = barbarbar.
 Before this commit, this output is shortened always by omitting left most
 part like ...foo = barbarbar. So, if the destination filename is too long,
 source filename putting left or arrow can be totally omitted like
 ...barbarbar, without including any of foofoofoo =.
 In such a case where arrow symbol is omitted, there is no way to know
 whether the file is renamed or existed in the original.
 Make sure there is always an arrow, like ...foo = ...bar.
 The output can contain curly braces('{','}') for grouping.
 So, in general, the output format is pfx{mid_a = mid_b}sfx
 To keep arrow(=), try to omit pfx as long as possible at first
 because later part or changing part will be the more important part.
 If it is not enough, shorten mid_a, mid_b, and sfx trying to
 have the maximum length the same because those will be equally important.

I somehow find this solid wall of text extremely hard to
read. Adding a blank line as a paragraph break may make it easier to
read, perhaps.

Also it is customary in our history to omit the full-stop from the
patch title on the Subject: line.

 + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow)
 + + (use_curly_braces ? 2 : 0);
 +
 + if (name_len = name_width) {
 + /* Everthing fits in name_width */
 + return;
 + }

Logic up to this point seems good; drop {} around a single statement
return;, i.e.

if (name_len = name_width)
return; /* everything fits */

 + } else {
 + if (pfx-len  strlen(dots)) {
 + /*
 +  * Just omitting left of '{' is not enough
 +  * name will be ...{SOMETHING}SOMETHING
 +  */
 + strbuf_reset(pfx);
 + strbuf_addstr(pfx, dots);
 + }

(mental note) ... otherwise, i.e. with a short common prefix, the
final result will be ab{SOMETHING}SOMETHING, which is also fine
for the purpose of the remainder of this function.

 + }
 + }
 +
 + /* available length for a_mid, b_mid and sfx */
 + len = name_width - strlen(arrow) - (use_curly_braces ? 2 : 0);
 +
 + /* a_mid, b_mid, sfx will be have the same max, including 
 ellipsis(...). */
 + part_length[0] = a_mid-len;
 + part_length[1] = b_mid-len;
 + part_length[2] = sfx-len;
 +
 + qsort(part_length, sizeof(part_length)/sizeof(part_length[0]), 
 sizeof(part_length[0])
 +   , compare_size_t_descending_order);

In our code, comma does not come at the beginning of continued
line.

 + if (part_length[1] + part_length[1] + part_length[2] = len) {
 + /*
 +  * {...foofoo = barbar}file
 +  * There is only one omitted part.
 +  */
 + max_part_len = len - part_length[1] - part_length[2];

It would be clearer to explicitly set remainder to zero here, and
omit the initialization of the variable.  That would make what the
three parts of if/elseif/else do more consistent.

 + } else if (part_length[2] + part_length[2] + part_length[2] = len) {
 + /*
 +  * {...foofoo = ...barbar}file
 +  * There are 2 omitted parts.
 +  */
 + max_part_len = (len - part_length[2]) / 2;
 + remainder_part_len = (len - part_length[2]) - max_part_len * 2;
 + } else {
 + /*
 +  * {...ofoo = ...rbar}...file
 +  * There are 3 omitted parts.
 +  */
 + max_part_len = len / 3;
 + remainder_part_len = len - (max_part_len) * 3;
 + }

I am not sure if distributing the burden of truncation equally to
three parts so that the resulting pieces are of similar lengths is
really a good idea.  Between these two

{...SourceDirectory = ...nationDirectory}...ileThatWasMoved 
{...ceDirectory = ...ionDirectory}nameOfTheFileThatWasMoved

that attempt to show that the file nameOfTheFileThatWasMoved was
moved from the longSourceDirectory to the DestinationDirectory, the
latter is much more informative, I would think.

 diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
 index 2f327b7..03d6371 100755
 --- a/t/t4001-diff-rename.sh
 +++ b/t/t4001-diff-rename.sh
 @@ -156,4 +156,16 @@ test_expect_success 'rename pretty print common prefix 
 and suffix overlap' '
   test_i18ngrep  d/f/{ = f}/e  output
  '
  
 +test_expect_success 'rename of very long path shows =' '
 + mkdir long_dirname_that_does_not_fit_in_a_single_line 
 + mkdir another_extremely_long_path_but_not_the_same_as_the_first 
 + cp path1 long_dirname*/ 
 + git add long_dirname*/path1 
 + test_commit add_long_pathname 
 + git mv long_dirname*/path1 

Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-17 Thread Junio C Hamano
Yoshioka Tsuneo yoshiokatsu...@gmail.com writes:

 Before this commit, this output is shortened always by omitting left most
 part like ...foo = barbarbar. So, if the destination filename is too long,
 source filename putting left or arrow can be totally omitted like
 ...barbarbar, without including any of foofoofoo =.

This is an explanation much easier to understand than the one in the
previous iteration.  Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-17 Thread Yoshioka Tsuneo
Hello Junio

Thank you very much for the reviewing.
I try to fix the issues, and posted the updated patch as [PATCH v7].

 I am not sure if distributing the burden of truncation equally to
 three parts so that the resulting pieces are of similar lengths is
 really a good idea.  Between these two
 
   {...SourceDirectory = ...nationDirectory}...ileThatWasMoved 
   {...ceDirectory = ...ionDirectory}nameOfTheFileThatWasMoved
 
 that attempt to show that the file nameOfTheFileThatWasMoved was
 moved from the longSourceDirectory to the DestinationDirectory, the
 latter is much more informative, I would think.
In the [PATCH v7], I changed to keep filename part of suffix to handle
above case, but not always keep directory part because I feel totally
keeping all part of long suffix including directory name may cause output like:
…{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved 
And, above may be worse than:
   ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved
I think.

Thank you !

---
Tsuneo Yoshioka (吉岡 恒夫)
yoshiokatsu...@gmail.com




On Oct 17, 2013, at 10:29 PM, Junio C Hamano gits...@pobox.com wrote:

 Yoshioka Tsuneo yoshiokatsu...@gmail.com writes:
 
 git diff -M --stat can detect rename and show renamed file name like
 foofoofoo = barbarbar.
 Before this commit, this output is shortened always by omitting left most
 part like ...foo = barbarbar. So, if the destination filename is too long,
 source filename putting left or arrow can be totally omitted like
 ...barbarbar, without including any of foofoofoo =.
 In such a case where arrow symbol is omitted, there is no way to know
 whether the file is renamed or existed in the original.
 Make sure there is always an arrow, like ...foo = ...bar.
 The output can contain curly braces('{','}') for grouping.
 So, in general, the output format is pfx{mid_a = mid_b}sfx
 To keep arrow(=), try to omit pfx as long as possible at first
 because later part or changing part will be the more important part.
 If it is not enough, shorten mid_a, mid_b, and sfx trying to
 have the maximum length the same because those will be equally important.
 
 I somehow find this solid wall of text extremely hard to
 read. Adding a blank line as a paragraph break may make it easier to
 read, perhaps.
 
 Also it is customary in our history to omit the full-stop from the
 patch title on the Subject: line.
 
 +name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow)
 ++ (use_curly_braces ? 2 : 0);
 +
 +if (name_len = name_width) {
 +/* Everthing fits in name_width */
 +return;
 +}
 
 Logic up to this point seems good; drop {} around a single statement
 return;, i.e.
 
   if (name_len = name_width)
   return; /* everything fits */
 
 +} else {
 +if (pfx-len  strlen(dots)) {
 +/*
 + * Just omitting left of '{' is not enough
 + * name will be ...{SOMETHING}SOMETHING
 + */
 +strbuf_reset(pfx);
 +strbuf_addstr(pfx, dots);
 +}
 
 (mental note) ... otherwise, i.e. with a short common prefix, the
 final result will be ab{SOMETHING}SOMETHING, which is also fine
 for the purpose of the remainder of this function.
 
 +}
 +}
 +
 +/* available length for a_mid, b_mid and sfx */
 +len = name_width - strlen(arrow) - (use_curly_braces ? 2 : 0);
 +
 +/* a_mid, b_mid, sfx will be have the same max, including 
 ellipsis(...). */
 +part_length[0] = a_mid-len;
 +part_length[1] = b_mid-len;
 +part_length[2] = sfx-len;
 +
 +qsort(part_length, sizeof(part_length)/sizeof(part_length[0]), 
 sizeof(part_length[0])
 +  , compare_size_t_descending_order);
 
 In our code, comma does not come at the beginning of continued
 line.
 
 +if (part_length[1] + part_length[1] + part_length[2] = len) {
 +/*
 + * {...foofoo = barbar}file
 + * There is only one omitted part.
 + */
 +max_part_len = len - part_length[1] - part_length[2];
 
 It would be clearer to explicitly set remainder to zero here, and
 omit the initialization of the variable.  That would make what the
 three parts of if/elseif/else do more consistent.
 
 +} else if (part_length[2] + part_length[2] + part_length[2] = len) {
 +/*
 + * {...foofoo = ...barbar}file
 + * There are 2 omitted parts.
 + */
 +max_part_len = (len - part_length[2]) / 2;
 +remainder_part_len = (len - part_length[2]) - max_part_len * 2;
 +} else {
 +/*
 + * {...ofoo = ...rbar}...file
 + * There are 3 omitted parts.
 + */
 +max_part_len = len / 3;
 +remainder_part_len = len - (max_part_len) * 3;
 +}
 
 

Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-17 Thread Junio C Hamano
Yoshioka Tsuneo yoshiokatsu...@gmail.com writes:

 In the [PATCH v7], I changed to keep filename part of suffix to handle
 above case, but not always keep directory part because I feel totally
 keeping all part of long suffix including directory name may cause output 
 like:
 …{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved 
 And, above may be worse than:
...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved
 I think.

I am not sure if I agree.

Losing LongPath2 part may be more significant data loss than losing
a single bit that says the change is a rename, as the latter may not
quite tell us what these two directories were anyway.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.

2013-10-16 Thread Yoshioka Tsuneo

git diff -M --stat can detect rename and show renamed file name like
foofoofoo = barbarbar.
Before this commit, this output is shortened always by omitting left most
part like ...foo = barbarbar. So, if the destination filename is too long,
source filename putting left or arrow can be totally omitted like
...barbarbar, without including any of foofoofoo =.
In such a case where arrow symbol is omitted, there is no way to know
whether the file is renamed or existed in the original.
Make sure there is always an arrow, like ...foo = ...bar.
The output can contain curly braces('{','}') for grouping.
So, in general, the output format is pfx{mid_a = mid_b}sfx
To keep arrow(=), try to omit pfx as long as possible at first
because later part or changing part will be the more important part.
If it is not enough, shorten mid_a, mid_b, and sfx trying to
have the maximum length the same because those will be equally important.

Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com
Test-added-by: Thomas Rast tr...@inf.ethz.ch
---
 diff.c | 180 +++--
 t/t4001-diff-rename.sh |  12 
 2 files changed, 170 insertions(+), 22 deletions(-)

diff --git a/diff.c b/diff.c
index a04a34d..afe6a36 100644
--- a/diff.c
+++ b/diff.c
@@ -1258,11 +1258,13 @@ static void fn_out_consume(void *priv, char *line, 
unsigned long len)
}
 }
 
-static char *pprint_rename(const char *a, const char *b)
+static void find_common_prefix_suffix(const char *a, const char *b,
+   struct strbuf *pfx,
+   struct strbuf *a_mid, struct strbuf *b_mid,
+   struct strbuf *sfx)
 {
const char *old = a;
const char *new = b;
-   struct strbuf name = STRBUF_INIT;
int pfx_length, sfx_length;
int pfx_adjust_for_slash;
int len_a = strlen(a);
@@ -1272,10 +1274,9 @@ static char *pprint_rename(const char *a, const char *b)
int qlen_b = quote_c_style(b, NULL, NULL, 0);
 
if (qlen_a || qlen_b) {
-   quote_c_style(a, name, NULL, 0);
-   strbuf_addstr(name,  = );
-   quote_c_style(b, name, NULL, 0);
-   return strbuf_detach(name, NULL);
+   quote_c_style(a, a_mid, NULL, 0);
+   quote_c_style(b, b_mid, NULL, 0);
+   return;
}
 
/* Find common prefix */
@@ -1322,17 +1323,142 @@ static char *pprint_rename(const char *a, const char 
*b)
if (b_midlen  0)
b_midlen = 0;
 
-   strbuf_grow(name, pfx_length + a_midlen + b_midlen + sfx_length + 7);
-   if (pfx_length + sfx_length) {
-   strbuf_add(name, a, pfx_length);
+   strbuf_add(pfx, a, pfx_length);
+   strbuf_add(a_mid, a + pfx_length, a_midlen);
+   strbuf_add(b_mid, b + pfx_length, b_midlen);
+   strbuf_add(sfx, a + len_a - sfx_length, sfx_length);
+}
+
+static int compare_size_t_descending_order(const void *left, const void *right)
+{
+   size_t left_val = *(size_t *)left;
+   size_t right_val = *(size_t *)right;
+   return (int)(right_val - left_val);
+}
+
+/*
+ * Omit each parts to fix in name_width.
+ * Formatted string is pfx{a_mid = b_mid}sfx.
+ * At first, omit pfx as long as possible.
+ * If it is not enough, omit a_mid, b_mid, sfx by tring to set the 
length of
+ * those 3 parts(including ...) to the same.
+ * Ex:
+ * foofoofoo = barbarbar
+ *   will be like
+ * ...foo = ...bar.
+ * long_parent{foofoofoo = barbarbar}longfilename
+ *   will be like
+ * ...parent{...foofoo = ...barbar}...lename
+ */
+static void rename_omit(struct strbuf *pfx,
+   struct strbuf *a_mid, struct strbuf *b_mid,
+   struct strbuf *sfx,
+   int name_width)
+{
+   static const char arrow[] =  = ;
+   static const char dots[] = ...;
+   int use_curly_braces = (pfx-len  0) || (sfx-len  0);
+   size_t name_len;
+   size_t len;
+   size_t part_length[3];
+   size_t max_part_len = 0;
+   size_t remainder_part_len = 0;
+
+   name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow)
+   + (use_curly_braces ? 2 : 0);
+
+   if (name_len = name_width) {
+   /* Everthing fits in name_width */
+   return;
+   }
+
+   if (use_curly_braces) {
+   if (strlen(dots) + (name_len - pfx-len) = name_width) {
+   /*
+* Just omitting left of '{' is enough
+* Ex: ...aaa{foofoofoo = bar}file
+*/
+   strbuf_splice(pfx, 0, name_len - name_width + 
strlen(dots), dots, strlen(dots));
+   return;
+   } else {
+   if (pfx-len  strlen(dots)) {
+   /*
+* Just omitting left of '{' is not