From: Linus Torvalds <torva...@linux-foundation.org>
Date: Wed, 16 Mar 2016 09:15:53 -0700
Subject: [PATCH] pretty-print: de-tabify indented logs to make things line up 
properly

This should all line up:

  Column 1      Column 2
  --------      --------
  A             B
  ABCD          EFGH
  SPACES        Instead of Tabs

Even with multi-byte UTF8 characters:

  Column 1      Column 2
  --------      --------
  Ä             B
  åäö           100
  A Møøse       once bit my sister..

Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
---

This seems to work for me, and while there is some cost, it's minimal. 
Doing a "git log > /dev/null" of the current git tree is about 1% slower 
because of the tab-finding. In a tree with a lot of tabs in the commit 
messages the slowdown would be more noticeable, because then you 
actually end up hitting the whole "how wide is this" issue.

(But if the tabs are all at the beginning of a line, you'd still be ok 
and avoid the utf8 width calculations.)
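
If you want to play with the padding arithmetic outside of git, here is 
a minimal standalone sketch. It is not the code from the patch: 
detab_ascii() is a made-up helper, and it assumes plain ASCII so that 
one byte is one column, which is exactly the assumption the real code 
cannot make.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: expand tabs in
 * "line" to 8-column tab stops and write a NUL-terminated
 * result into "out".
 *
 * Assumption: the input is plain ASCII, so the byte offset in
 * "out" is also the display column.
 *
 * Returns the output length, or -1 if "out" is too small.
 */
static int detab_ascii(const char *line, size_t linelen,
                       char *out, size_t outsize)
{
        size_t col = 0, i;

        if (!outsize)
                return -1;

        for (i = 0; i < linelen; i++) {
                /* A tab advances by 1..8 columns, never by 0 */
                size_t n = (line[i] == '\t') ? 8 - (col & 7) : 1;

                /* Keep one byte free for the trailing NUL */
                if (col + n >= outsize)
                        return -1;

                if (line[i] == '\t')
                        memset(out + col, ' ', n);
                else
                        out[col] = line[i];
                col += n;
        }
        out[col] = '\0';
        return (int)col;
}
```

Calling it on "A\tB" gives "A", seven spaces, then "B". The patch 
measures the width since the previous tab rather than since the start 
of the line, which comes out the same because every tab stop is a 
multiple of 8.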

Comments?

 pretty.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 2 deletions(-)

diff --git a/pretty.c b/pretty.c
index 92b2870a7eab..0b40457f99f0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1629,6 +1629,76 @@ void pp_title_line(struct pretty_print_context *pp,
        strbuf_release(&title);
 }
 
+static int pp_utf8_width(const char *start, const char *end)
+{
+       int width = 0;
+       size_t remain = end - start;
+
+       while (remain) {
+               int n = utf8_width(&start, &remain);
+               if (n < 0 || !start)
+                       return -1;
+               width += n;
+       }
+       return width;
+}
+
+/*
+ * pp_handle_indent() prints out the indentation, and
+ * perhaps the whole line (without the final newline)
+ *
+ * Why "perhaps"? If there are tabs in the indented line
+ * it will print it out in order to de-tabify the line.
+ *
+ * But if there are no tabs, we just fall back on the
+ * normal "print the whole line".
+ */
+static int pp_handle_indent(struct strbuf *sb, int indent,
+                            const char *line, int linelen)
+{
+       const char *tab;
+
+       strbuf_addchars(sb, ' ', indent);
+
+       tab = memchr(line, '\t', linelen);
+       if (!tab)
+               return 0;
+
+       do {
+               int width = pp_utf8_width(line, tab);
+
+               /*
+                * If it wasn't well-formed utf8, or it
+                * had characters with badly defined
+                * width (control characters etc), just
+                * give up on trying to align things.
+                */
+               if (width < 0)
+                       break;
+
+               /* Output the data .. */
+               strbuf_add(sb, line, tab - line);
+
+               /* .. and the de-tabified tab */
+               strbuf_addchars(sb, ' ', 8-(width & 7));
+
+               /* Skip over the printed part .. */
+               linelen -= 1+tab-line;
+               line = tab + 1;
+
+               /* .. and look for the next tab */
+               tab = memchr(line, '\t', linelen);
+       } while (tab);
+
+       /*
+        * Print out everything after the last tab without
+        * worrying about width - there's nothing more to
+        * align.
+        */
+       strbuf_add(sb, line, linelen);
+       return 1;
+}
+
 void pp_remainder(struct pretty_print_context *pp,
                  const char **msg_p,
                  struct strbuf *sb,
@@ -1652,8 +1722,10 @@ void pp_remainder(struct pretty_print_context *pp,
                first = 0;
 
                strbuf_grow(sb, linelen + indent + 20);
-               if (indent)
-                       strbuf_addchars(sb, ' ', indent);
+               if (indent) {
+                       if (pp_handle_indent(sb, indent, line, linelen))
+                               linelen = 0;
+               }
                strbuf_add(sb, line, linelen);
                strbuf_addch(sb, '\n');
        }
-- 
2.8.0.rc2