Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
Marc Branchaudwrites: > On 16-03-15 09:02 PM, Stefan Beller wrote: >> On Tue, Mar 15, 2016 at 6:00 PM, Stefan Beller wrote: >>> >>> Instead of converting to whitespaces in Git, we could make use of the >>> set_tabs capability for ttys and setup the terminal for having tabs align >>> to 12,+8,+8,+8... >> >> Or rather read in the existing tabs configuration and shift it by a constant. > > Could this also help with diff output, where the leading + or - mars the > indentation in a similar way? Hardly: the same pager would be used to display the commit message (with a 4-space prefix) and the diff (with a 1-char +/- prefix). So the offsets in these two places would need to be different, and I don't think we can change that on the fly. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Wed, Mar 16, 2016 at 7:27 AM, Marc Branchaudwrote: > > Could this also help with diff output, where the leading + or - mars the > indentation in a similar way? I don't think that's a good idea at least by default, since then it will break things like running diff in emacs capture mode. So even if you're in a terminal, you can't assume that you can munge the output too much. Of course, if colorization is on, you might as well pretty-print the diff by indenting things properly too, since the end result isn't going to be used as a _diff_. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Wed, Mar 16, 2016 at 9:27 PM, Marc Branchaudwrote: > On 16-03-15 09:02 PM, Stefan Beller wrote: >> On Tue, Mar 15, 2016 at 6:00 PM, Stefan Beller wrote: >>> >>> Instead of converting to whitespaces in Git, we could make use of the >>> set_tabs capability for ttys and setup the terminal for having tabs align >>> to 12,+8,+8,+8... >> >> Or rather read in the existing tabs configuration and shift it by a constant. > > Could this also help with diff output, where the leading + or - mars the > indentation in a similar way? Yes, please! Even if we only activate this when stdout is tty. Much better output. > That opens a bit of a deeper problem, because not all the files in a single > repo necessarily use the same tab size. Maybe we can pass --tab-width=xx to git? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On 16-03-15 09:02 PM, Stefan Beller wrote: > On Tue, Mar 15, 2016 at 6:00 PM, Stefan Bellerwrote: >> >> Instead of converting to whitespaces in Git, we could make use of the >> set_tabs capability for ttys and setup the terminal for having tabs align >> to 12,+8,+8,+8... > > Or rather read in the existing tabs configuration and shift it by a constant. Could this also help with diff output, where the leading + or - mars the indentation in a similar way? That opens a bit of a deeper problem, because not all the files in a single repo necessarily use the same tab size. M. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
Linus Torvaldswrites: > On Wed, Mar 16, 2016 at 7:27 AM, Marc Branchaud wrote: >> >> Could this also help with diff output, where the leading + or - mars the >> indentation in a similar way? > > I don't think that's a good idea at least by default, since then it > will break things like running diff in emacs capture mode. > > So even if you're in a terminal, you can't assume that you can munge > the output too much. > > Of course, if colorization is on, you might as well pretty-print the > diff by indenting things properly too, since the end result isn't > going to be used as a _diff_. I agree that I will never use such an end result as a diff, but I may still be tempted to cut-and-paste individual lines after '^+' when resurrecting a WIP topic that does not rebase very well, so I agree with you that the output shouldn't be munged by default even though I think it is OK to have an option.. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 09:57:16PM -0700, Linus Torvalds wrote: > On Tue, Mar 15, 2016 at 9:46 PM, Junio C Hamanowrote: > > > > It also ignores that byte counts of non-HT bytes may not necessarily > > match display columns. There is utf8_width() function exported from > > utf8.c for that purpose. > > Hmm. I did that to now make it horribly slower. Doing the > per-character widths really does horrible things for performance. > > That said, I think I can make it do the fast thing for lines that have > no tabs at all, which is the big bulk of it. So it would do the width > calculations only in the rare "yes, I found a tab" case. > > I already wrote it in a way that non-tab lines end up just looping > over the line looking for a tab and then fall back to the old code. > > I might just have to make that behavior a bit more explicit. Let me > stare at it a bit, but it won't happen until tomorrow, I think. I wondered about performance when reading yours, and measured a straight "git log >/dev/null" on the kernel at about 3% slower (best-of-five and average). We can get most of that back by sticking a memchr(line, '\t', linelen); in front and skipping the loop if it returns NULL. You can also implement your loop in terms of memchr() calls to skip to the next tab, but I don't know if it's worth it. It's really only worth spending time optimizing the non-tab case (and even then, only worth it for trivial things like "use the already optimized-as-hell memchr"). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 9:46 PM, Junio C Hamanowrote: > > It also ignores that byte counts of non-HT bytes may not necessarily > match display columns. There is utf8_width() function exported from > utf8.c for that purpose. Hmm. I did that to now make it horribly slower. Doing the per-character widths really does horrible things for performance. That said, I think I can make it do the fast thing for lines that have no tabs at all, which is the big bulk of it. So it would do the width calculations only in the rare "yes, I found a tab" case. I already wrote it in a way that non-tab lines end up just looping over the line looking for a tab and then fall back to the old code. I might just have to make that behavior a bit more explicit. Let me stare at it a bit, but it won't happen until tomorrow, I think. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
Linus Torvaldswrites: > Here's a first try at it. It does tab expansion only for the cases > that indent the commit message, so for things like "pretty=email" it > makes no difference at all. It also ignores that byte counts of non-HT bytes may not necessarily match display columns. There is utf8_width() function exported from utf8.c for that purpose. I think it is fine to make it default for the pretty=medium, but it would be nicer to introduce a new command line option --no-untabify to optionally disable the expansion. > pretty.c | 20 +++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/pretty.c b/pretty.c > index 92b2870a7eab..dcd6105d1eb2 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1652,8 +1652,26 @@ void pp_remainder(struct pretty_print_context *pp, > first = 0; > > strbuf_grow(sb, linelen + indent + 20); > - if (indent) > + if (indent) { > + int i, last = 0, pos = 0; > + > strbuf_addchars(sb, ' ', indent); > + for (i = 0; i < linelen; i++) { > + int expand; > + if (line[i] != '\t') > + continue; > + strbuf_add(sb, line+last, i-last); > + pos += i-last; > + expand = (pos + 8) & ~7; > + strbuf_addchars(sb, ' ', expand - pos); > + pos = expand; > + last = i+1; > + } > + > + // Handle the tail non-tab content > + line += last; > + linelen -= last; > + } > strbuf_add(sb, line, linelen); > strbuf_addch(sb, '\n'); > } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 5:48 PM, Linus Torvaldswrote: > On Tue, Mar 15, 2016 at 5:45 PM, Junio C Hamano wrote: >> >> Wouldn't it be nicer to do this kind of thing at the output side? A >> stupid way would be to have an option to indent the log text with a >> tab instead of 4-spaces, but a more sensible way would be to keep >> the visual 4-space-indent and do the expand-tab for each line of >> text. > > Yes, that would certainly work for me too. It's just the "ascii boxes > don't line up" that is problematic.. Ok, that actually turns out to be pretty easy. Here's a first try at it. It does tab expansion only for the cases that indent the commit message, so for things like "pretty=email" it makes no difference at all. However, it hardcodes the hardtab size to 8, and makes this all unconditional. That's how a vt100 terminal works, but fancer terminals may allow you set actual tab-stops, and if you're in emacs you can probably do just about anything) Comments? This does make the kernel commit 0dc8c730c98a look fine, so it would make me happy. I can write a commit log etc if people think this is ok, but right now this is just a silly raw patch in the attachement.. Linus pretty.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/pretty.c b/pretty.c index 92b2870a7eab..dcd6105d1eb2 100644 --- a/pretty.c +++ b/pretty.c @@ -1652,8 +1652,26 @@ void pp_remainder(struct pretty_print_context *pp, first = 0; strbuf_grow(sb, linelen + indent + 20); - if (indent) + if (indent) { + int i, last = 0, pos = 0; + strbuf_addchars(sb, ' ', indent); + for (i = 0; i < linelen; i++) { + int expand; + if (line[i] != '\t') + continue; + strbuf_add(sb, line+last, i-last); + pos += i-last; + expand = (pos + 8) & ~7; + strbuf_addchars(sb, ' ', expand - pos); + pos = expand; + last = i+1; + } + + // Handle the tail non-tab content + line += last; + linelen -= last; + } strbuf_add(sb, line, linelen); strbuf_addch(sb, '\n'); }
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 6:00 PM, Stefan Bellerwrote: > > Instead of converting to whitespaces in Git, we could make use of the > set_tabs capability for ttys and setup the terminal for having tabs align > to 12,+8,+8,+8... Or rather read in the existing tabs configuration and shift it by a constant. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 5:48 PM, Linus Torvaldswrote: > On Tue, Mar 15, 2016 at 5:45 PM, Junio C Hamano wrote: >> >> Wouldn't it be nicer to do this kind of thing at the output side? A >> stupid way would be to have an option to indent the log text with a >> tab instead of 4-spaces, but a more sensible way would be to keep >> the visual 4-space-indent and do the expand-tab for each line of >> text. > > Yes, that would certainly work for me too. It's just the "ascii boxes > don't line up" that is problematic.. I would also rather side to correct the display side rather than the applying side. [I still want to send and apply patches with git written in the white space language ;] Instead of converting to whitespaces in Git, we could make use of the set_tabs capability for ttys and setup the terminal for having tabs align to 12,+8,+8,+8... such that it looks like an indented terminal. That way we would also preserve the tabs in case you'd want to copy the message as is. Thanks, Stefan > > (Also people including example source code etc, where the first tab > looks wildly different from the second one) > > Linus > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 5:45 PM, Junio C Hamanowrote: > > Wouldn't it be nicer to do this kind of thing at the output side? A > stupid way would be to have an option to indent the log text with a > tab instead of 4-spaces, but a more sensible way would be to keep > the visual 4-space-indent and do the expand-tab for each line of > text. Yes, that would certainly work for me too. It's just the "ascii boxes don't line up" that is problematic.. (Also people including example source code etc, where the first tab looks wildly different from the second one) Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
Linus Torvaldswrites: > Do people hate that idea? I may not get around to it for a while (it's > the kernel merge window right now), but I can write the patch > eventually - I just wanted to do an RFC first. Wouldn't it be nicer to do this kind of thing at the output side? A stupid way would be to have an option to indent the log text with a tab instead of 4-spaces, but a more sensible way would be to keep the visual 4-space-indent and do the expand-tab for each line of text. That way, your viewing of existing commits that use 8-space HT to align (and worse yet, mixture of 8-space HT and 8 spaces and assume the end result would align in the output) would become more pleasant without you having to run filter-branch ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 5:23 PM, Stefan Bellerwrote: > > Could you point at some example to better understand the problem? So in the kernel repo, I just randomly looked for tabs that show this problem, and take for example commit ff9a9b4c4334b53b52ee9279f30bd5dd92ea9bdd. You can see how the original lined up, by doing git show --pretty=email ff9a9b4c4334 because the email format doesn't indent the commit message. But if you do just git show ff9a9b4c4334 and get the usual indentation, you'll see things not line up at all. In case you don't want to bother with the kernel repo, here's what it looks like: email format: --- snip snip 8< --- A microbenchmark calling an invalid syscall number 10 million times in a row speeds up an additional 30% over the numbers with just the previous patches, for a total speedup of about 40% over 4.4 and 4.5-rc1. Run times for the microbenchmark: 4.43.8 seconds 4.5-rc13.7 seconds 4.5-rc1 + first patch 3.3 seconds 4.5-rc1 + first 3 patches 3.1 seconds 4.5-rc1 + all patches 2.3 seconds A non-NOHZ_FULL cpu (not the housekeeping CPU): all kernels1.86 seconds --- snip snip 8< --- Normal "git show" format: --- snip snip 8< --- A microbenchmark calling an invalid syscall number 10 million times in a row speeds up an additional 30% over the numbers with just the previous patches, for a total speedup of about 40% over 4.4 and 4.5-rc1. Run times for the microbenchmark: 4.43.8 seconds 4.5-rc13.7 seconds 4.5-rc1 + first patch 3.3 seconds 4.5-rc1 + first 3 patches 3.1 seconds 4.5-rc1 + all patches 2.3 seconds A non-NOHZ_FULL cpu (not the housekeeping CPU): all kernels1.86 seconds --- snip snip 8< --- which hopefully clarifies. In the above case, it really isn't very annoying. It's just slightly ugly. In some other cases, it can get quite hard to see what's up, but the ones that come through me I actually tend to try to edit, so many of them have been corrected. For other examples (again, in the kernel), look at 19b2c30d3cce, or 0dc8c730c98a. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On March 15, 2016 8:17 PM Linus Torvalds wrote: > So I end up doing this manually when I notice, but I was wondering ig maybe > git could just have an option to "git am" and friends to de-tabify the commit > message. > > It's particularly noticeable when people line things up using tabs (for the > kernel, it's often things like "cpu1 does X, cpu2 does Y"), and then when you > do "git log" it looks like a unholy mess, because the 4-char indentation of > the > log message ends up causing those things to not line up at all after all. > > The natural thing to do would be to pass in a "tab size" parameter to > strbuf_stripspace(), and default it to 0 (for no change), but have some way to > let people say "expand tabs to spaces at 8-character tab-stops" or similar > (but let people use different tab-stops if they want). > > Do people hate that idea? I may not get around to it for a while (it's the > kernel merge window right now), but I can write the patch eventually - I just > wanted to do an RFC first. Speaking partly as a consumer of the comments and partly as someone who generates the commits through APIs, I would ask that the commit tab handling semantic be more formalized than just tab size to strbuf_stripspace(). While it might seem a bit unfair to have to worry about non-git git clients, the detabbing can impact the other commit implementers (e.g., SourceTree, EGit, JGit, and the raft of process automation bits out there using JGit for cool stuff). Personally, I would prefer to have a normalized behaviour so that any bit of automation building a commit message would have a specific definition to go to (and hopefully comply with) in order to properly format the message for posterity and across all consumers. It might also be useful to have some ability to be presentation-compatible with legacy commits (done after this type of enhancement) so that a reasonable presentation can be done for those 8 year old commits that still have embedded tabs. Personally, I don't encourage tabs in commits myself and do see the value of this, but is this really restricted just to git am? Just my $0.02, Randall -- Brief whoami: NonStop developer since approximately UNIX(421664400)/NonStop(2112884442) -- In my real life, I talk too much. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tabs in commit messages - de-tabify option in strbuf_stripspace()?
On Tue, Mar 15, 2016 at 5:16 PM, Linus Torvaldswrote: > Do people hate that idea? I may not get around to it for a while (it's > the kernel merge window right now), but I can write the patch > eventually - I just wanted to do an RFC first. Could you point at some example to better understand the problem? Thanks, Stefan > > Linus > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html