Re: [PATCH] am: match --signoff to the original scripted version

2015-09-08 Thread Jeff King
On Sat, Sep 05, 2015 at 09:56:27PM -0700, Junio C Hamano wrote:

> +static void am_signoff(struct strbuf *sb)
> +{
> + char *cp;
> + struct strbuf mine = STRBUF_INIT;
> +
> + /* Does it end with our own sign-off? */
> + strbuf_addf(, "\n%s%s\n",
> + sign_off_header,
> + fmt_name(getenv("GIT_COMMITTER_NAME"),
> +  getenv("GIT_COMMITTER_EMAIL")));
> + if (mine.len < sb->len &&
> + !strcmp(mine.buf, sb->buf + sb->len - mine.len))
> + goto exit; /* no need to duplicate */

Here you insert the "\n" directly at the start of "mine", so the test
"does it contain S-o-b at the beginning of a line" does not count the
first line. Probably fine, as somebody putting a S-o-b in their subject
deserves whatever they get.

But...

> + /* Does it have any Signed-off-by: in the text */
> + for (cp = sb->buf;
> +  cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
> +  cp = strchr(cp, '\n')) {
> + if (sb->buf == cp || cp[-1] == '\n')
> + break;
> + }

Here you are more careful about finding S-o-b at sb->buf.

I don't think it really matters in practice, but it's an interesting
inconsistency.

Other than that (and I do not think it is worth re-rolling for this;
just an interesting observation), the patch looks OK to me.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-08 Thread Jeff King
On Sun, Sep 06, 2015 at 10:24:12AM -0700, Junio C Hamano wrote:

> >> +   /* Does it end with our own sign-off? */
> >> +   strbuf_addf(, "\n%s%s\n",
> >> +   sign_off_header,
> >> +   fmt_name(getenv("GIT_COMMITTER_NAME"),
> >> +getenv("GIT_COMMITTER_EMAIL")));
> >
> > Maybe use git_committer_info() here?
> 
> Perhaps, but I wanted to make sure I am doing the same thing as the
> codepath of sequencer.c::append_signoff(), which the original ended
> up calling.  git_committer_info() does way more than that, no?

Not really.  I think git_committer_info(IDENT_STRICT | IDENT_NO_DATE)
runs the exact same code, with one exception: we would also set the
ident_explicitly_given variables. But nobody in builtin/am.c calls
committer_ident_sufficiently_given(). And if they did, I think the
change would be an improvement.

> >> +   if (mine.len < sb->len &&
> >> +   !strcmp(mine.buf, sb->buf + sb->len - mine.len))
> >
> > Perhaps use ends_with()?
> 
> This caller already _knows_ how long the sb->buf string is; it is
> pointless to have ends_with() run strlen() on it.

That actually goes double. We know sb.len. The ends_with() function is
built around strip_suffix(), which both operate on strings.  But we do
not have ends_with_mem() built around strip_suffix_mem().

But we also know mine.len. Even strip_suffix_mem() assumes the suffix
itself is a string. So what you really want is:

  int strip_suffix_mem_mem(const char *buf, size_t *len,
   const char *suffix, size_t suffix_len);

and then you can trivially build the existing strip_suffix_mem() around
it, build strip_suffix() around that, and then build ends_with(),
ends_with_mem() and ends_with_mem_mem() around those. And don't forget
strbuf_ends_with(), strbuf_ends_with_mem(), and strbuf_ends_with_strbuf()

:) I am only half tongue in cheek. The proliferation of names is tedious
(and not appropriate for an -rc regression fix), but I do think the
resulting code is a lot more obvious as:

  if (strbuf_ends_with_strbuf(, ))
  ...

or even:

  if (ends_with_mem_mem(sb->buf, sb->len, mine.buf, mine.len))
  ...

Of course given that this is run only once per program, and that these
_are_ in fact strings, we can probably not bother to optimize it and
just accept:

  if (ends_with(sb->buf, mine.buf))
  ...

But if you want to go all-out on optimization, I think you can replace
your strcmp with memcmp:

  if (mine.len < sb->len &&
  !memcmp(mine.buf, sb->buf + sb->len - mine.len, mine.len))

(assuming that memcmp is in fact faster than strcmp).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-08 Thread Junio C Hamano
Jeff King  writes:

> On Sat, Sep 05, 2015 at 09:56:27PM -0700, Junio C Hamano wrote:
>
>> +static void am_signoff(struct strbuf *sb)
>> +{
>> +char *cp;
>> +struct strbuf mine = STRBUF_INIT;
>> +
>> +/* Does it end with our own sign-off? */
>> +strbuf_addf(, "\n%s%s\n",
>> +sign_off_header,
>> +fmt_name(getenv("GIT_COMMITTER_NAME"),
>> + getenv("GIT_COMMITTER_EMAIL")));
>> +if (mine.len < sb->len &&
>> +!strcmp(mine.buf, sb->buf + sb->len - mine.len))
>> +goto exit; /* no need to duplicate */
>
> Here you insert the "\n" directly at the start of "mine", so the test
> "does it contain S-o-b at the beginning of a line" does not count the
> first line. Probably fine, as somebody putting a S-o-b in their subject
> deserves whatever they get.

Yup.

>> +/* Does it have any Signed-off-by: in the text */
>> +for (cp = sb->buf;
>> + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
>> + cp = strchr(cp, '\n')) {
>> +if (sb->buf == cp || cp[-1] == '\n')
>> +break;
>> +}
>
> Here you are more careful about finding S-o-b at sb->buf.

It is not being careful for that, actually.  It just is avoiding to
access sb->buf[-1], which would be a realproblem.  "The beginning of
buffer is also at the beginning of a line" is merely a happy side
effect ;-).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-06 Thread Paul Tan
Hi,

Thanks for handling this.

On Sun, Sep 6, 2015 at 12:56 PM, Junio C Hamano  wrote:
> Linus noticed that the recently reimplementated "git am -s" defines

s/reimplementated/reimplemented/ ?

> the trailer block too rigidly, resulting an unnecessary blank line

s/resulting an/resulting in an/ ?

> between the existing sign-offs and his new sign-off.  An e-mail
> submission sent to Linus in real life ends with mixture of sign-offs
> and commentaries, e.g.
>
> title here
>
> message here
>
> Signed-off-by: Original Author 
> [rv: tweaked frotz and nitfol]
> Signed-off-by: Re Viewer 
> Signed-off-by: Other Reviewer 
> ---
> patch here
>
> Because the reimplementation reused append_signoff() helper that is
> used by other codepaths, which is unaware that people intermix such
> comments with their sign-offs in the trailer block, such a message
> was judged to end with a non-trailer, resulting in an extra blank

s/extra blank/extra blank line/ ?

> before adding a new sign-off.
>
> The original scripted version of "git am" used a lot looser
> definition, i.e. "if and only if there is no line that begins with
> Signed-off-by:, add a blank line before adding a new sign-off".  For
> the upcoming release, stop using the append_signoff() in "git am"
> and reimplement the looser definition used by the scripted version
> to use only in "git am" to fix this regression in "am" while
> avoiding new regressions to other users of append_signoff().
>
> In the longer term, we should look into loosening append_signoff()
> so that other codepaths that add a new sign-off behave the same way
> as "git am -s", but that is a task for post-release.
>
> Reported-by: Linus Torvalds 
> Signed-off-by: Junio C Hamano 
> ---
>  builtin/am.c  | 31 +--
>  t/t4150-am.sh | 48 
>  2 files changed, 77 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/am.c b/builtin/am.c
> index 634f7a7..e7828e5 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct 
> am_state *state)
> exit(128);
>  }
>
> +static void am_signoff(struct strbuf *sb)
> +{

Hmm, okay, but now we have two similarly named functions am_signoff()
and am_append_signoff() which both do nearly similar things, the only
difference being am_signoff() operates on a strbuf while
am_append_signoff() operates on the "msg" char* field in the am_state,
which seems a bit iffy to me. I wonder if the logic could be
implemented in am_append_signoff() instead so we have only one
function?

> +   char *cp;
> +   struct strbuf mine = STRBUF_INIT;
> +
> +   /* Does it end with our own sign-off? */
> +   strbuf_addf(, "\n%s%s\n",
> +   sign_off_header,
> +   fmt_name(getenv("GIT_COMMITTER_NAME"),
> +getenv("GIT_COMMITTER_EMAIL")));

Maybe use git_committer_info() here?

> +   if (mine.len < sb->len &&
> +   !strcmp(mine.buf, sb->buf + sb->len - mine.len))

Perhaps use ends_with()?

> +   goto exit; /* no need to duplicate */
> +
> +   /* Does it have any Signed-off-by: in the text */
> +   for (cp = sb->buf;
> +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
> +cp = strchr(cp, '\n')) {
> +   if (sb->buf == cp || cp[-1] == '\n')
> +   break;
> +   }
> +
> +   strbuf_addstr(sb, mine.buf + !!cp);
> +exit:
> +   strbuf_release();
> +}
> +
>  /**
>   * Appends signoff to the "msg" field of the am_state.
>   */
> @@ -1199,7 +1226,7 @@ static void am_append_signoff(struct am_state *state)
> struct strbuf sb = STRBUF_INIT;
>
> strbuf_attach(, state->msg, state->msg_len, state->msg_len);
> -   append_signoff(, 0, 0);
> +   am_signoff();
> state->msg = strbuf_detach(, >msg_len);
>  }
>
> @@ -1303,7 +1330,7 @@ static int parse_mail(struct am_state *state, const 
> char *mail)
> stripspace(, 0);
>
> if (state->signoff)
> -   append_signoff(, 0, 0);
> +   am_signoff();
>
> assert(!state->author_name);
> state->author_name = strbuf_detach(_name, NULL);

Thanks again,
Paul
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-06 Thread Paul Tan
On Sun, Sep 6, 2015 at 12:56 PM, Junio C Hamano  wrote:
> diff --git a/builtin/am.c b/builtin/am.c
> index 634f7a7..e7828e5 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct 
> am_state *state)
> exit(128);
>  }
>
> +static void am_signoff(struct strbuf *sb)
> +{
> +   char *cp;
> +   struct strbuf mine = STRBUF_INIT;
> +
> +   /* Does it end with our own sign-off? */
> +   strbuf_addf(, "\n%s%s\n",
> +   sign_off_header,
> +   fmt_name(getenv("GIT_COMMITTER_NAME"),
> +getenv("GIT_COMMITTER_EMAIL")));
> +   if (mine.len < sb->len &&
> +   !strcmp(mine.buf, sb->buf + sb->len - mine.len))
> +   goto exit; /* no need to duplicate */
> +
> +   /* Does it have any Signed-off-by: in the text */
> +   for (cp = sb->buf;
> +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
> +cp = strchr(cp, '\n')) {
> +   if (sb->buf == cp || cp[-1] == '\n')
> +   break;
> +   }
> +
> +   strbuf_addstr(sb, mine.buf + !!cp);

To add on, I wonder if the above "add a blank line if there is no
Signed-off-by: at the beginning of a line" logic could be expressed
more succinctly like this:

if (!starts_with(sb->buf, "Signed-off-by: ") &&
!strstr(sb->buf, "\nSigned-off-by: "))
strbuf_addch(sb, '\n');

strbuf_addstr(sb, mine.buf + 1);

> +exit:
> +   strbuf_release();
> +}
> +
>  /**
>   * Appends signoff to the "msg" field of the am_state.
>   */

Regards,
Paul
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-06 Thread Junio C Hamano
Paul Tan  writes:

>> +   /* Does it have any Signed-off-by: in the text */
>> +   for (cp = sb->buf;
>> +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
>> +cp = strchr(cp, '\n')) {
>> +   if (sb->buf == cp || cp[-1] == '\n')
>> +   break;
>> +   }
>> +
>> +   strbuf_addstr(sb, mine.buf + !!cp);
>
> To add on, I wonder if the above "add a blank line if there is no
> Signed-off-by: at the beginning of a line" logic could be expressed
> more succinctly like this:
>
> if (!starts_with(sb->buf, "Signed-off-by: ") &&
> !strstr(sb->buf, "\nSigned-off-by: "))
> strbuf_addch(sb, '\n');
>
> strbuf_addstr(sb, mine.buf + 1);

That may be more obvious, but I wanted to reuse the existing
sign_off_header[]; there is no variant that has a leading "end of
previous line".

I actually think it is not an uncommon thing to ask "Give me the
first occurrence of the string NEEDLE that appears at the beginning
of lines in BUF", so the right way to address the "is it more
readable" question would probably be to have a helper function to do
just that, and use it here and other places that answer that
question by themselves due to lack of such a helper.



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-06 Thread Junio C Hamano
Paul Tan  writes:

> s/reimplementated/reimplemented/ ?
> s/resulting an/resulting in an/ ?
> s/extra blank/extra blank line/ ?

Thanks.

>> +static void am_signoff(struct strbuf *sb)
>> +{
>
> Hmm, okay, but now we have two similarly named functions am_signoff()
> and am_append_signoff() which both do nearly similar things, the only
> difference being am_signoff() operates on a strbuf while
> am_append_signoff() operates on the "msg" char* field in the am_state,
> which seems a bit iffy to me.

I do recall advising you not to overuse strbuf in am_state, and
specifically not to use strbuf for a field like author_name, and the
criteria to tell why a field should not be a strbuf in am_state is
if the code used its strbuf-ness only because it is initially read
with strbuf_read_file(), but afterwards it is necessary to use the
field as a strbuf because the field is not modified later and is not
passed to a helper function that takes a strbuf.

I think the final code ended up following that piece of advice a bit
too aggressively.  The  pair in am_state did need to
be modified with sign-off, and it was done by passing it as a strbuf
to append_signoff() in the code we are fixing here; keeping msg
field that you wrote as a strbuf in the original would have made
am_append_signoff() unnecessary.

But this patch you are commenting on is purely for regression fix,
and I didn't want to change other things like data representation at
the same time.

>> +   /* Does it end with our own sign-off? */
>> +   strbuf_addf(, "\n%s%s\n",
>> +   sign_off_header,
>> +   fmt_name(getenv("GIT_COMMITTER_NAME"),
>> +getenv("GIT_COMMITTER_EMAIL")));
>
> Maybe use git_committer_info() here?

Perhaps, but I wanted to make sure I am doing the same thing as the
codepath of sequencer.c::append_signoff(), which the original ended
up calling.  git_committer_info() does way more than that, no?

>> +   if (mine.len < sb->len &&
>> +   !strcmp(mine.buf, sb->buf + sb->len - mine.len))
>
> Perhaps use ends_with()?

This caller already _knows_ how long the sb->buf string is; it is
pointless to have ends_with() run strlen() on it.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] am: match --signoff to the original scripted version

2015-09-06 Thread Linus Torvalds
On Sat, Sep 5, 2015 at 9:56 PM, Junio C Hamano  wrote:
>
>  For
> the upcoming release, stop using the append_signoff() in "git am"
> and reimplement the looser definition used by the scripted version
> to use only in "git am" to fix this regression in "am" while
> avoiding new regressions to other users of append_signoff().

I've tested this by re-doing the same patch-bomb from Andrew that I
had problems with originally, and it WorksForMe(tm).

Thanks,

Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] am: match --signoff to the original scripted version

2015-09-05 Thread Junio C Hamano
Linus noticed that the recently reimplementated "git am -s" defines
the trailer block too rigidly, resulting an unnecessary blank line
between the existing sign-offs and his new sign-off.  An e-mail
submission sent to Linus in real life ends with mixture of sign-offs
and commentaries, e.g.

title here

message here

Signed-off-by: Original Author 
[rv: tweaked frotz and nitfol]
Signed-off-by: Re Viewer 
Signed-off-by: Other Reviewer 
---
patch here

Because the reimplementation reused append_signoff() helper that is
used by other codepaths, which is unaware that people intermix such
comments with their sign-offs in the trailer block, such a message
was judged to end with a non-trailer, resulting in an extra blank
before adding a new sign-off.

The original scripted version of "git am" used a lot looser
definition, i.e. "if and only if there is no line that begins with
Signed-off-by:, add a blank line before adding a new sign-off".  For
the upcoming release, stop using the append_signoff() in "git am"
and reimplement the looser definition used by the scripted version
to use only in "git am" to fix this regression in "am" while
avoiding new regressions to other users of append_signoff().

In the longer term, we should look into loosening append_signoff()
so that other codepaths that add a new sign-off behave the same way
as "git am -s", but that is a task for post-release.

Reported-by: Linus Torvalds 
Signed-off-by: Junio C Hamano 
---

 * So this is essentially that "third try", but with a bit of test
   to cast in stone that we expect things that are not "key: value"
   in the trailer block.  Let's have this (or a fix along this line)
   as a regression fix for the upcoming release.

 builtin/am.c  | 31 +--
 t/t4150-am.sh | 48 
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 634f7a7..e7828e5 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct 
am_state *state)
exit(128);
 }
 
+static void am_signoff(struct strbuf *sb)
+{
+   char *cp;
+   struct strbuf mine = STRBUF_INIT;
+
+   /* Does it end with our own sign-off? */
+   strbuf_addf(, "\n%s%s\n",
+   sign_off_header,
+   fmt_name(getenv("GIT_COMMITTER_NAME"),
+getenv("GIT_COMMITTER_EMAIL")));
+   if (mine.len < sb->len &&
+   !strcmp(mine.buf, sb->buf + sb->len - mine.len))
+   goto exit; /* no need to duplicate */
+
+   /* Does it have any Signed-off-by: in the text */
+   for (cp = sb->buf;
+cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
+cp = strchr(cp, '\n')) {
+   if (sb->buf == cp || cp[-1] == '\n')
+   break;
+   }
+
+   strbuf_addstr(sb, mine.buf + !!cp);
+exit:
+   strbuf_release();
+}
+
 /**
  * Appends signoff to the "msg" field of the am_state.
  */
@@ -1199,7 +1226,7 @@ static void am_append_signoff(struct am_state *state)
struct strbuf sb = STRBUF_INIT;
 
strbuf_attach(, state->msg, state->msg_len, state->msg_len);
-   append_signoff(, 0, 0);
+   am_signoff();
state->msg = strbuf_detach(, >msg_len);
 }
 
@@ -1303,7 +1330,7 @@ static int parse_mail(struct am_state *state, const char 
*mail)
stripspace(, 0);
 
if (state->signoff)
-   append_signoff(, 0, 0);
+   am_signoff();
 
assert(!state->author_name);
state->author_name = strbuf_detach(_name, NULL);
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index dd627c4..5e48555 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -873,4 +873,52 @@ test_expect_success 'am --message-id -s signs off after 
the message id' '
test_cmp expected actual
 '
 
+test_expect_success 'am -s unexpected trailer block' '
+   rm -fr .git/rebase-apply &&
+   git reset --hard &&
+   echo signed >file &&
+   git add file &&
+   cat >msg <<-EOF &&
+   subject here
+
+   Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>
+   [jc: tweaked log message]
+   Signed-off-by: J C H 
+   EOF
+   git commit -F msg &&
+   git cat-file commit HEAD | sed -e '1,/^$/d' >original &&
+   git format-patch --stdout -1 >patch &&
+
+   git reset --hard HEAD^ &&
+   git am -s patch &&
+   (
+   cat original &&
+   echo "Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+   ) >expect &&
+   git cat-file commit HEAD | sed -e '1,/^$/d' >actual &&
+   test_cmp expect actual &&
+
+   cat >msg <<-\EOF &&
+   subject here
+
+   We make sure that there is a blank line between the log
+