Re: More builtin git-am issues..

2015-09-07 Thread Christian Couder
On Sat, Sep 5, 2015 at 9:39 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> To salvage "interpret-trailers" needs a lot more, as we are
>> realizing that the definition that led to its external design does
>> not match the way users use footers in the real world.  This affects
>> the internal data representation and the whole thing needs to be
>> rethought.
>
> Note that I am not saying that you personally did any bad job while
> working on the interpret-trailers topic.  We collectively designed
> its feature based on a much narrower definition of what the trailer
> block is than what is used in the real world in practice, so we do
> not have a good way to locate an existing entry that is not a (key,
> value), no matter what the syntax to denote which is key and which
> is value is, insert a new entry that is not a (key, value), or
> remove an existing entry that is not a (key, value), all of which
> will be necessary to mutate trailer blocks people use in the real
> life.
>
> I think a good way forward would be to go like this:
>
>  * a helper function that starts from a flat  (or a
>strbuf) and identifies the end of the body of the message,
>i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines
>backwards.  That is what ignore_non_trailer() in commit.c does,
>and that can be shared across everybody that mutates a log
>message.
>
>  * a helper function that takes the result of the above as a flat
> (or a strbuf) and identifies the beginning of a
>trailer block.  That may be just the matter of scanning backwards
>from the end of the trailer block ignore_non_trailer() identified
>for the first blank line, as I agree with Linus's "So quite
>frankly, at that point if git doesn't recognize it as a sign-off
>block, I don't think it's a big deal" in the thread.
>
>Not having that and not calling that function can reintroduce the
>recent "interpret-trailers corner case" bug Matthieu brought up.
>
> With these, everybody except interpret-trailers that mutates a log
> message can add a new signoff consistently.  And then, building on
> these, "interpret-trailers" can be written like this:
>
>  (1) starting from a flat  (or a strbuf), using the above
>  helpers, identify the parts of the log message that is the
>  trailer block (and you will know what is before and what is
>  after the trailer block).
>
>  (2) keep the part before the trailer block and the part after the
>  trailer block (this could be empty) in one strbuf each; we do
>  not want to mutate these parts, and it is pointless to split
>  them further into individual lines.
>
>  (3) express the lines in the trailer block in a richer data
>  structure that is easier to manipulate (i.e. reorder the lines,
>  insert, delete, etc.) and work on it.
>
>  (4) when manipulation of the trailer block is finished, reconstruct
>  the resulting message by concatenating the "before trailer"
>  part, "trailer" part, and "after trailer" part.
>
> As to the exact design of "a richer data structure" and the
> manipulation we may want on the trailer, I currently do not have a
> strong "it should be this way" opinion yet, but after looking at
> various examples Linus gave us in the discussion, my gut feelig is
> that it would be best to keep the operation simple and generic,
> e.g. "find a line that matches this regexp and replace it with this
> line", "insert this line at the end", "delete all lines that match
> this regexp", etc.

I will see what I can do about this.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Jeff King
On Sat, Sep 05, 2015 at 09:30:27AM +0200, Johannes Sixt wrote:

> >How about a bit looser rule like this?
> >
> > A block of text at the end of the message, each and every
> > line in which must match "^[^:  ]+:[  ]" (that is,
> > a "keyword" that does not contain a whitespace nor a colon,
> > followed by a colon and whitespace, and arbitrary value thru
> > the end of line) is a signature block.
> 
> Why do we need a new rule? The old git-am had a logic that pleased everyone,
> and it must have been implemented somewhere. Shouldn't it be sufficient to
> just re-implement or re-use that logic?

That was my thought, too; if there is a regression, let's start by
fixing that for the upcoming 2.6.0, and then we can worry about doing
something fancier[1] later.

And I think the original behavior really is what Linus is asking for: we
consider the final block, even with a "[]" comment, as a S-o-b block if
it has a Signed-off-by in it. So if we have the final block:

[some comment]
Signed-off-by: Somebody Else 

Running "am -s" with the current master yields:

[some comment]
Signed-off-by: Somebody Else 

Signed-off-by: Jeff King 

which is wrong. Running the same with v2.5.0 gets:

[some comment]
Signed-off-by: Somebody Else 
Signed-off-by: Jeff King 

So far so good. Now let's change our input to:

[some comment]
Reviewed-by: Somebody Else 

and run "am -s".  Current "master" continues to stick the extra line in
there. No surprise. But now so does v2.5.0!

So I don't think the old behavior covered all cases well, either, and
there's room for improvement. But likewise, I don't recall seeing a lot
of complaints about it in practice. It seems like a sane thing to
restore it for the upcoming release, and then build from there.

-Peff

[1] I think part of the reason people are interested in "fancy" here is
that this topic extends beyond "git am". There's "commit -s", of
course, but there's also the generic "interpret-trailers" command,
which is supposed to be a generalization of the "--signoff" option.
It would be nice if the rules remained consistent for when we add a
trailer to an existing block, rather than special-casing signoffs.

But again, I think that's something to shoot for in the long run.
It's more important in the short term not to regress "am".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Johannes Sixt

Am 05.09.2015 um 02:54 schrieb Junio C Hamano:

Linus Torvalds  writes:


So I think that logic should basically be extended to saying

  - if any line in the last chunk has a "Signed-off-by:", set a flag.

  - at the end of the loop, if that flag wasn't set, return 0.


I am reluctant to special case S-o-b: too much, even though this is
about "am -s" and by definition S-o-b: is special, as that is what
we are adding after all.

How about a bit looser rule like this?

 A block of text at the end of the message, each and every
 line in which must match "^[^:]+:[  ]" (that is,
 a "keyword" that does not contain a whitespace nor a colon,
 followed by a colon and whitespace, and arbitrary value thru
 the end of line) is a signature block.


Why do we need a new rule? The old git-am had a logic that pleased 
everyone, and it must have been implemented somewhere. Shouldn't it be 
sufficient to just re-implement or re-use that logic?


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Junio C Hamano  writes:

> To salvage "interpret-trailers" needs a lot more, as we are
> realizing that the definition that led to its external design does
> not match the way users use footers in the real world.  This affects
> the internal data representation and the whole thing needs to be
> rethought.

Note that I am not saying that you personally did any bad job while
working on the interpret-trailers topic.  We collectively designed
its feature based on a much narrower definition of what the trailer
block is than what is used in the real world in practice, so we do
not have a good way to locate an existing entry that is not a (key,
value), no matter what the syntax to denote which is key and which
is value is, insert a new entry that is not a (key, value), or
remove an existing entry that is not a (key, value), all of which
will be necessary to mutate trailer blocks people use in the real
life.

I think a good way forward would be to go like this:

 * a helper function that starts from a flat  (or a
   strbuf) and identifies the end of the body of the message,
   i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines
   backwards.  That is what ignore_non_trailer() in commit.c does,
   and that can be shared across everybody that mutates a log
   message.

 * a helper function that takes the result of the above as a flat
    (or a strbuf) and identifies the beginning of a
   trailer block.  That may be just the matter of scanning backwards
   from the end of the trailer block ignore_non_trailer() identified
   for the first blank line, as I agree with Linus's "So quite
   frankly, at that point if git doesn't recognize it as a sign-off
   block, I don't think it's a big deal" in the thread.

   Not having that and not calling that function can reintroduce the
   recent "interpret-trailers corner case" bug Matthieu brought up.

With these, everybody except interpret-trailers that mutates a log
message can add a new signoff consistently.  And then, building on
these, "interpret-trailers" can be written like this:

 (1) starting from a flat  (or a strbuf), using the above
 helpers, identify the parts of the log message that is the
 trailer block (and you will know what is before and what is
 after the trailer block).

 (2) keep the part before the trailer block and the part after the
 trailer block (this could be empty) in one strbuf each; we do
 not want to mutate these parts, and it is pointless to split
 them further into individual lines.

 (3) express the lines in the trailer block in a richer data
 structure that is easier to manipulate (i.e. reorder the lines,
 insert, delete, etc.) and work on it.

 (4) when manipulation of the trailer block is finished, reconstruct
 the resulting message by concatenating the "before trailer"
 part, "trailer" part, and "after trailer" part.

As to the exact design of "a richer data structure" and the
manipulation we may want on the trailer, I currently do not have a
strong "it should be this way" opinion yet, but after looking at
various examples Linus gave us in the discussion, my gut feelig is
that it would be best to keep the operation simple and generic,
e.g. "find a line that matches this regexp and replace it with this
line", "insert this line at the end", "delete all lines that match
this regexp", etc.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Sixt  writes:
>
>> Why do we need a new rule? The old git-am had a logic that pleased
>> everyone, and it must have been implemented somewhere. Shouldn't it be
>> sufficient to just re-implement or re-use that logic?
> ...
> Perhaps we would need to tell has_conforming_footer() function who
> the caller is, and use a different logic (i.e. "just re-implement")
> when the caller is append_signoff().

Ah, I think I misunderstood you.  "just re-implement" could be "not
to use the existing helper used by other callers and instead do it
manually all inside builtin/am.c", and I agree that it would be the
safest way to go.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Junio C Hamano  writes:

> But for the purpose of 2.6-rc period, I think we should start from
> doing a separate implementation inside builtin/am.c without touching
> append_signoff().
> ...
> Here is a quick attempt to do the "just fix am regression without
> changing anything else".

And a quick attempt without even compilation testing has flaws as
expected X-<.

Second attempt.

 builtin/am.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 83b3d86..3a65d09 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1207,6 +1207,27 @@ static void NORETURN die_user_resolve(const struct 
am_state *state)
exit(128);
 }
 
+static void am_signoff(struct strbuf *sb)
+{
+   char *cp;
+
+   strbuf_complete_line(sb);
+
+   for (cp = sb->buf;
+cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
+cp = strchr(cp, '\n')) {
+   if (sb->buf == cp || cp[-1] == '\n')
+   break;
+   }
+   if (!cp)
+   strbuf_addch(sb, '\n');
+   strbuf_addf(sb, "%s%s\n",
+   sign_off_header,
+   fmt_name(getenv("GIT_COMMITTER_NAME"),
+getenv("GIT_COMMITTER_EMAIL")));
+   strbuf_addch(sb, '\n');
+}
+
 /**
  * Appends signoff to the "msg" field of the am_state.
  */
@@ -1215,7 +1236,7 @@ static void am_append_signoff(struct am_state *state)
struct strbuf sb = STRBUF_INIT;
 
strbuf_attach(, state->msg, state->msg_len, state->msg_len);
-   append_signoff(, 0, 0);
+   am_signoff();
state->msg = strbuf_detach(, >msg_len);
 }
 
@@ -1319,7 +1340,7 @@ static int parse_mail(struct am_state *state, const char 
*mail)
stripspace(, 0);
 
if (state->signoff)
-   append_signoff(, 0, 0);
+   am_signoff();
 
assert(!state->author_name);
state->author_name = strbuf_detach(_name, NULL);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Junio C Hamano  writes:

> And a quick attempt without even compilation testing has flaws as
> expected X-<.
>
> Second attempt.

... and I forget the de-dup logic.  The third attempt.

 builtin/am.c | 32 ++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 83b3d86..30ffdde 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1207,6 +1207,34 @@ static void NORETURN die_user_resolve(const struct 
am_state *state)
exit(128);
 }
 
+static void am_signoff(struct strbuf *sb)
+{
+   char *cp;
+   struct strbuf mine = STRBUF_INIT;
+
+   /* Does it end with our own sign-off? */
+   strbuf_addf(, "%s%s\n",
+   sign_off_header,
+   fmt_name(getenv("GIT_COMMITTER_NAME"),
+getenv("GIT_COMMITTER_EMAIL")));
+   if (mine.len < sb->len &&
+   !strcmp(mine.buf, sb->buf + sb->len - mine.len))
+   goto exit; /* no need to duplicate */
+
+   /* Does it have any Signed-off-by: in the text */
+   for (cp = sb->buf;
+cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
+cp = strchr(cp, '\n')) {
+   if (sb->buf == cp || cp[-1] == '\n')
+   break;
+   }
+   if (!cp)
+   strbuf_addch(sb, '\n');
+   strbuf_addbuf(sb, );
+exit:
+   strbuf_release();
+}
+
 /**
  * Appends signoff to the "msg" field of the am_state.
  */
@@ -1215,7 +1243,7 @@ static void am_append_signoff(struct am_state *state)
struct strbuf sb = STRBUF_INIT;
 
strbuf_attach(, state->msg, state->msg_len, state->msg_len);
-   append_signoff(, 0, 0);
+   am_signoff();
state->msg = strbuf_detach(, >msg_len);
 }
 
@@ -1319,7 +1347,7 @@ static int parse_mail(struct am_state *state, const char 
*mail)
stripspace(, 0);
 
if (state->signoff)
-   append_signoff(, 0, 0);
+   am_signoff();
 
assert(!state->author_name);
state->author_name = strbuf_detach(_name, NULL);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Linus Torvalds  writes:

> That said, the original "git am" rules actually seem to be rather
> straightforward: it's never an issue about "last block of text", and
> it's simply an issue of "is there a sign-ff _anywhere_ in the text".
>
> That simplicity has a certain appeal to me. I don't think it was
> necessarily written that way because it was "well designed" - I
> suspect it is more an issue of "easy to implement in a shell-script".

Guilty as pointed out.

> Four out of 119 emails may not be a big percentage, but it does mean
> that it's not horribly unusual either..

Sure.  Thanks for these examples.  I was aware that people do
strange things with the footer, but with the first example of [akpm]
comment at the very beginning alone, I wouldn't have guessed why
intermixing one-liner comments directly in the chain of
Signed-of-by: lines made any sense.  Call it lack of imagination,
but each sign-off optionally prefixed by a single-liner summary of
what change was done makes sense why people do want to use these
lines that way.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Johannes Sixt  writes:

> Why do we need a new rule? The old git-am had a logic that pleased
> everyone, and it must have been implemented somewhere. Shouldn't it be
> sufficient to just re-implement or re-use that logic?

If you look at the helper the rewritten "am" calls, you will notice
that it is used by other things that wants to know whether a log
message has the final "trailer" block, and the reason why the
callers want to know is not limited to "I want to add a sign-off".

What your "just re-implement" means is "change the world order
drastically to other callers who do not want such a special casing
for sign-off".

That was why I tried to see if a slight tweak to the rule shared by
all callers that would be much less likely to break these other
callers can satisfy the need of "am".

Perhaps we would need to tell has_conforming_footer() function who
the caller is, and use a different logic (i.e. "just re-implement")
when the caller is append_signoff().
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-05 Thread Junio C Hamano
Jeff King  writes:

> [1] I think part of the reason people are interested in "fancy" here is
> that this topic extends beyond "git am". There's "commit -s", of
> course, but there's also the generic "interpret-trailers" command,
> which is supposed to be a generalization of the "--signoff" option.
> It would be nice if the rules remained consistent for when we add a
> trailer to an existing block, rather than special-casing signoffs.
>
> But again, I think that's something to shoot for in the long run.
> It's more important in the short term not to regress "am".

Yes.  I personally think the global append_signoff() everybody else
uses can and should implement the same logic (whichever the exact
logic is) as whatever is used by "am" because the caller that makes
a call to that function knows it is adding a "Signed-off-by:" line,
so existing "Signed-off-by" lines are already special in that context.

But for the purpose of 2.6-rc period, I think we should start from
doing a separate implementation inside builtin/am.c without touching
append_signoff().

We can do a more thoughtful refactoring for append_signoff() in the
next cycle.  Another thing that needs consideration is the other
user of the has_conforming_footer() helper append_signoff() uses,
which is the codepath to add "cherry-picked-from"; it is less
obvious that "find Signed-off-by: anywhere in the text to decide if
the new footer needs its own paragraph" is a good logic in that
context.

To salvage "interpret-trailers" needs a lot more, as we are
realizing that the definition that led to its external design does
not match the way users use footers in the real world.  This affects
the internal data representation and the whole thing needs to be
rethought.

Here is a quick attempt to do the "just fix am regression without
changing anything else".


 builtin/am.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/builtin/am.c b/builtin/am.c
index 83b3d86..c63d238 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1213,9 +1213,24 @@ static void NORETURN die_user_resolve(const struct 
am_state *state)
 static void am_append_signoff(struct am_state *state)
 {
struct strbuf sb = STRBUF_INIT;
+   char *cp;
 
strbuf_attach(, state->msg, state->msg_len, state->msg_len);
-   append_signoff(, 0, 0);
+   strbuf_complete_line();
+
+   for (cp = sb.buf;
+cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL;
+cp = strchr(cp, '\n')) {
+   if (sb.buf < cp && cp[-1] == '\n')
+   break;
+   }
+   if (!cp)
+   strbuf_addch(, '\n');
+   strbuf_addf(, "%s%s\n",
+   sign_off_header,
+   fmt_name(getenv("GIT_COMMITTER_NAME"),
+getenv("GIT_COMMITTER_EMAIL")));
+   strbuf_addch(, '\n');
state->msg = strbuf_detach(, >msg_len);
 }
 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Linus Torvalds
On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds
 wrote:
>
> I *think* it's this part:
>
> if (!(found_rfc2822 ||
>   is_cherry_picked_from_line(buf + i, k - i - 1)))
> return 0;
>
> which basically returns 0 for _any_ line in the footer that doesn't
> match the found_rfc2822 format.

Confirmed. I hacked up a version that just doesn't do that check at
all, and it works fine (but obviously only on well-formatted emails
that really do have a sign-off).

So I think that logic should basically be extended to saying

 - if any line in the last chunk has a "Signed-off-by:", set a flag.

 - at the end of the loop, if that flag wasn't set, return 0.

Instead of that thing that basically returns zero immediately when it
sees a line it doesn't like.

I'm in the middle of my merge window, so I'm not going to get around
to writing a patch until that's over. Hopefully somebody will step up
in the meantime. Hint, hint.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Jeff King
On Fri, Sep 04, 2015 at 04:52:42PM -0700, Linus Torvalds wrote:

> On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds
>  wrote:
> >
> > I *think* it's this part:
> >
> > if (!(found_rfc2822 ||
> >   is_cherry_picked_from_line(buf + i, k - i - 1)))
> > return 0;
> >
> > which basically returns 0 for _any_ line in the footer that doesn't
> > match the found_rfc2822 format.
> 
> Confirmed. I hacked up a version that just doesn't do that check at
> all, and it works fine (but obviously only on well-formatted emails
> that really do have a sign-off).
> 
> So I think that logic should basically be extended to saying
> 
>  - if any line in the last chunk has a "Signed-off-by:", set a flag.
> 
>  - at the end of the loop, if that flag wasn't set, return 0.

Do we want to make Signed-off-by a special token here? What about "Cc:",
and other project-specific trailers? You wouldn't want to end up with:

  [some comment]
  Cc: somebody

  Signed-off-by: somebody else

It's not a problem for git-am, which should be taking in patches that
are already signed-off (and after all, if your project does not use
signoffs, why are you using "am -s"?). But won't "git commit -s" run
into the same problem?

We could look for an arbitrary rfc2822-ish string, but I'd be worried
slightly about false positives, like:

  This is a paragraph with arbitrary text. But one
  of the lines will use a colon, and that a causes a
  problem: because of wrapping, this line kind of looks
  like a trailer.

Maybe only the final line needs to look rfc2822-ish? Or maybe we can
constrain the content that _doesn't_ look like a trailer line. Is it
sufficient to allow a block encased in "[]", rather than arbitrary text?

All that being said, I think we can punt on the harder issues for now,
and just restore the original git-am.sh behavior (which it looks
like was fairly naive) for the upcoming release, so that there is no
regression.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Junio C Hamano
Linus Torvalds  writes:

> So I think that logic should basically be extended to saying
>
>  - if any line in the last chunk has a "Signed-off-by:", set a flag.
>
>  - at the end of the loop, if that flag wasn't set, return 0.

I am reluctant to special case S-o-b: too much, even though this is
about "am -s" and by definition S-o-b: is special, as that is what
we are adding after all.

How about a bit looser rule like this?

A block of text at the end of the message, each and every
line in which must match "^[^:  ]+:[  ]" (that is,
a "keyword" that does not contain a whitespace nor a colon,
followed by a colon and whitespace, and arbitrary value thru
the end of line) is a signature block.




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Junio C Hamano
Linus Torvalds  writes:

> but the failing cases have a comment by Andrew:
>
> [a...@linux-foundation.org: coding-style fixes]
> Signed-off-by: Tang Chen 
> Cc: Xishi Qiu 
> Cc: Yasuaki Ishimatsu 
> Cc: Kamezawa Hiroyuki 
> Cc: Taku Izumi 
> Cc: Gu Zheng 
> Cc: Naoya Horiguchi 
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: David Rientjes 
> Cc: [4.2.x]
> Signed-off-by: Andrew Morton 
>
> Signed-off-by: Linus Torvalds 
>
> ie that "[a...@linux-foundation.org: coding-style fixes]" makes git am
> now decide that the previous block of text was not a sign-off block,
> so it adds an empty line before adding my sign-off. But very obviously
> it *was* a sign-off block.

Ahh, OK, scratch what I said earlier.  The user intended this to be
sign-off block, but the convention append_signoff() was taught from
very earlier days is that the sign-off block must consist of block
of text all of which look like rfc2822 "keyword: value" header lines,
and the comment thing makes it a non-conforming header.

Perhaps

A block of text at the end of the existing text could be a
signature block.  If all its lines that are rfc2822-like are at
its end, then it is a sign-off block.  Otherwise it is not.

would allow the leading non-signature lines in the above example.

If the comment line (which I would say should have been separated by
a blank line from the signature block if only to make it easier to
read the whole thing) were in the middle, e.g.

> Signed-off-by: Tang Chen 
> Cc: Xishi Qiu 
> Cc: Yasuaki Ishimatsu 
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> [a...@linux-foundation.org: coding-style fixes]
> Cc: David Rientjes 
> Cc: [4.2.x]
> Signed-off-by: Andrew Morton 

that rule would still not think this is a signature block, but at
that point, do we really want to consider such a block of text a
signature block?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Linus Torvalds
On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano  wrote:
>
> that rule would still not think this is a signature block, but at
> that point, do we really want to consider such a block of text a
> signature block?

So exactly why are you arguing for these rules that are known to break
in real life, that I gave actual examples for existing, and that I
also gave an actual example for not just giving a false negative, but
also a false positive?

I'm also pretty sure that what you are arguing for is a regression.

Now, as mentioned, it may well be true that we've had this odd
behavior before, and it's not a real regression - I may just have
picked up on this problem because I've been more careful. Maybe I
didn't notice these problems before.

But looking at the old git-am.sh script, it does simply seem to look
for that '^Signed-off-by:' pattern. It did

  ADD_SIGNOFF=$(
  test "$LAST_SIGNED_OFF_BY" = "$SIGNOFF" || {
  test '' = "$LAST_SIGNED_OFF_BY" && echo
  echo "$SIGNOFF"
  })

which seems to literally just check the last sign-off line it found.
If it matches the new sign-off, it doesn't do anything (and doesn't
add the new one either), and if it doesn't exist at all (so it's
empty) it adds teh empty line.

Quite frankly, that not only worked for a long time, it's simply less
ambiguous than your made up rule.

It's very simple. "if you find a sign-off in the commit message, don't
add an new empty line before the new signoff".

Much better than "every line in the last set of lines must match some
weak format that isn't even true, and is too non-specific anyway".

Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Linus Torvalds
On Fri, Sep 4, 2015 at 6:23 PM, Junio C Hamano  wrote:
>
> OK, I didn't know that was acceptable in the kernel community
> to have random comments like that inside the block.  Can these
> comments span multiple paragraphs?  What I am wondering is what
> you want to see in a case like this:
>
>  Signed-off-by: Noam Camus 
>  Acked-by: Vineet Gupta 
>  [ Also removed pointless cast from "void *". - Linus]
>  Signed-off-by: Linus Torvalds 
>  [ Ahh, we have to tell at least these people
>
>   - stakeholder class 1
>   - staeholder class 2
>  ]
>  Cc: foo
>  Cc: bar

So quite frankly, at that point if git doesn't recognize it as a
sign-off block, I don't think it's a big deal.

That said, the original "git am" rules actually seem to be rather
straightforward: it's never an issue about "last block of text", and
it's simply an issue of "is there a sign-ff _anywhere_ in the text".

That simplicity has a certain appeal to me. I don't think it was
necessarily written that way because it was "well designed" - I
suspect it is more an issue of "easy to implement in a shell-script".

And it's possible that I'm mis-reading the scripts too. It's not like
I _remember_ what the exact behavior was, I just think it used to work
really well for us (ie I don't recall seeing lots of those empty lines
in the middle of the sign-off block before, and this current merge
window it happened for four of the emails I applied from Andrew's
119-patch series..

Four out of 119 emails may not be a big percentage, but it does mean
that it's not horribly unusual either..

Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Linus Torvalds
On Fri, Sep 4, 2015 at 5:07 PM, Jeff King  wrote:
>
> Do we want to make Signed-off-by a special token here? What about "Cc:",
> and other project-specific trailers? You wouldn't want to end up with:
>
>   [some comment]
>   Cc: somebody
>
>   Signed-off-by: somebody else
>
> It's not a problem for git-am, which should be taking in patches that
> are already signed-off (and after all, if your project does not use
> signoffs, why are you using "am -s"?). But won't "git commit -s" run
> into the same problem?

So I'm a *bit* worried about taking anything else than Signed-off-by:
exactly because of the problem you mention:

> We could look for an arbitrary rfc2822-ish string, but I'd be worried
> slightly about false positives, like:
>
>   This is a paragraph with arbitrary text. But one
>   of the lines will use a colon, and that a causes a
>   problem: because of wrapping, this line kind of looks
>   like a trailer.

which clearly needs an empty line before the sign-off.

And even if we limit it to just the last line, like you suggest:

> Maybe only the final line needs to look rfc2822-ish?

I'd still worry, for the same exact reason. The "rfc2822-ish" check is
_so_ weak that it can be trivially triggered by normal text.

So maybe it doesn't require a strict "Signed-off-by:" match, but I
think it needs something stricter than the found_rfc2822 format (like
maybe "looks like a name/email").  We just don't want to require that
*all* the lines are that way, because at least in the kernel we often
do end up adding small comments in that section.

The git project sign-off rules seem to be stricter, and it looks like
it's almost universally of the form "Signed-off-by:" with a few
"Helped-by:" and "Reviewed-by:". In the kernel, we really do tend to
be messier in that area, with the sign-off chunk not just having the
sign-offs and cc's etc, but also tends to have those occasional small
notes left by the people forwarding the emails.

And we also often don't have _strictly_ just an email. We tend to have
things like

Cc:  # v3.13+

etc, and sometimes we don't have have any email at all, but things like

Reported-by: coverity
Tested-by: MysterX on #openelec
Generated-by: Coccinelle SmPL

so it's hard to give a really strict rule. It's fairly free-format.

That said, I would argue that when you apply a patch with sign-off,
pretty much by definition that patch _should_ have a sign-off from the
originator. So I suspect that the "there is an existing sign-off in
that last chunk" is a fairly good rule. Even if there are lots of
other lines too.

If you're using "git commit -s" to just commit your own work, you
presumably would normally want the extra sign-off. Or at least you can
do a "git commit --amend" fairly easily to fix things up. Doing the
amending later for "git am" is rather impractical, when it might have
been a series of 100+ emails..

  Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Linus Torvalds
On Fri, Sep 4, 2015 at 5:54 PM, Junio C Hamano  wrote:
>
> How about a bit looser rule like this?
>
> A block of text at the end of the message, each and every
> line in which must match "^[^:  ]+:[  ]" (that is,
> a "keyword" that does not contain a whitespace nor a colon,
> followed by a colon and whitespace, and arbitrary value thru
> the end of line) is a signature block.

No. That's still broken.

The thing is, and that was what the report was all about, not every
line _is_ of that format. We have commetns from the sign-off people.
Things like this:

Signed-off-by: Noam Camus 
Acked-by: Vineet Gupta 
[ Also removed pointless cast from "void *".  - Linus ]
Signed-off-by: Linus Torvalds 

or

Signed-off-by: Andi Kleen 
[ Updated comments and changelog a bit. ]
Signed-off-by: Peter Zijlstra (Intel) 
Link: 
http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-a...@firstfloor.org
Signed-off-by: Ingo Molnar 

so no, it is simply not true that "every line must match".

I'm not even seeing why you argue for that, since clearly having a
sign-off-line is actually a safer choice too. The "every line must
match" rule is bad, not just because it's not true like above, but
also because it can be true without it being a sign-off block.

For example, it's not at all unlikely that you have perfectly normal
comments that just list some subsystem and their changes. Which could
easily look like

   Trivial fixes all over the tree

   drm: fix whitespace
   mm: speeling errors
   kernel: indentation and codign style

The above looks like a perfectly sane commit log to me.

Do you seriously think that it makes for a better "sign-off block
test" than one that actually checks for "is there a sign-off line"?

I'd much rather have special cases like testing for specific keywords
or looking for things that look like emails, than make it about being
"every line has this very generic format".

 Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Junio C Hamano
Linus Torvalds  writes:

> On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano  wrote:
>>
>> that rule would still not think this is a signature block, but at
>> that point, do we really want to consider such a block of text a
>> signature block?
>
> So exactly why are you arguing for these rules that are known to break
> in real life, that I gave actual examples for existing,...

Our mails crossed.  I was working from the [akpm] comment at the
very beginning while you were giving more examples that inserts
comments in the middle which I didn't see.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More builtin git-am issues..

2015-09-04 Thread Junio C Hamano
Linus Torvalds  writes:

> The thing is, and that was what the report was all about, not every
> line _is_ of that format. We have commetns from the sign-off people.
> Things like this:
>
> Signed-off-by: Noam Camus 
> Acked-by: Vineet Gupta 
> [ Also removed pointless cast from "void *".  - Linus ]
> Signed-off-by: Linus Torvalds 

OK, I didn't know that was acceptable in the kernel community
to have random comments like that inside the block.  Can these
comments span multiple paragraphs?  What I am wondering is what
you want to see in a case like this:

 Signed-off-by: Noam Camus 
 Acked-by: Vineet Gupta 
 [ Also removed pointless cast from "void *". - Linus]
 Signed-off-by: Linus Torvalds 
 [ Ahh, we have to tell at least these people

  - stakeholder class 1
  - staeholder class 2
 ]
 Cc: foo
 Cc: bar


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html