Re: More builtin git-am issues..
On Sat, Sep 5, 2015 at 9:39 PM, Junio C Hamano wrote: > Junio C Hamano writes: > >> To salvage "interpret-trailers" needs a lot more, as we are >> realizing that the definition that led to its external design does >> not match the way users use footers in the real world. This affects >> the internal data representation and the whole thing needs to be >> rethought. > > Note that I am not saying that you personally did any bad job while > working on the interpret-trailers topic. We collectively designed > its feature based on a much narrower definition of what the trailer > block is than what is used in the real world in practice, so we do > not have a good way to locate an existing entry that is not a (key, > value), no matter what the syntax to denote which is key and which > is value is, insert a new entry that is not a (key, value), or > remove an existing entry that is not a (key, value), all of which > will be necessary to mutate trailer blocks people use in the real > life. > > I think a good way forward would be to go like this: > > * a helper function that starts from a flat (or a >strbuf) and identifies the end of the body of the message, >i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines >backwards. That is what ignore_non_trailer() in commit.c does, >and that can be shared across everybody that mutates a log >message. > > * a helper function that takes the result of the above as a flat > (or a strbuf) and identifies the beginning of a >trailer block. That may be just the matter of scanning backwards >from the end of the trailer block ignore_non_trailer() identified >for the first blank line, as I agree with Linus's "So quite >frankly, at that point if git doesn't recognize it as a sign-off >block, I don't think it's a big deal" in the thread. > >Not having that and not calling that function can reintroduce the >recent "interpret-trailers corner case" bug Matthieu brought up. > > With these, everybody except interpret-trailers that mutates a log > message can add a new signoff consistently. And then, building on > these, "interpret-trailers" can be written like this: > > (1) starting from a flat (or a strbuf), using the above > helpers, identify the parts of the log message that is the > trailer block (and you will know what is before and what is > after the trailer block). > > (2) keep the part before the trailer block and the part after the > trailer block (this could be empty) in one strbuf each; we do > not want to mutate these parts, and it is pointless to split > them further into individual lines. > > (3) express the lines in the trailer block in a richer data > structure that is easier to manipulate (i.e. reorder the lines, > insert, delete, etc.) and work on it. > > (4) when manipulation of the trailer block is finished, reconstruct > the resulting message by concatenating the "before trailer" > part, "trailer" part, and "after trailer" part. > > As to the exact design of "a richer data structure" and the > manipulation we may want on the trailer, I currently do not have a > strong "it should be this way" opinion yet, but after looking at > various examples Linus gave us in the discussion, my gut feelig is > that it would be best to keep the operation simple and generic, > e.g. "find a line that matches this regexp and replace it with this > line", "insert this line at the end", "delete all lines that match > this regexp", etc. I will see what I can do about this. Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Junio C Hamano writes: > To salvage "interpret-trailers" needs a lot more, as we are > realizing that the definition that led to its external design does > not match the way users use footers in the real world. This affects > the internal data representation and the whole thing needs to be > rethought. Note that I am not saying that you personally did any bad job while working on the interpret-trailers topic. We collectively designed its feature based on a much narrower definition of what the trailer block is than what is used in the real world in practice, so we do not have a good way to locate an existing entry that is not a (key, value), no matter what the syntax to denote which is key and which is value is, insert a new entry that is not a (key, value), or remove an existing entry that is not a (key, value), all of which will be necessary to mutate trailer blocks people use in the real life. I think a good way forward would be to go like this: * a helper function that starts from a flat (or a strbuf) and identifies the end of the body of the message, i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines backwards. That is what ignore_non_trailer() in commit.c does, and that can be shared across everybody that mutates a log message. * a helper function that takes the result of the above as a flat (or a strbuf) and identifies the beginning of a trailer block. That may be just the matter of scanning backwards from the end of the trailer block ignore_non_trailer() identified for the first blank line, as I agree with Linus's "So quite frankly, at that point if git doesn't recognize it as a sign-off block, I don't think it's a big deal" in the thread. Not having that and not calling that function can reintroduce the recent "interpret-trailers corner case" bug Matthieu brought up. With these, everybody except interpret-trailers that mutates a log message can add a new signoff consistently. And then, building on these, "interpret-trailers" can be written like this: (1) starting from a flat (or a strbuf), using the above helpers, identify the parts of the log message that is the trailer block (and you will know what is before and what is after the trailer block). (2) keep the part before the trailer block and the part after the trailer block (this could be empty) in one strbuf each; we do not want to mutate these parts, and it is pointless to split them further into individual lines. (3) express the lines in the trailer block in a richer data structure that is easier to manipulate (i.e. reorder the lines, insert, delete, etc.) and work on it. (4) when manipulation of the trailer block is finished, reconstruct the resulting message by concatenating the "before trailer" part, "trailer" part, and "after trailer" part. As to the exact design of "a richer data structure" and the manipulation we may want on the trailer, I currently do not have a strong "it should be this way" opinion yet, but after looking at various examples Linus gave us in the discussion, my gut feelig is that it would be best to keep the operation simple and generic, e.g. "find a line that matches this regexp and replace it with this line", "insert this line at the end", "delete all lines that match this regexp", etc. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Junio C Hamano writes: > And a quick attempt without even compilation testing has flaws as > expected X-<. > > Second attempt. ... and I forget the de-dup logic. The third attempt. builtin/am.c | 32 ++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..30ffdde 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1207,6 +1207,34 @@ static void NORETURN die_user_resolve(const struct am_state *state) exit(128); } +static void am_signoff(struct strbuf *sb) +{ + char *cp; + struct strbuf mine = STRBUF_INIT; + + /* Does it end with our own sign-off? */ + strbuf_addf(&mine, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), +getenv("GIT_COMMITTER_EMAIL"))); + if (mine.len < sb->len && + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) + goto exit; /* no need to duplicate */ + + /* Does it have any Signed-off-by: in the text */ + for (cp = sb->buf; +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; +cp = strchr(cp, '\n')) { + if (sb->buf == cp || cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(sb, '\n'); + strbuf_addbuf(sb, &mine); +exit: + strbuf_release(&mine); +} + /** * Appends signoff to the "msg" field of the am_state. */ @@ -1215,7 +1243,7 @@ static void am_append_signoff(struct am_state *state) struct strbuf sb = STRBUF_INIT; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + am_signoff(&sb); state->msg = strbuf_detach(&sb, &state->msg_len); } @@ -1319,7 +1347,7 @@ static int parse_mail(struct am_state *state, const char *mail) stripspace(&msg, 0); if (state->signoff) - append_signoff(&msg, 0, 0); + am_signoff(&msg); assert(!state->author_name); state->author_name = strbuf_detach(&author_name, NULL); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Junio C Hamano writes: > But for the purpose of 2.6-rc period, I think we should start from > doing a separate implementation inside builtin/am.c without touching > append_signoff(). > ... > Here is a quick attempt to do the "just fix am regression without > changing anything else". And a quick attempt without even compilation testing has flaws as expected X-<. Second attempt. builtin/am.c | 25 +++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..3a65d09 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1207,6 +1207,27 @@ static void NORETURN die_user_resolve(const struct am_state *state) exit(128); } +static void am_signoff(struct strbuf *sb) +{ + char *cp; + + strbuf_complete_line(sb); + + for (cp = sb->buf; +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; +cp = strchr(cp, '\n')) { + if (sb->buf == cp || cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(sb, '\n'); + strbuf_addf(sb, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), +getenv("GIT_COMMITTER_EMAIL"))); + strbuf_addch(sb, '\n'); +} + /** * Appends signoff to the "msg" field of the am_state. */ @@ -1215,7 +1236,7 @@ static void am_append_signoff(struct am_state *state) struct strbuf sb = STRBUF_INIT; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + am_signoff(&sb); state->msg = strbuf_detach(&sb, &state->msg_len); } @@ -1319,7 +1340,7 @@ static int parse_mail(struct am_state *state, const char *mail) stripspace(&msg, 0); if (state->signoff) - append_signoff(&msg, 0, 0); + am_signoff(&msg); assert(!state->author_name); state->author_name = strbuf_detach(&author_name, NULL); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Jeff King writes: > [1] I think part of the reason people are interested in "fancy" here is > that this topic extends beyond "git am". There's "commit -s", of > course, but there's also the generic "interpret-trailers" command, > which is supposed to be a generalization of the "--signoff" option. > It would be nice if the rules remained consistent for when we add a > trailer to an existing block, rather than special-casing signoffs. > > But again, I think that's something to shoot for in the long run. > It's more important in the short term not to regress "am". Yes. I personally think the global append_signoff() everybody else uses can and should implement the same logic (whichever the exact logic is) as whatever is used by "am" because the caller that makes a call to that function knows it is adding a "Signed-off-by:" line, so existing "Signed-off-by" lines are already special in that context. But for the purpose of 2.6-rc period, I think we should start from doing a separate implementation inside builtin/am.c without touching append_signoff(). We can do a more thoughtful refactoring for append_signoff() in the next cycle. Another thing that needs consideration is the other user of the has_conforming_footer() helper append_signoff() uses, which is the codepath to add "cherry-picked-from"; it is less obvious that "find Signed-off-by: anywhere in the text to decide if the new footer needs its own paragraph" is a good logic in that context. To salvage "interpret-trailers" needs a lot more, as we are realizing that the definition that led to its external design does not match the way users use footers in the real world. This affects the internal data representation and the whole thing needs to be rethought. Here is a quick attempt to do the "just fix am regression without changing anything else". builtin/am.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..c63d238 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1213,9 +1213,24 @@ static void NORETURN die_user_resolve(const struct am_state *state) static void am_append_signoff(struct am_state *state) { struct strbuf sb = STRBUF_INIT; + char *cp; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + strbuf_complete_line(&sb); + + for (cp = sb.buf; +cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; +cp = strchr(cp, '\n')) { + if (sb.buf < cp && cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(&sb, '\n'); + strbuf_addf(&sb, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), +getenv("GIT_COMMITTER_EMAIL"))); + strbuf_addch(&sb, '\n'); state->msg = strbuf_detach(&sb, &state->msg_len); } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Linus Torvalds writes: > That said, the original "git am" rules actually seem to be rather > straightforward: it's never an issue about "last block of text", and > it's simply an issue of "is there a sign-ff _anywhere_ in the text". > > That simplicity has a certain appeal to me. I don't think it was > necessarily written that way because it was "well designed" - I > suspect it is more an issue of "easy to implement in a shell-script". Guilty as pointed out. > Four out of 119 emails may not be a big percentage, but it does mean > that it's not horribly unusual either.. Sure. Thanks for these examples. I was aware that people do strange things with the footer, but with the first example of [akpm] comment at the very beginning alone, I wouldn't have guessed why intermixing one-liner comments directly in the chain of Signed-of-by: lines made any sense. Call it lack of imagination, but each sign-off optionally prefixed by a single-liner summary of what change was done makes sense why people do want to use these lines that way. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Junio C Hamano writes: > Johannes Sixt writes: > >> Why do we need a new rule? The old git-am had a logic that pleased >> everyone, and it must have been implemented somewhere. Shouldn't it be >> sufficient to just re-implement or re-use that logic? > ... > Perhaps we would need to tell has_conforming_footer() function who > the caller is, and use a different logic (i.e. "just re-implement") > when the caller is append_signoff(). Ah, I think I misunderstood you. "just re-implement" could be "not to use the existing helper used by other callers and instead do it manually all inside builtin/am.c", and I agree that it would be the safest way to go. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Johannes Sixt writes: > Why do we need a new rule? The old git-am had a logic that pleased > everyone, and it must have been implemented somewhere. Shouldn't it be > sufficient to just re-implement or re-use that logic? If you look at the helper the rewritten "am" calls, you will notice that it is used by other things that wants to know whether a log message has the final "trailer" block, and the reason why the callers want to know is not limited to "I want to add a sign-off". What your "just re-implement" means is "change the world order drastically to other callers who do not want such a special casing for sign-off". That was why I tried to see if a slight tweak to the rule shared by all callers that would be much less likely to break these other callers can satisfy the need of "am". Perhaps we would need to tell has_conforming_footer() function who the caller is, and use a different logic (i.e. "just re-implement") when the caller is append_signoff(). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Sat, Sep 05, 2015 at 09:30:27AM +0200, Johannes Sixt wrote: > >How about a bit looser rule like this? > > > > A block of text at the end of the message, each and every > > line in which must match "^[^: ]+:[ ]" (that is, > > a "keyword" that does not contain a whitespace nor a colon, > > followed by a colon and whitespace, and arbitrary value thru > > the end of line) is a signature block. > > Why do we need a new rule? The old git-am had a logic that pleased everyone, > and it must have been implemented somewhere. Shouldn't it be sufficient to > just re-implement or re-use that logic? That was my thought, too; if there is a regression, let's start by fixing that for the upcoming 2.6.0, and then we can worry about doing something fancier[1] later. And I think the original behavior really is what Linus is asking for: we consider the final block, even with a "[]" comment, as a S-o-b block if it has a Signed-off-by in it. So if we have the final block: [some comment] Signed-off-by: Somebody Else Running "am -s" with the current master yields: [some comment] Signed-off-by: Somebody Else Signed-off-by: Jeff King which is wrong. Running the same with v2.5.0 gets: [some comment] Signed-off-by: Somebody Else Signed-off-by: Jeff King So far so good. Now let's change our input to: [some comment] Reviewed-by: Somebody Else and run "am -s". Current "master" continues to stick the extra line in there. No surprise. But now so does v2.5.0! So I don't think the old behavior covered all cases well, either, and there's room for improvement. But likewise, I don't recall seeing a lot of complaints about it in practice. It seems like a sane thing to restore it for the upcoming release, and then build from there. -Peff [1] I think part of the reason people are interested in "fancy" here is that this topic extends beyond "git am". There's "commit -s", of course, but there's also the generic "interpret-trailers" command, which is supposed to be a generalization of the "--signoff" option. It would be nice if the rules remained consistent for when we add a trailer to an existing block, rather than special-casing signoffs. But again, I think that's something to shoot for in the long run. It's more important in the short term not to regress "am". -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Am 05.09.2015 um 02:54 schrieb Junio C Hamano: Linus Torvalds writes: So I think that logic should basically be extended to saying - if any line in the last chunk has a "Signed-off-by:", set a flag. - at the end of the loop, if that flag wasn't set, return 0. I am reluctant to special case S-o-b: too much, even though this is about "am -s" and by definition S-o-b: is special, as that is what we are adding after all. How about a bit looser rule like this? A block of text at the end of the message, each and every line in which must match "^[^:]+:[ ]" (that is, a "keyword" that does not contain a whitespace nor a colon, followed by a colon and whitespace, and arbitrary value thru the end of line) is a signature block. Why do we need a new rule? The old git-am had a logic that pleased everyone, and it must have been implemented somewhere. Shouldn't it be sufficient to just re-implement or re-use that logic? -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 4, 2015 at 6:23 PM, Junio C Hamano wrote: > > OK, I didn't know that was acceptable in the kernel community > to have random comments like that inside the block. Can these > comments span multiple paragraphs? What I am wondering is what > you want to see in a case like this: > > Signed-off-by: Noam Camus > Acked-by: Vineet Gupta > [ Also removed pointless cast from "void *". - Linus] > Signed-off-by: Linus Torvalds > [ Ahh, we have to tell at least these people > > - stakeholder class 1 > - staeholder class 2 > ] > Cc: foo > Cc: bar So quite frankly, at that point if git doesn't recognize it as a sign-off block, I don't think it's a big deal. That said, the original "git am" rules actually seem to be rather straightforward: it's never an issue about "last block of text", and it's simply an issue of "is there a sign-ff _anywhere_ in the text". That simplicity has a certain appeal to me. I don't think it was necessarily written that way because it was "well designed" - I suspect it is more an issue of "easy to implement in a shell-script". And it's possible that I'm mis-reading the scripts too. It's not like I _remember_ what the exact behavior was, I just think it used to work really well for us (ie I don't recall seeing lots of those empty lines in the middle of the sign-off block before, and this current merge window it happened for four of the emails I applied from Andrew's 119-patch series.. Four out of 119 emails may not be a big percentage, but it does mean that it's not horribly unusual either.. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Linus Torvalds writes: > On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano wrote: >> >> that rule would still not think this is a signature block, but at >> that point, do we really want to consider such a block of text a >> signature block? > > So exactly why are you arguing for these rules that are known to break > in real life, that I gave actual examples for existing,... Our mails crossed. I was working from the [akpm] comment at the very beginning while you were giving more examples that inserts comments in the middle which I didn't see. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Linus Torvalds writes: > The thing is, and that was what the report was all about, not every > line _is_ of that format. We have commetns from the sign-off people. > Things like this: > > Signed-off-by: Noam Camus > Acked-by: Vineet Gupta > [ Also removed pointless cast from "void *". - Linus ] > Signed-off-by: Linus Torvalds OK, I didn't know that was acceptable in the kernel community to have random comments like that inside the block. Can these comments span multiple paragraphs? What I am wondering is what you want to see in a case like this: Signed-off-by: Noam Camus Acked-by: Vineet Gupta [ Also removed pointless cast from "void *". - Linus] Signed-off-by: Linus Torvalds [ Ahh, we have to tell at least these people - stakeholder class 1 - staeholder class 2 ] Cc: foo Cc: bar -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano wrote: > > that rule would still not think this is a signature block, but at > that point, do we really want to consider such a block of text a > signature block? So exactly why are you arguing for these rules that are known to break in real life, that I gave actual examples for existing, and that I also gave an actual example for not just giving a false negative, but also a false positive? I'm also pretty sure that what you are arguing for is a regression. Now, as mentioned, it may well be true that we've had this odd behavior before, and it's not a real regression - I may just have picked up on this problem because I've been more careful. Maybe I didn't notice these problems before. But looking at the old git-am.sh script, it does simply seem to look for that '^Signed-off-by:' pattern. It did ADD_SIGNOFF=$( test "$LAST_SIGNED_OFF_BY" = "$SIGNOFF" || { test '' = "$LAST_SIGNED_OFF_BY" && echo echo "$SIGNOFF" }) which seems to literally just check the last sign-off line it found. If it matches the new sign-off, it doesn't do anything (and doesn't add the new one either), and if it doesn't exist at all (so it's empty) it adds teh empty line. Quite frankly, that not only worked for a long time, it's simply less ambiguous than your made up rule. It's very simple. "if you find a sign-off in the commit message, don't add an new empty line before the new signoff". Much better than "every line in the last set of lines must match some weak format that isn't even true, and is too non-specific anyway". Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Linus Torvalds writes: > but the failing cases have a comment by Andrew: > > [a...@linux-foundation.org: coding-style fixes] > Signed-off-by: Tang Chen > Cc: Xishi Qiu > Cc: Yasuaki Ishimatsu > Cc: Kamezawa Hiroyuki > Cc: Taku Izumi > Cc: Gu Zheng > Cc: Naoya Horiguchi > Cc: Vlastimil Babka > Cc: Mel Gorman > Cc: David Rientjes > Cc: [4.2.x] > Signed-off-by: Andrew Morton > > Signed-off-by: Linus Torvalds > > ie that "[a...@linux-foundation.org: coding-style fixes]" makes git am > now decide that the previous block of text was not a sign-off block, > so it adds an empty line before adding my sign-off. But very obviously > it *was* a sign-off block. Ahh, OK, scratch what I said earlier. The user intended this to be sign-off block, but the convention append_signoff() was taught from very earlier days is that the sign-off block must consist of block of text all of which look like rfc2822 "keyword: value" header lines, and the comment thing makes it a non-conforming header. Perhaps A block of text at the end of the existing text could be a signature block. If all its lines that are rfc2822-like are at its end, then it is a sign-off block. Otherwise it is not. would allow the leading non-signature lines in the above example. If the comment line (which I would say should have been separated by a blank line from the signature block if only to make it easier to read the whole thing) were in the middle, e.g. > Signed-off-by: Tang Chen > Cc: Xishi Qiu > Cc: Yasuaki Ishimatsu > Cc: Vlastimil Babka > Cc: Mel Gorman > [a...@linux-foundation.org: coding-style fixes] > Cc: David Rientjes > Cc: [4.2.x] > Signed-off-by: Andrew Morton that rule would still not think this is a signature block, but at that point, do we really want to consider such a block of text a signature block? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 4, 2015 at 5:54 PM, Junio C Hamano wrote: > > How about a bit looser rule like this? > > A block of text at the end of the message, each and every > line in which must match "^[^: ]+:[ ]" (that is, > a "keyword" that does not contain a whitespace nor a colon, > followed by a colon and whitespace, and arbitrary value thru > the end of line) is a signature block. No. That's still broken. The thing is, and that was what the report was all about, not every line _is_ of that format. We have commetns from the sign-off people. Things like this: Signed-off-by: Noam Camus Acked-by: Vineet Gupta [ Also removed pointless cast from "void *". - Linus ] Signed-off-by: Linus Torvalds or Signed-off-by: Andi Kleen [ Updated comments and changelog a bit. ] Signed-off-by: Peter Zijlstra (Intel) Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-a...@firstfloor.org Signed-off-by: Ingo Molnar so no, it is simply not true that "every line must match". I'm not even seeing why you argue for that, since clearly having a sign-off-line is actually a safer choice too. The "every line must match" rule is bad, not just because it's not true like above, but also because it can be true without it being a sign-off block. For example, it's not at all unlikely that you have perfectly normal comments that just list some subsystem and their changes. Which could easily look like Trivial fixes all over the tree drm: fix whitespace mm: speeling errors kernel: indentation and codign style The above looks like a perfectly sane commit log to me. Do you seriously think that it makes for a better "sign-off block test" than one that actually checks for "is there a sign-off line"? I'd much rather have special cases like testing for specific keywords or looking for things that look like emails, than make it about being "every line has this very generic format". Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
Linus Torvalds writes: > So I think that logic should basically be extended to saying > > - if any line in the last chunk has a "Signed-off-by:", set a flag. > > - at the end of the loop, if that flag wasn't set, return 0. I am reluctant to special case S-o-b: too much, even though this is about "am -s" and by definition S-o-b: is special, as that is what we are adding after all. How about a bit looser rule like this? A block of text at the end of the message, each and every line in which must match "^[^: ]+:[ ]" (that is, a "keyword" that does not contain a whitespace nor a colon, followed by a colon and whitespace, and arbitrary value thru the end of line) is a signature block. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 4, 2015 at 5:07 PM, Jeff King wrote: > > Do we want to make Signed-off-by a special token here? What about "Cc:", > and other project-specific trailers? You wouldn't want to end up with: > > [some comment] > Cc: somebody > > Signed-off-by: somebody else > > It's not a problem for git-am, which should be taking in patches that > are already signed-off (and after all, if your project does not use > signoffs, why are you using "am -s"?). But won't "git commit -s" run > into the same problem? So I'm a *bit* worried about taking anything else than Signed-off-by: exactly because of the problem you mention: > We could look for an arbitrary rfc2822-ish string, but I'd be worried > slightly about false positives, like: > > This is a paragraph with arbitrary text. But one > of the lines will use a colon, and that a causes a > problem: because of wrapping, this line kind of looks > like a trailer. which clearly needs an empty line before the sign-off. And even if we limit it to just the last line, like you suggest: > Maybe only the final line needs to look rfc2822-ish? I'd still worry, for the same exact reason. The "rfc2822-ish" check is _so_ weak that it can be trivially triggered by normal text. So maybe it doesn't require a strict "Signed-off-by:" match, but I think it needs something stricter than the found_rfc2822 format (like maybe "looks like a name/email"). We just don't want to require that *all* the lines are that way, because at least in the kernel we often do end up adding small comments in that section. The git project sign-off rules seem to be stricter, and it looks like it's almost universally of the form "Signed-off-by:" with a few "Helped-by:" and "Reviewed-by:". In the kernel, we really do tend to be messier in that area, with the sign-off chunk not just having the sign-offs and cc's etc, but also tends to have those occasional small notes left by the people forwarding the emails. And we also often don't have _strictly_ just an email. We tend to have things like Cc: # v3.13+ etc, and sometimes we don't have have any email at all, but things like Reported-by: coverity Tested-by: MysterX on #openelec Generated-by: Coccinelle SmPL so it's hard to give a really strict rule. It's fairly free-format. That said, I would argue that when you apply a patch with sign-off, pretty much by definition that patch _should_ have a sign-off from the originator. So I suspect that the "there is an existing sign-off in that last chunk" is a fairly good rule. Even if there are lots of other lines too. If you're using "git commit -s" to just commit your own work, you presumably would normally want the extra sign-off. Or at least you can do a "git commit --amend" fairly easily to fix things up. Doing the amending later for "git am" is rather impractical, when it might have been a series of 100+ emails.. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 04, 2015 at 04:52:42PM -0700, Linus Torvalds wrote: > On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds > wrote: > > > > I *think* it's this part: > > > > if (!(found_rfc2822 || > > is_cherry_picked_from_line(buf + i, k - i - 1))) > > return 0; > > > > which basically returns 0 for _any_ line in the footer that doesn't > > match the found_rfc2822 format. > > Confirmed. I hacked up a version that just doesn't do that check at > all, and it works fine (but obviously only on well-formatted emails > that really do have a sign-off). > > So I think that logic should basically be extended to saying > > - if any line in the last chunk has a "Signed-off-by:", set a flag. > > - at the end of the loop, if that flag wasn't set, return 0. Do we want to make Signed-off-by a special token here? What about "Cc:", and other project-specific trailers? You wouldn't want to end up with: [some comment] Cc: somebody Signed-off-by: somebody else It's not a problem for git-am, which should be taking in patches that are already signed-off (and after all, if your project does not use signoffs, why are you using "am -s"?). But won't "git commit -s" run into the same problem? We could look for an arbitrary rfc2822-ish string, but I'd be worried slightly about false positives, like: This is a paragraph with arbitrary text. But one of the lines will use a colon, and that a causes a problem: because of wrapping, this line kind of looks like a trailer. Maybe only the final line needs to look rfc2822-ish? Or maybe we can constrain the content that _doesn't_ look like a trailer line. Is it sufficient to allow a block encased in "[]", rather than arbitrary text? All that being said, I think we can punt on the harder issues for now, and just restore the original git-am.sh behavior (which it looks like was fairly naive) for the upcoming release, so that there is no regression. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: More builtin git-am issues..
On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds wrote: > > I *think* it's this part: > > if (!(found_rfc2822 || > is_cherry_picked_from_line(buf + i, k - i - 1))) > return 0; > > which basically returns 0 for _any_ line in the footer that doesn't > match the found_rfc2822 format. Confirmed. I hacked up a version that just doesn't do that check at all, and it works fine (but obviously only on well-formatted emails that really do have a sign-off). So I think that logic should basically be extended to saying - if any line in the last chunk has a "Signed-off-by:", set a flag. - at the end of the loop, if that flag wasn't set, return 0. Instead of that thing that basically returns zero immediately when it sees a line it doesn't like. I'm in the middle of my merge window, so I'm not going to get around to writing a patch until that's over. Hopefully somebody will step up in the meantime. Hint, hint. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html