On Sun, Sep 20, 2020 at 12:06 AM Lukas Bulwahn <lukas.bulw...@gmail.com> wrote:
>
>
>
> On Sat, 19 Sep 2020, Joe Perches wrote:
>
> > On Sat, 2020-09-19 at 20:12 +0200, Lukas Bulwahn wrote:
> > >
> > > On Sat, 19 Sep 2020, Joe Perches wrote:
> > >
> > > > On Sat, 2020-09-19 at 13:42 +0530, Dwaipayan Ray wrote:
> > > > > Checkpatch did not handle cases where the author From: header
> > > > > was split into two lines. The author string went empty and
> > > > > checkpatch generated a false NO_AUTHOR_SIGN_OFF warning.
> > > >
> > > > It's good to provide an example where the current code
> > > > doesn't work.
> > > >
> > >
> > > Joe, as this is a linux-kernel-mentees patch, we discussed that before
> > > reaching out to you; you can find Dwaipayan's own evaluation here:
> > >
> > > https://lore.kernel.org/linux-kernel-mentees/cabjpp5botg0qlfsarjtb2vaz_hjf229oaqihhkg4syd35i_...@mail.gmail.com/
> > >
> > > Dwaipayan, Joe's comment is still valid; it would be good to describe
> > > the reasons why patches might have split lines (as far as see, long
> > > encodings for non-ascii names).
> > >
> > > I will run my own evaluation of checkpatch.pl before and after patch
> > > application on Monday and then check if I can confirm Dwaipayan's results.
> > >
> > > > It likely would be better to do this by searching forward for
> > > > any extension lines after a "^From:' rather than searching
> > > > backwards as there can be any number of extension lines.
> > > >
> > >
> > > Just to sure what you are talking about...
> > >
> > > You mean just to access the next line through the lines array, rather
> > > than using prevheader and trying to decode that one line twice.
> > >
> > > I agree the logic is a bit redundant and complicated at the moment.
> > >
> > > Once prevheader is non-empty, it already clear that author is '' and
> > > prevheader decodes with that match, because that is the only way to
> > > make prevheader non-empty in the first place; at least as far I see it
> > > right now.
> >
> > Yeah, something like this (completely untested):
> > ---
> >  scripts/checkpatch.pl | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index 3e474072aa90..2c710d05b184 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -2679,9 +2679,13 @@ sub process {
> >                 }
> >
> >  # Check the patch for a From:
> > -               if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
> > +               if ($line =~ /^From:\s*(.*)/i) {
> >                         $author = $1;
> > -                       $author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
> > +                       my $curline = $linenr;
> > +                       while (defined($rawlines[$curline] && $rawlines[$curline++] =~ /^ \s*(.*)/) {
> > +                               $author .= $1;
> > +                       }
> > +                       $author = encode("utf8", $author) if ($author =~ /=\?utf-8\?/i);
> >                         $author =~ s/"//g;
> >                         $author = reformat_email($author);
> >                 }
> >
>

Hi,

Yeah, I think the backward check was redundant after all. If the
encoded name is long enough, the From: header gets split across more
than two lines, and my proposed solution would fail.

Thanks for the heads up, Joe!

> Yeah, I get how you would like to see that being implemented. I will work
> with Dwaipayan to get that properly implemented, properly described and
> tested.
>
> But let us keep the fun of that task to Dwaipayan... that is what a
> mentorship is all about :)
>
> Lukas

Yes, definitely. The task is interesting to me, and I would like to
solve it properly.

As for the fix, shouldn't we stop concatenating the author string
once an email address is found? Something like:

        last if $rawlines[$curline] =~ /^\s*(\S+\@\S+)\s*/

I will update the patch and sync up with Lukas on this.

Thanks,
Dwaipayan.
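
To make the forward-search idea concrete, here is a minimal standalone
Perl sketch. It is not the checkpatch.pl change itself: the helper
name collect_from_header and the sample message are hypothetical,
made up for illustration, and treating the first \S+@\S+ match as the
end of the author string is a heuristic assumption. The sketch joins
continuation lines (lines starting with whitespace) to the From: line
until an address has been seen, then decodes the joined header once.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of checkpatch.pl.
# $lines is an array ref of chomped header lines, $i the index to inspect.
# Returns the decoded author string, or nothing if line $i is not a From: line.
sub collect_from_header {
    my ($lines, $i) = @_;

    return unless defined($lines->[$i]) && $lines->[$i] =~ /^From:\s*(.*)/i;
    my $author = $1;

    # Search forward: a folded header continues on lines that begin with
    # whitespace. Stop as soon as the string contains something that looks
    # like an email address (assumed heuristic).
    while ($author !~ /\S+\@\S+/
        && defined($lines->[$i + 1])
        && $lines->[$i + 1] =~ /^\s+(.*)/) {
        $author .= " $1";
        $i++;
    }

    # Decode any RFC 2047 encoded words only after unfolding.
    return decode("MIME-Header", $author);
}

# Usage: a From: header folded after a long encoded name.
my @mail = (
    'From: =?UTF-8?Q?J=C3=B6rg?=',
    ' <joerg@example.com>',
    'Subject: [PATCH] example',
);
my $author = collect_from_header(\@mail, 0);
binmode(STDOUT, ':encoding(UTF-8)');
print "$author\n";
```

Checking the accumulated string rather than each raw line means a
From: line that already carries the address never consumes the
following line, whatever it contains.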