On Wed, 2020-10-21 at 20:31 +0530, Aditya Srivastava wrote:
> Presence of hexadecimal address or symbol results in false warning
> message by checkpatch.pl.
>
> For example, running checkpatch on commit b8ad540dd4e4 ("mptcp: fix
> memory leak in mptcp_subflow_create_socket()") results in warning:
>
> WARNING:REPEATED_WORD: Possible repeated word: 'ff'
> 00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff  ........./0.....

Right.

> To avoid all such reports, add an additional regex check for a repeating
> pattern of 4 or more 2-lettered words separated by space in a line.
> A quick evaluation on v5.6..v5.8 showed that this fix reduces
> REPEATED_WORD warnings from 2797 to 1043.

Are many of the other 1043 false positives?
Any pattern to them?

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3050,8 +3050,10 @@ sub process {
>  			}
>  		}
>
> -# check for repeated words separated by a single space
> -		if ($rawline =~ /^\+/ || $in_commit_log) {
> +# check for repeated words separated by a single space and
> +# avoid repeating hex occurrences like 'ff ff fe 09 ...'
> +		if (($rawline =~ /^\+/ || $in_commit_log) &&
> +		    $rawline !~ /(\b[0-9a-f]{2}( )+){4,}/) {

This might be better as \b$Hex to avoid FF FF and FFFFFFFF FFFFFFFF.

I might add that check to the line below, where the repeated
words are checked against 'long'.
---
 scripts/checkpatch.pl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index fab38b493cef..929866999f81 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3062,6 +3062,7 @@ sub process {
 				next if ($first ne $second);
 				next if ($first eq 'long');
+				next if ($first =~ /^$Hex$/);
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
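
To illustrate the concern about FF FF and FFFFFFFF FFFFFFFF, here is a rough sketch in Python (the regex syntax is the same as in the Perl patch). It runs the line-level pattern from the quoted hunk against a few sample lines and compares it with a per-word test. BARE_HEX_WORD, line_is_skipped() and word_is_hex() are made up for this example; in particular BARE_HEX_WORD is an assumed stand-in and does not claim to reproduce checkpatch's $Hex.

```python
import re

# Pattern copied from the quoted hunk: four or more two-character
# lowercase hex words, each followed by one or more spaces.
HEX_DUMP_RUN = re.compile(r"(\b[0-9a-f]{2}( )+){4,}")

# Hypothetical per-word test (an assumption for illustration only,
# NOT checkpatch's $Hex): the whole word consists of hex digits.
BARE_HEX_WORD = re.compile(r"^[0-9a-fA-F]+$")


def line_is_skipped(line):
    """True if the line-level pattern would suppress the check."""
    return HEX_DUMP_RUN.search(line) is not None


def word_is_hex(word):
    """True if the repeated word itself looks like bare hex."""
    return BARE_HEX_WORD.match(word) is not None


dump = "00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff  ........./0....."

print(line_is_skipped(dump))                 # lowercase byte dump
print(line_is_skipped("FF FF FF FF FF FF"))  # uppercase bytes
print(line_is_skipped("FFFFFFFF FFFFFFFF"))  # two 8-digit words
print(word_is_hex("ff"), word_is_hex("FFFFFFFF"), word_is_hex("long"))
```

The line-level pattern only catches the lowercase byte dump, while a per-word test also covers the uppercase and 8-digit cases. One caveat with a bare hex test like the one assumed here: ordinary words made up only of the letters a-f (for example 'add') would pass it too, so a genuine 'add add' would no longer be reported.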