This patch hard-codes the words in a text file instead of using a
dynamic word list. The inclusive naming project provides a way to
download its word list as JSON, but this is avoided here for a few
reasons:

* Only the base form of each word is provided. We would need to analyze
  the part of speech and try to generate the other forms of the words.
* Some entries are more illustrative than literal. For instance, the
  list provides "blackhat-whitehat" as a single entry, even though you
  are much more likely to come across the individual words than this
  specific hyphenated variety. Similarly, "whitelist" is provided in the
  word list but "blacklist" is not.
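
To illustrate the first point, here is a minimal sketch (not part of the
patch; the helper name find_listed_words and the sample lists are
hypothetical) of a whole-word, case-insensitive search similar to the
one used by the new check. A list holding only the base form misses
inflected forms, which is why each form is spelled out in the text file.

```python
import re


def find_listed_words(words, line):
    """Return the listed words that occur in `line` as whole words."""
    return [w for w in words
            if re.search(rf'\b{re.escape(w)}\b', line, flags=re.IGNORECASE)]


line = "/* This change is crippling our performance */"
base_only = ["cripple"]
all_forms = ["cripple", "cripples", "crippling", "crippled"]

print(find_listed_words(base_only, line))  # base form alone finds nothing
print(find_listed_words(all_forms, line))  # the explicitly listed form is found
```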

If it turns out that the word list updates frequently, then it may be
worth moving to a more dynamic approach, at the expense of accuracy. For
now, a static list is simple and accurate enough.

checkpatch reports this as a warning rather than an error. There are
some cases where non-inclusive words are acceptable (such as
ovs_abort(), or referring to the "master" branch of a third-party repo).
This is why the warning only suggests considering an alternative.
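
As a side illustration of how a whole-word pattern treats identifiers
and punctuation, the sketch below (not part of the patch; flags_word is
a hypothetical helper) uses Python's \b word boundary. Because "_"
counts as a word character, a bare abort() call would be reported by a
pattern of this form while a prefixed identifier such as ovs_abort()
would not.

```python
import re


def flags_word(word, line):
    # Hypothetical helper: whole-word, case-insensitive search.
    pattern = rf'\b{re.escape(word)}\b'
    return re.search(pattern, line, flags=re.IGNORECASE) is not None


print(flags_word("abort", "abort();"))              # bare call: reported
print(flags_word("abort", "ovs_abort();"))          # "_" is a word character: not reported
print(flags_word("master", "Set up master/slave"))  # "/" acts as a boundary: reported
```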

On a side note, running this patch through the updated checkpatch.py is
pretty funny.

Signed-off-by: Mark Michelson <mmich...@redhat.com>
---
 tests/checkpatch.at              | 38 ++++++++++++++++++++
 utilities/checkpatch.py          | 30 ++++++++++++++++
 utilities/excluded_word_list.txt | 59 ++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 utilities/excluded_word_list.txt

diff --git a/tests/checkpatch.at b/tests/checkpatch.at
index 6ac0e51f3..79c02b229 100755
--- a/tests/checkpatch.at
+++ b/tests/checkpatch.at
@@ -642,5 +642,43 @@ try_checkpatch \
     #8 FILE: tests/something.at:1:
     C_H_E_C_K([as gw1 ovs-ofctl dump-flows br-int table=42 | grep "dl_dst=00:00:02:01:02:04" | wc -l], [0], [[1]])
 "
+AT_CLEANUP
+
+AT_SETUP([checkpatch - non-inclusive words])
+# This test does not extensively test every single word in the list of
+# non-inclusive words.
+
+# Try a simple exact match with a single word
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Let's be sure this change doesn't cripple our performance */
+    " \
+    "WARNING: Non-inclusive word 'cripple' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Let's be sure this change doesn't cripple our performance */
+"
 
+# Put more than one word on the line, and use different forms of the word.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* The grandfathers are hallucinating again */
+    " \
+    "WARNING: Non-inclusive word 'grandfathers' found. Consider replacing.
+    WARNING: Non-inclusive word 'hallucinating' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* The grandfathers are hallucinating again */
+"
+
+# And finally, make sure punctuation, etc. don't interfere.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Set up master/slave tribe, but don't abort! */
+    " \
+    "WARNING: Non-inclusive word 'abort' found. Consider replacing.
+    WARNING: Non-inclusive word 'master' found. Consider replacing.
+    WARNING: Non-inclusive word 'slave' found. Consider replacing.
+    WARNING: Non-inclusive word 'tribe' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Set up master/slave tribe, but don't abort! */
+"
 AT_CLEANUP
diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index 35204daa2..9a06cf0a1 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -19,6 +19,8 @@ import getopt
 import os
 import re
 import sys
+import functools
+from pathlib import Path
 
 RETURN_CHECK_INITIAL_STATE = 0
 RETURN_CHECK_STATE_WITH_RETURN = 1
@@ -582,6 +584,32 @@ def empty_return_with_brace(line):
     return False
 
 
+@functools.cache
+def load_excluded_words():
+    parent_dir = Path(__file__).parent
+    with open(parent_dir / "excluded_word_list.txt", "r") as f:
+        return [line.strip() for line in f if line.strip()]
+
+
+def contains_non_inclusive_words(line):
+    # This returns true if a word is found that falls afoul of our inclusive
+    # language guidelines. The list of words is sourced from the Tier 1, Tier 2,
+    # and Tier 3 word lists from https://inclusivenaming.org/word-lists/ .
+
+    excluded_words = load_excluded_words()
+
+    problem_found = False
+    for word in excluded_words:
+        match = re.search(rf'\b{re.escape(word)}\b', line, flags=re.IGNORECASE)
+        if match:
+            print_warning(
+                f"Non-inclusive word '{word}' found. Consider replacing."
+            )
+            problem_found = True
+
+    return problem_found
+
+
 file_checks = [
         {'regex': __regex_added_doc_rst,
          'check': check_new_docs_index},
@@ -668,6 +696,8 @@ checks = [
          lambda: print_warning("Use of hardcoded table=<NUMBER> or"
                                " resubmit=(,<NUMBER>) is discouraged in tests."
                                " Consider using MACRO instead.")},
+    {'regex': None, 'match_name': None,
+     'check': lambda x: contains_non_inclusive_words(x)},
 ]
 
 
diff --git a/utilities/excluded_word_list.txt b/utilities/excluded_word_list.txt
new file mode 100644
index 000000000..7a2ce4e09
--- /dev/null
+++ b/utilities/excluded_word_list.txt
@@ -0,0 +1,59 @@
+abort
+aborts
+aborting
+aborted
+abortion
+blackhat
+blackhats
+whitehat
+whitehats
+cripple
+cripples
+crippling
+cripplingly
+crippled
+grandfather
+grandfathers
+grandfathered
+grandfathering
+master
+masters
+slave
+slaves
+slaved
+slaving
+slavery
+slavish
+slavishly
+tribe
+tribes
+tribal
+tribally
+whitelist
+whitelists
+whitelisted
+whitelisting
+blacklist
+blacklists
+blacklisted
+blacklisting
+sanity-check
+sanity-checks
+sanity-checked
+sanity-checking
+blast-radius
+blast-radii
+hallucinate
+hallucinates
+hallucinated
+hallucinating
+hallucination
+man-hour
+man-hours
+man-in-the-middle
+men-in-the-middle
+segregate
+segregates
+segregated
+segregating
+segregation
-- 
2.45.2

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
