This takes the approach of hard-coding the words in a text file, rather than trying to use a dynamic list of words. The inclusive naming project provides a way to download their word list as JSON, but this is avoided here for a few reasons:
* Only the base form of words is provided. We would need to analyze the
  part of speech and try to generate the other forms of the words.
* Some entries are more "example" than anything. For example, they
  provide "blackhat-whitehat" as a single entry, even though you're much
  more likely to come across the individual words, rather than this
  specific hyphenated variety. Similarly, "whitelist" is provided in the
  word list but "blacklist" is not.

If it turns out that the word list updates frequently, then it may be
worth moving to a more dynamic approach, at the expense of accuracy. For
now, this seems like a nice approach.

We do not consider this an error in checkpatch, but a warning. There are
some cases where non-inclusive words are acceptable (such as
ovs_abort(), or referring to the "master" branch of a third-party repo).
This is why the warning only suggests considering an alternative.

On a side note, running this patch through the updated checkpatch.py is
pretty funny.

Signed-off-by: Mark Michelson <mmich...@redhat.com>
---
 tests/checkpatch.at              | 38 ++++++++++++++++++++
 utilities/checkpatch.py          | 30 ++++++++++++++++
 utilities/excluded_word_list.txt | 59 ++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 utilities/excluded_word_list.txt

diff --git a/tests/checkpatch.at b/tests/checkpatch.at
index 6ac0e51f3..79c02b229 100755
--- a/tests/checkpatch.at
+++ b/tests/checkpatch.at
@@ -642,5 +642,43 @@ try_checkpatch \
     #8 FILE: tests/something.at:1:
     C_H_E_C_K([as gw1 ovs-ofctl dump-flows br-int table=42 | grep "dl_dst=00:00:02:01:02:04" | wc -l], [0], [[1]])
 "
+AT_CLEANUP
+
+AT_SETUP([checkpatch - non-inclusive words])
+# This test does not extensively test every single word in the list of
+# non-inclusive words.
+
+# Try a simple exact match with a single word
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Let's be sure this change doesn't cripple our performance */
+    " \
+    "WARNING: Non-inclusive word 'cripple' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Let's be sure this change doesn't cripple our performance */
+"
+# Put more than one word on the line, and use different forms of the word.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* The grandfathers are hallucinating again */
+    " \
+    "WARNING: Non-inclusive word 'grandfathers' found. Consider replacing.
+    WARNING: Non-inclusive word 'hallucinating' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* The grandfathers are hallucinating again */
+"
+
+# And finally, make sure punctuation, etc. don't interfere.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Set up master/slave tribe, but don't abort! */
+    " \
+    "WARNING: Non-inclusive word 'abort' found. Consider replacing.
+    WARNING: Non-inclusive word 'master' found. Consider replacing.
+    WARNING: Non-inclusive word 'slave' found. Consider replacing.
+    WARNING: Non-inclusive word 'tribe' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Set up master/slave tribe, but don't abort! */
+"
 AT_CLEANUP
diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index 35204daa2..9a06cf0a1 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -19,6 +19,8 @@ import getopt
 import os
 import re
 import sys
+import functools
+from pathlib import Path

 RETURN_CHECK_INITIAL_STATE = 0
 RETURN_CHECK_STATE_WITH_RETURN = 1
@@ -582,6 +584,32 @@ def empty_return_with_brace(line):
     return False


+@functools.cache
+def load_excluded_words():
+    parent_dir = Path(__file__).parent
+    with open(parent_dir / "excluded_word_list.txt", "r") as f:
+        return [line.strip() for line in f]
+
+
+def contains_non_inclusive_words(line):
+    # This returns true if a word is found that falls afoul of our inclusive
+    # language guidelines. The list of words is sourced from the Tier 1, Tier 2,
+    # and Tier 3 word lists from https://inclusivenaming.org/word-lists/ .
+
+    excluded_words = load_excluded_words()
+
+    problem_found = False
+    for word in excluded_words:
+        match = re.search(rf'\b{word}\b', line, flags=re.IGNORECASE)
+        if match:
+            print_warning(
+                f"Non-inclusive word '{word}' found. Consider replacing."
+            )
+            problem_found = True
+
+    return problem_found
+
+
 file_checks = [
     {'regex': __regex_added_doc_rst,
      'check': check_new_docs_index},
@@ -668,6 +696,8 @@ checks = [
      lambda: print_warning("Use of hardcoded table=<NUMBER> or"
                            " resubmit=(,<NUMBER>) is discouraged in tests."
                            " Consider using MACRO instead.")},
+    {'regex': None, 'match_name': None,
+     'check': lambda x: contains_non_inclusive_words(x)},
 ]


diff --git a/utilities/excluded_word_list.txt b/utilities/excluded_word_list.txt
new file mode 100644
index 000000000..7a2ce4e09
--- /dev/null
+++ b/utilities/excluded_word_list.txt
@@ -0,0 +1,59 @@
+abort
+aborts
+aborting
+aborted
+abortion
+blackhat
+blackhats
+whitehat
+whitehats
+cripple
+cripples
+crippling
+cripplingly
+crippled
+grandfather
+grandfathers
+grandfathered
+grandfathering
+master
+masters
+slave
+slaves
+slaved
+slaving
+slavery
+slavish
+slavishly
+tribe
+tribes
+tribal
+tribally
+whitelist
+whitelists
+whitelisted
+whitelisting
+blacklist
+blacklists
+blacklisted
+blacklisting
+sanity-check
+sanity-checks
+sanity-checked
+sanity-checking
+blast-radius
+blast-radii
+hallucinate
+hallucinates
+hallucinated
+hallucinating
+hallucination
+man-hour
+man-hours
+man-in-the-middle
+men-in-the-middle
+segregate
+segregates
+segregated
+segregating
+segregation
-- 
2.45.2

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
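
For anyone who wants to try the word-boundary matching outside of
checkpatch.py, here is a minimal standalone sketch. It is not part of the
patch: the function name find_non_inclusive_words and the in-memory
EXCLUDED_WORDS subset are hypothetical stand-ins for
contains_non_inclusive_words() and excluded_word_list.txt, and it returns
the matches instead of calling print_warning(). It also wraps each word
in re.escape(), which the patch itself does not do.

```python
import re

# Hypothetical subset of utilities/excluded_word_list.txt, in list order.
EXCLUDED_WORDS = ["abort", "cripple", "grandfathers", "master", "slave",
                  "tribe", "sanity-check", "hallucinating"]


def find_non_inclusive_words(line, words=EXCLUDED_WORDS):
    """Return the listed words that occur in line as whole words."""
    found = []
    for word in words:
        # \b anchors the match at word boundaries, so "master" matches in
        # "master/slave" but not in "mastering". re.escape() is an
        # addition in this sketch so entries are always taken literally.
        if re.search(rf"\b{re.escape(word)}\b", line, flags=re.IGNORECASE):
            found.append(word)
    return found


if __name__ == "__main__":
    print(find_non_inclusive_words(
        "/* Set up master/slave tribe, but don't abort! */"))
```

As in the third test case of the patch, the results come out in word-list
order rather than in the order the words appear on the line.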