Roi Dayan <[email protected]> writes:

> On 12/02/2025 19:18, Aaron Conole wrote:
>> Hi Roi,
>>
>> Roi Dayan via dev <[email protected]> writes:
>>
>>> Load dictionary_code.txt in addition to the default dictionary.
>>
>> The code dictionary isn't loaded by default with codespell
>> (codespell_lib/_codespell.py)::
>>
>>   _builtin_default = "clear,rare"
>>
>> And there are some questionable conversions in that dictionary (like
>> uint to unit and stdio to studio).  I think adding the _rare dictionary
>> could make sense, but perhaps we should be more careful when adding the
>> others.
>>
>> Can you add the rationale for turning these on?  I think it's okay to
>> turn on more than one codespell dict, but we should consider the
>> individual dictionaries, too.
>
> I don't think it matters what is loaded by default or not as the script
> uses enchant and not codespell.

Yes, but the point is that the codespell authors don't think this
dictionary is a good default.

> Also don't look at the conversions as it's not being used since we don't
> use codespell. In the code below it's being stripped to take only the
> final wording and add to enchant as allowed words.
>
> I looked again also in the others and I think most of the words already in
> enchant dictionary but loading them won't harm.
> I do think we can skip the main dictionary_en-GB_to_en-US.txt for example
> as we use the enchant en_US dictionary which should be equal more or less.
> The other has more unique words which I think this is what we can say in the
> commit message.
>
> What do you think?

Yes, as you noted, most of the words are already there.  I ran through
many of the RHS spellings, and they already appear.  In fact, the only
words we are not already getting are:

* copiable
* clonable
* subpatches
* traceback
* tracebacks

That's just 5 words, and they are not universally agreed-upon
spellings.  For example, if I use something like wiktionary (not the
most authoritative source, I agree):

https://en.wiktionary.org/wiki/clonable#English

It says that 'cloneable' is an alternative form used in a computing
context.  Enchant suggests 'clone able' or 'clone-able'.

Likewise, there isn't an accepted form of copiable (and enchant behaves
similarly, including for subpatches).  So I guess 'traceback' and
'tracebacks' are the only ones for which there isn't any ambiguity.

Anyway, I guess it's okay to add, but we should probably consider
looking at all the dictionaries and seeing which ones make sense to add
as well.  Otherwise, it's quite a bit of change here for something that
could be done by just adding the words above directly (i.e., you make 7
lines of change here, vs. adding words to extra_keywords).

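For anyone who wants to repeat the comparison above, here is a rough
sketch of it.  It assumes the 'wrong->right1, right2,' line format
implied by the split in the patch.  rhs_words() and missing_words() are
hypothetical helper names, not part of checkpatch.py, and the sample
entries are made up rather than copied from the codespell data files.
The per-word strip() is an addition that the quoted hunk doesn't show.

```python
def rhs_words(line):
    """Return the suggested spellings from one 'wrong->right1, right2,' line."""
    # Same split as the checkpatch.py hunk: keep only the text after '>'.
    words = line.strip().split('>')[1].strip(', ').split(',')
    # Assumption: trim each word, since entries are separated by ', '.
    return [w.strip() for w in words if w.strip()]


def missing_words(lines, is_known):
    """Collect the suggested spellings that is_known() rejects."""
    missing = set()
    for line in lines:
        if '>' not in line:
            continue  # skip blank or malformed lines
        for word in rhs_words(line):
            if not is_known(word):
                missing.add(word)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical sample entries and a stand-in for the spell checker.
    sample = ["cloneable->clonable", "tracback->traceback, tracebacks,"]
    known = {"traceback"}
    print(missing_words(sample, lambda w: w in known))
```

With pyenchant installed, passing enchant.Dict("en_US").check as
is_known should approximate the check against the real dictionary.
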
> Files I see in codespell path:
>
> dictionary_code.txt
> dictionary_en-GB_to_en-US.txt
> dictionary_informal.txt
> dictionary_names.txt
> dictionary_rare.txt
> dictionary.txt
> dictionary_usage.txt
>
>
>>
>>> Signed-off-by: Roi Dayan <[email protected]>
>>> Acked-by: Salem Sol <[email protected]>
>>> ---
>>>  utilities/checkpatch.py | 14 ++++++++------
>>>  1 file changed, 8 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
>>> index f8caeb811604..9571380c291f 100755
>>> --- a/utilities/checkpatch.py
>>> +++ b/utilities/checkpatch.py
>>> @@ -42,14 +42,16 @@ missing_authors = []
>>>  def open_spell_check_dict():
>>>      import enchant
>>>
>>> +    codespell_files = []
>>>      try:
>>>          import codespell_lib
>>>          codespell_dir = os.path.dirname(codespell_lib.__file__)
>>> -        codespell_file = os.path.join(codespell_dir, 'data', 'dictionary.txt')
>>> -        if not os.path.exists(codespell_file):
>>> -            codespell_file = ''
>>> +        for fn in ['dictionary.txt', 'dictionary_code.txt']:
>>> +            fn = os.path.join(codespell_dir, 'data', fn)
>>> +            if os.path.exists(fn):
>>> +                codespell_files.append(fn)
>>>      except:
>>> -        codespell_file = ''
>>> +        pass
>>>
>>>      try:
>>>          extra_keywords = ['ovs', 'vswitch', 'vswitchd', 'ovs-vswitchd',
>>> @@ -121,8 +123,8 @@ def open_spell_check_dict():
>>>
>>>      spell_check_dict = enchant.Dict("en_US")
>>>
>>> -    if codespell_file:
>>> -        with open(codespell_file) as f:
>>> +    for fn in codespell_files:
>>> +        with open(fn) as f:
>>>              for line in f.readlines():
>>>                  words = line.strip().split('>')[1].strip(', ').split(',')
>>>                  for word in words:
>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
