Hi again,

Axel Beckert wrote:
> > $ grep -v ^# /usr/share/lintian/data/spelling/corrections | cut -d '|' -f 1
> > | while read word ; do grep "^$word\$" /usr/share/dict/american-english
> > /usr/share/dict/british-english ; done
>
> Thanks for figuring out this nice little command! I will, though, try to
> optimize it to not call grep for each word but use something like:
>
> grep -Fw -f <(grep -v '^#' /usr/share/lintian/data/spelling/corrections |
> cut -d '|' -f 1) /usr/share/dict/american-english
> /usr/share/dict/british-english
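
For reference, here is a minimal sketch of the same single-grep idea as a small shell function. It is not what Lintian does; the function name dict_hits is made up for illustration, and I swapped -w for -x so that a pattern has to match a whole dictionary line (with -w, a pattern like "it" would probably also hit an entry such as "it's"). It also assumes a grep that accepts "-f -" to read patterns from stdin, as GNU grep does.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print every "misspelling" from a corrections
# file that is itself an entry in one of the given word lists.
# Usage: dict_hits CORRECTIONS_FILE WORDLIST...
dict_hits() {
    corrections=$1
    shift
    # Drop comment and empty lines, keep the first '|'-separated field,
    # then feed the result to a single grep as fixed strings (-F) that
    # must match an entire line (-x). -h suppresses file name prefixes.
    grep -v -e '^#' -e '^$' "$corrections" | cut -d '|' -f 1 |
        grep -h -Fx -f - "$@" | sort -u
}
```

It would be called along the lines of:

dict_hits /usr/share/lintian/data/spelling/corrections /usr/share/dict/american-english /usr/share/dict/british-english
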
In the end this will probably be implemented in Perl instead, as there
are similar checks in t/scripts/spellintian.t already.

> I now wonder if we should use wamerican/wbritish or
> wamerican-insane/wbritish-insane for that. Maybe wamerican/wbritish is
> a good start and if we still get too many false positives, we can
> extend it to use wamerican-insane/wbritish-insane. (The latter will
> probably also take longer. But then again with my optimized query
> above it also just takes less than a second on a 7 year old laptop.
> And it yields about 350 hits.)

Some more points on this question: t/scripts/spellintian.t already has
(only) two checks for rare but valid words, so that they don't get
added again, namely "iff" and "publically". Neither of these words is
in /usr/share/dict/*-english, but both are in
/usr/share/dict/*-english-insane.

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <a...@debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE