Hi again,

Axel Beckert wrote:
> > $ grep -v ^# /usr/share/lintian/data/spelling/corrections | cut -d '|' -f 1 
> > | while read word ; do grep "^$word\$" /usr/share/dict/american-english 
> > /usr/share/dict/british-english ; done
> 
> Thanks for figuring out this nice little command! I will try, though, to
> optimize it so that it does not call grep for each word but uses something like:
> 
>   grep -Fw -f <(grep -v '^#' /usr/share/lintian/data/spelling/corrections | 
> cut -d '|' -f 1) /usr/share/dict/american-english 
> /usr/share/dict/british-english

In the end this will probably be implemented in Perl instead, as there
are similar checks in t/scripts/spellintian.t already.
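
Until then, here is a rough shell sketch of the same idea, not the
eventual Perl check. The function name find_valid_words is made up for
illustration. It uses -x instead of -w, on the assumption that only
whole-line matches are wanted (as in the original "^$word\$" loop); with
-w a pattern like "iff" would also match a dictionary line such as
"iff's". Reading the patterns via "-f -" assumes GNU grep.

```shell
# Hypothetical helper: print every misspelling key from a lintian-style
# corrections file that also appears verbatim as a line in one of the
# given word lists.
# Usage: find_valid_words CORRECTIONS WORDLIST...
find_valid_words() {
    corrections=$1
    shift
    # Drop comment lines and keep the field before the first '|'.
    # The result is fed to grep as a list of fixed strings (-F) that
    # must match a whole line (-x); -h suppresses file name prefixes.
    grep -v '^#' "$corrections" | cut -d '|' -f 1 |
        grep -Fxh -f - "$@" | sort -u
}
```

It would be called along the lines of:

  find_valid_words /usr/share/lintian/data/spelling/corrections \
      /usr/share/dict/american-english /usr/share/dict/british-english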

> I now wonder if we should use wamerican/wbritish or
> wamerican-insane/wbritish-insane for that. Maybe wamerican/wbritish is
> a good start and if we still get too many false positives, we can
> extend it to use wamerican-insane/wbritish-insane. (The latter will
> probably also take longer. But then again with my optimized query
> above it also just takes less than a second on a 7-year-old laptop.
> And it yields about 350 hits.)

Some more points on this question:

t/scripts/spellintian.t already has (only) two checks for rare but valid
words, so that they don't get added to the corrections again, namely
"iff" and "publically".

Neither of these words is in /usr/share/dict/*-english, but both are in
/usr/share/dict/*-english-insane.
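
A quick way to check this kind of thing for a single word might look
like the following. This is only a sketch; lists_containing is a
made-up name, and it assumes one word per line in each word list.

```shell
# Hypothetical helper: list the word lists that contain WORD as a full line.
# Usage: lists_containing WORD WORDLIST...
lists_containing() {
    word=$1
    shift
    # -F: fixed string, -x: whole-line match, -l: print only file names.
    grep -Fxl -e "$word" -- "$@"
}
```

For example:

  lists_containing publically /usr/share/dict/*-english*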

                Regards, Axel
-- 
 ,''`.  |  Axel Beckert <a...@debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
