Dear list,

I have a huge list [1] of 70,000 (potential) false positives in the latest de_DE_frami dictionary for Hunspell. Among other things, it is based on the first 10,000 (or so) entries of a machine-generated list that Ruud Baars sent me a few months ago.

I think this is a valuable resource, but its size makes it difficult to handle. In particular, there is no guarantee that there aren't some true positives hidden somewhere, and I don't know how to filter out words that are very rare and would only clutter the dictionary.

Any ideas on how this could be made useful for improving German Hunspell support with reasonable effort?

--
Jan

[1] https://sourceforge.net/p/germandict/code/HEAD/tree/hunspell_words.txt

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel