In case anyone's interested in the exported plain text file, it is here: http://sourceforge.net/projects/germandict/files/Morfologik/de_frequency.7z
I sorted the words by frequency class and additionally sorted the largest class, "A" (the least frequent words), by word length. The frequency distribution for the first 200,000 words looks fairly plausible, but the vast majority (about 1.4 million word forms) is lumped together in one huge class.

Ruud, you said you have larger frequency data sets available for most of the languages. If you happen to have data for German, I would love to have it, ideally in the gaia format so I don't have to bother converting it. But a tab-separated list or something like that would also be great.

--Jan

On 12.10.2014 18:18, Jan Schreiber wrote:
> I figured out how to dump the dictionary. All I had to do was create a
> hunspell subfolder and move the binary dictionary into it, then the
> exporting process worked as advertised.
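
For anyone who wants to reproduce the ordering described above, here is a minimal sketch in Python. It is not the script that produced the file. It assumes each entry is a (word, frequency class) pair with a single-letter class where "A" is the least frequent class, and it assumes the more frequent classes should come first; the function name `sort_entries` and the sample words are made up for illustration.

```python
def sort_entries(entries, lowest_class="A"):
    """Order (word, freq_class) pairs by frequency class, then sort the
    lowest class by word length.

    Assumptions (not taken from the exported file): freq_class is a single
    letter, "A" is the least frequent class, and higher letters sort first.
    """
    def key(entry):
        word, freq_class = entry
        # Only the lowest class gets the secondary length ordering;
        # all other classes keep their input order (sorted() is stable).
        length = len(word) if freq_class == lowest_class else 0
        return (-ord(freq_class), length)

    return sorted(entries, key=key)


# Hypothetical sample data, for illustration only.
sample = [
    ("Donaudampfschiff", "A"),
    ("und", "Z"),
    ("Haus", "M"),
    ("Ei", "A"),
    ("der", "Z"),
]
ordered = sort_entries(sample)
```

Writing the result out one entry per line as `word<TAB>class` would give the kind of tab-separated list mentioned above.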