In case anyone's interested in the exported plain text file, it is here:
http://sourceforge.net/projects/germandict/files/Morfologik/de_frequency.7z

I sorted the words by frequency class. Within class "A", which is the
largest class and holds the least frequent words, I additionally sorted
by word length.
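
For anyone who wants to reproduce that ordering, here is a minimal sketch
in Python. It is not the script I actually used. It assumes each entry is a
(word, class) pair with a single-letter frequency class, that later letters
mean higher frequency, and that the most frequent classes should come
first; the function name and the sample entries are made up for
illustration.

```python
def sort_entries(entries, bulk_class="A"):
    """Order (word, freq_class) pairs by class, then by length inside bulk_class.

    Assumption: classes are single letters and later letters mean more
    frequent words, so they are listed first. Only the bulk class is
    sub-sorted by word length; other classes keep their input order
    because sorted() is stable.
    """
    def key(entry):
        word, freq_class = entry
        length = len(word) if freq_class == bulk_class else 0
        return (-ord(freq_class), length)

    return sorted(entries, key=key)


# Hypothetical sample data, not taken from the exported file.
entries = [("Haus", "K"), ("Donaudampfschiff", "A"), ("und", "Z"), ("Abt", "A")]
for word, freq_class in sort_entries(entries):
    print(f"{word}\t{freq_class}")
```

Flip the sign on the class key if you would rather have the rare words at
the top.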

The frequency distribution for the first 200,000 words looks fairly
plausible, but the vast majority (about 1.4 million word forms) is
lumped together in one huge class.
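
A quick way to check how lopsided the classes are is to count the entries
per class. The sketch below assumes the same hypothetical (word, class)
pairs as input; the helper names are mine and the sample data is invented,
so the numbers say nothing about the real file.

```python
from collections import Counter


def class_sizes(entries):
    """Count how many word forms fall into each frequency class."""
    return Counter(freq_class for _, freq_class in entries)


def largest_class_share(entries):
    """Return the biggest class and the fraction of all entries it holds."""
    sizes = class_sizes(entries)
    freq_class, count = sizes.most_common(1)[0]
    return freq_class, count / len(entries)


# Hypothetical sample data, not taken from the exported file.
sample = [("und", "Z"), ("Haus", "K"), ("Abt", "A"), ("Donaudampfschiff", "A")]
print(class_sizes(sample))
print(largest_class_share(sample))
```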

Ruud, you said you have larger frequency data sets available for most of
the languages. If you happen to have data for German, I would love to
have it, ideally in the gaia format so that I don't have to bother
converting it. A tab-separated list or something similar would also be
great.

--Jan

On 12.10.2014 18:18, Jan Schreiber wrote:
> I figured out how to dump the dictionary. All I had to do was create a
> hunspell subfolder and move the binary dictionary into it, then the
> exporting process worked as advertised.

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
