Hello,

> Here is what I would like to do: generate reports which give, for a
> given language, a list of words which are used on the web, each with a
> count of its occurrences, but which are not in a given Wiktionary.
>
> How would you recommend implementing that within the Wikimedia
> infrastructure?
>

Related: the French Wiktionary folks did that using a Wikisource dump
(I’ll agree that fr.wikisource is a tiny subset of « the web » ;)

See <http://tools.wmflabs.org/dicompte/>
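
For what it's worth, the general idea (count the words in a corpus, then
drop the ones that already have a Wiktionary entry) can be sketched in a
few lines of Python. This is only a rough illustration, not how the tool
above is actually implemented: the function name missing_words, the
min_count parameter and the very naive tokenizer are all assumptions of
mine, and the corpus lines and entry titles would have to be extracted
from the dumps beforehand.

```python
import re
from collections import Counter

# Naive tokenizer (an assumption): runs of Unicode letters, optionally
# joined by hyphens. Apostrophes split words, so "l'eau" -> "l", "eau".
WORD_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def missing_words(corpus_lines, known_titles, min_count=2):
    """Return (word, count) pairs for corpus words absent from known_titles.

    corpus_lines: iterable of text lines (e.g. extracted from a dump).
    known_titles: iterable of existing dictionary entry titles.
    min_count:    ignore words seen fewer times than this.
    """
    known = {title.casefold() for title in known_titles}
    counts = Counter()
    for line in corpus_lines:
        for word in WORD_RE.findall(line):
            counts[word.casefold()] += 1
    missing = [
        (word, n)
        for word, n in counts.items()
        if n >= min_count and word not in known
    ]
    # Most frequent first, ties broken alphabetically.
    missing.sort(key=lambda pair: (-pair[1], pair[0]))
    return missing
```

Case folding is a simplification here: Wiktionary titles are
case-sensitive, so a real report would probably want to handle
capitalised forms more carefully.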

Hope that helps,
-- 
Jean-Frédéric
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
