> Hello,
>
> Here is what I would like to do: generate reports that give, for a
> given language, a list of words that are used on the web, each with a
> count of its occurrences, but that are not in a given Wiktionary.
>
> How would you recommend implementing that within the Wikimedia
> infrastructure?
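For illustration only, the core of such a report can be sketched in a few lines of Python: count word frequencies in some text corpus, then drop every word that already has an entry. Everything named here is an assumption made for the sketch, not an existing tool: the function `missing_words`, the tokenising regex, and the idea that the corpus arrives as an iterable of text lines and the Wiktionary entries as an iterable of page titles (for example, read from a dump). Lowercasing both sides is a simplification, since Wiktionary titles are case-sensitive.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of Unicode letters, optionally joined by
# an apostrophe or hyphen. Real tokenisation is language-specific.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def missing_words(corpus_lines, known_titles, min_count=1):
    """Return (word, count) pairs for corpus words absent from known_titles.

    corpus_lines: iterable of strings (hypothetical corpus input).
    known_titles: iterable of existing entry titles (hypothetical input).
    min_count:    ignore words seen fewer than this many times.
    """
    counts = Counter()
    for line in corpus_lines:
        counts.update(w.lower() for w in WORD_RE.findall(line))

    # Simplification: compare case-insensitively.
    known = {t.lower() for t in known_titles}

    # most_common() yields words from most to least frequent.
    return [
        (word, n)
        for word, n in counts.most_common()
        if n >= min_count and word not in known
    ]


if __name__ == "__main__":
    corpus = ["Le chat mange le poisson", "le chien mange"]
    titles = ["le", "chat"]
    for word, n in missing_words(corpus, titles):
        print(f"{n}\t{word}")
```

For a real corpus the counting step would probably need to be streamed or sharded, but the filtering logic stays the same.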
Related: the French Wiktionary folks did that using a Wikisource dump (I’ll agree that fr.wikisource is a tiny subset of « the web » ;)

See <http://tools.wmflabs.org/dicompte/>

Hope that helps,

-- 
Jean-Frédéric
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l