Wiki Research Junkies,

I am investigating the comparative quality of articles about Cote d'Ivoire and 
Uganda versus other countries. I want to answer the question: what makes a 
high-quality article? Can anyone point me to existing research on 
heuristics for article quality, that is, determining an article's quality from 
its wikitext properties, without human rating? I would also consider using data 
from the Article Feedback Tools, if dumps were available for each article 
in the English, French, and Swahili Wikipedias. This is all the raw data I can 
seem to find: http://toolserver.org/~dartar/aft5/dumps/

The heuristic technique that I am currently using is training a naive Bayesian 
filter based on:

  *   Per Section.

     *   Text length in each section

     *   Infoboxes in each section.

        *   Filled parameters in each infobox

     *   Images in each section

  *   Whether the article has Good Article or Featured Article status

  *   Then normalize by page views per population / speakers of the native 
language
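
For concreteness, here is a rough sketch of how the per-section features above 
might be pulled out of raw wikitext. It is not my actual code: the function 
names, the regexes, and the sample text are all made up for illustration. It 
assumes infobox parameters are written one per line, treats "text length" as 
the raw wikitext length (markup included), and would double-count nested 
infoboxes.

```python
import re

# Hypothetical patterns; real wikitext has many more edge cases.
SECTION_RE = re.compile(r"^==+[ \t]*(.*?)[ \t]*==+[ \t]*$", re.MULTILINE)
IMAGE_RE = re.compile(r"\[\[(?:File|Image):", re.IGNORECASE)
INFOBOX_RE = re.compile(r"\{\{\s*Infobox", re.IGNORECASE)
# A parameter line counts as "filled" if something non-blank follows the "=".
FILLED_PARAM_RE = re.compile(r"^[ \t]*\|[ \t]*[^=|\n]+=[ \t]*\S", re.MULTILINE)


def split_sections(wikitext):
    """Return (title, body) pairs; text before the first heading is 'LEAD'."""
    # re.split with one capture group yields [lead, title1, body1, ...].
    parts = SECTION_RE.split(wikitext)
    sections = [("LEAD", parts[0])]
    for i in range(1, len(parts), 2):
        sections.append((parts[i], parts[i + 1]))
    return sections


def infobox_bodies(text):
    """Return the full text of each {{Infobox ...}} call, matching braces."""
    bodies = []
    for match in INFOBOX_RE.finditer(text):
        depth, i = 0, match.start()
        while i < len(text):
            if text.startswith("{{", i):
                depth += 1
                i += 2
            elif text.startswith("}}", i):
                depth -= 1
                i += 2
                if depth == 0:
                    bodies.append(text[match.start():i])
                    break
            else:
                i += 1
    return bodies


def section_features(wikitext):
    """Compute one feature dict per section."""
    rows = []
    for title, body in split_sections(wikitext):
        boxes = infobox_bodies(body)
        rows.append({
            "section": title,
            "text_length": len(body.strip()),
            "infoboxes": len(boxes),
            "filled_infobox_params": sum(
                len(FILLED_PARAM_RE.findall(box)) for box in boxes
            ),
            "images": len(IMAGE_RE.findall(body)),
        })
    return rows


# Made-up sample article, for illustration only.
SAMPLE = """{{Infobox country
| name = Uganda
| capital = Kampala
| anthem =
}}
Uganda is a country in East Africa.

== History ==
[[File:Example.jpg|thumb|Caption]]
Some history text.
"""

if __name__ == "__main__":
    for row in section_features(SAMPLE):
        print(row)
```

Each dict could then be turned into a feature vector for whatever classifier 
is being trained; the Good/Featured status and the page-view normalization 
would have to come from other sources, since they are not in the wikitext body.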

Can you also think of any other dimensions or heuristics to rate 
programmatically?


Best,

Maximilian Klein
Wikipedian in Residence, OCLC
+17074787023
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
