Here is my feedback based on looking at a few pages on topics that I know very 
well.

 

Agile Software Development

·        http://wiki-trust.cse.ucsc.edu/index.php/Agile_software_development

·        Not bad. I counted 13 highlighted items, 5 of which I would say are 
questionable.

 

Usability

·        http://wiki-trust.cse.ucsc.edu/index.php/Usability

·        Not as good. 14 highlighted items, 3 of which I would say are 
questionable.

 

Open Source Software

·        http://wiki-trust.cse.ucsc.edu/index.php/Open_source_software

·        Not so good either. 23 highlighted items, 3 of which I would say are 
questionable.

 

This is a very small sample, but it's all I have time to do. It will be 
interesting to see how other people rate the precision of the highlightings on 
a wider set of topics. Based on these three examples, it's not entirely clear 
to me that this system would help me identify questionable items in topics that 
I am not so familiar with.
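
To make the comparison concrete, here is a rough sketch of the precision these counts imply. It assumes that "questionable" means I agree the highlighted item deserves to be flagged, so precision is simply questionable / highlighted. The dictionary and function names are my own, not anything from the WikiTrust code.

```python
# (highlighted, questionable) counts as reported above, one entry per page.
counts = {
    "Agile software development": (13, 5),
    "Usability": (14, 3),
    "Open source software": (23, 3),
}


def precision(highlighted, questionable):
    """Fraction of highlighted items that the judge agreed were questionable."""
    return questionable / highlighted


per_page = {page: precision(h, q) for page, (h, q) in counts.items()}

# Pooled precision over all three pages.
total_highlighted = sum(h for h, _ in counts.values())
total_questionable = sum(q for _, q in counts.values())
overall = precision(total_highlighted, total_questionable)

for page, p in per_page.items():
    print(f"{page}: {p:.2f}")
print(f"Overall: {overall:.2f}")
```

With only one judge and three pages, these figures are indicative at best.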

 

Are you planning to do a larger-scale evaluation with human judges? An issue in 
that kind of study is avoiding favourable or unfavourable bias on the part of 
the judges. Also, you have to make sure that your algorithm is doing better 
than random guessing (in other words, there may be so many questionable phrases 
in a wiki page that random guessing would be bound to guess right once out of 
every, say, 5 times). One way to avoid these issues would be to produce pages 
where half of the highlightings are produced by your system, and the other half 
highlight a randomly selected contiguous contribution by a single author.
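
A minimal sketch of that blinded design, under my own assumptions: each contiguous single-author contribution has an identifier, the judge sees only the shuffled list of highlighted spans, and the "system"/"random" label is kept aside until scoring. The names Highlight, build_blind_page and precision_by_source are hypothetical, for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    span_id: str  # hypothetical id of a contiguous single-author contribution
    source: str   # "system" or "random"; must be hidden from the judges


def build_blind_page(system_spans, all_spans, rng):
    """Mix the system's highlights with an equal number of random spans."""
    chosen = set(system_spans)
    pool = [s for s in all_spans if s not in chosen]
    random_spans = rng.sample(pool, k=min(len(system_spans), len(pool)))
    items = [Highlight(s, "system") for s in system_spans]
    items += [Highlight(s, "random") for s in random_spans]
    rng.shuffle(items)  # presentation order carries no hint about the source
    return items


def precision_by_source(items, judged_questionable):
    """Precision per source, given the span ids a judge marked questionable."""
    result = {}
    for source in ("system", "random"):
        group = [h for h in items if h.source == source]
        hits = sum(1 for h in group if h.span_id in judged_questionable)
        result[source] = hits / len(group) if group else float("nan")
    return result


# Toy usage with made-up span ids.
rng = random.Random(0)
all_spans = [f"s{i}" for i in range(20)]
system_spans = ["s1", "s4", "s7", "s9"]
items = build_blind_page(system_spans, all_spans, rng)
scores = precision_by_source(items, judged_questionable={"s1", "s4"})
```

The system is only doing something useful if its precision is clearly higher than the precision of the random half, judged by people who cannot tell the two apart.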

 

I think this is really interesting work worth doing, btw. I just don't know how 
useful it is in its current state.

 

Cheers,

 

Alain Désilets

 

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wiki-research-l
