Re: [Wiki-research-l] Revert detection

2011-08-22 Thread Dmitry Chichkov
Hi Aaron, Neat LimitedQueue class. It looks like this revert code wouldn't handle some corner cases. For example, I don't see logic that would distinguish between blanking (which produces duplicate checksums) and reverts. -- Best, Dmitry
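The blanking corner case can be illustrated with a small sketch. This is not the code from wikimedia-utilities; the function name `classify_revisions`, the labels, and the rule "empty text counts as blanking" are assumptions made for illustration only.

```python
import hashlib


def classify_revisions(revisions):
    """Label each (rev_id, text) pair as 'blank', 'revert', or 'edit'.

    Hypothetical heuristic: a revision whose text is empty (or only
    whitespace) is treated as blanking, even though its checksum
    matches every earlier blanked revision.
    """
    seen = set()  # checksums of non-blank content seen so far
    labels = []
    for rev_id, text in revisions:
        if not text.strip():
            # Blanked pages all hash to the same value, so a checksum
            # match here says nothing about a revert.
            labels.append((rev_id, "blank"))
            continue
        checksum = hashlib.md5(text.encode("utf-8")).hexdigest()
        if checksum in seen:
            labels.append((rev_id, "revert"))
        else:
            seen.add(checksum)
            labels.append((rev_id, "edit"))
    return labels


history = [(1, "a"), (2, ""), (3, "a"), (4, ""), (5, "a")]
labels = classify_revisions(history)
```

Here revisions 2 and 4 share a checksum but are both labelled as blanking, while revisions 3 and 5 restore earlier content and are labelled as reverts. The sketch does not separate consecutive identical saves (null edits) from reverts; that would need an extra check against the previous revision.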

Re: [Wiki-research-l] Revert detection

2011-08-21 Thread Aaron Halfaker
I've updated my dump processing Python project to include code for quickly detecting identity reverts from XML dumps. See https://bitbucket.org/halfak/wikimedia-utilities for the project and the process() function at the bottom of https://bitbucket.org/halfak/wikimedia-utilities/src/f1c8fe7224f3/wmf/d
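For readers without the repository at hand, identity revert detection can be sketched roughly as follows. This is not the process() function from the project above; the name `detect_identity_reverts`, the `(rev_id, text)` input shape, and the window size of 15 are assumptions, and a `collections.deque` stands in for a bounded queue of recent checksums.

```python
import hashlib
from collections import deque


def detect_identity_reverts(revisions, window=15):
    """Yield (reverting_id, reverted_to_id, reverted_ids) tuples.

    revisions: iterable of (rev_id, text) pairs in chronological order.
    window: how many recent revisions to remember (assumed default).
    """
    recent = deque(maxlen=window)  # (rev_id, checksum), oldest first
    for rev_id, text in revisions:
        checksum = hashlib.md5(text.encode("utf-8")).hexdigest()
        history = list(recent)
        # Search from newest to oldest for identical content.
        for i in range(len(history) - 1, -1, -1):
            if history[i][1] == checksum:
                reverted = [rid for rid, _ in history[i + 1:]]
                if reverted:  # nothing in between means a null edit
                    yield rev_id, history[i][0], reverted
                break
        recent.append((rev_id, checksum))


page = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]
reverts = list(detect_identity_reverts(page))
```

In this example revision 3 restores the content of revision 1, so revision 2 is reported as reverted. Bounding the window keeps memory constant per page, at the cost of missing reverts that reach further back than `window` revisions.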

Re: [Wiki-research-l] Revert detection

2011-08-18 Thread Dmitry Chichkov
There have been a few publications on the subject: 1. "Us vs. them: Understanding social dynamics in Wikipedia with revert graph visualizations", B Suh, EH Chi, BA Pendleton. 2. "He says, she says: Conflict and coordination in Wikipedia.", A Kittur, B Suh, BA Pendleton. From my experience I can t

[Wiki-research-l] Revert detection

2011-08-18 Thread Flöck, Fabian
Hi, I'm trying to detect reverts in Wikipedia for my research, right now with a self-built script using MD5 hashes and diffs between revisions. I always read about people taking reverts into account in their data, but it's seldom described HOW exactly a revert is determined or what tool they