Hi Aaron,
Neat LimitedQueue class. It looks like this revert code wouldn't handle
some corner cases.
For example, I don't see logic that would distinguish between blanking (which
produces duplicate checksums) and reverts.
-- Best, Dmitry
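
To make the corner case above concrete, here is a minimal sketch of checksum-based identity revert detection that treats blanking separately. It is not the code from the wikimedia-utilities project. The function name, the (rev_id, text) input shape, the `radius` parameter, and the policy of never counting a blank revision as a revert are all assumptions made for illustration, and a bounded `collections.deque` stands in for whatever the LimitedQueue class does.

```python
import hashlib
from collections import deque


def detect_identity_reverts(revisions, radius=15):
    """Yield (reverting_id, reverted_to_id, reverted_ids) tuples.

    `revisions` is an iterable of (rev_id, text) pairs for a single page,
    in chronological order. `radius` (hypothetical parameter) bounds how
    far back a revision may reach when restoring earlier content.
    """
    history = deque(maxlen=radius)  # recent (rev_id, checksum) pairs

    for rev_id, text in revisions:
        checksum = hashlib.md5(text.encode("utf-8")).hexdigest()
        is_blank = not text.strip()

        # Assumed policy: a blank revision is never reported as a revert,
        # because every blanking shares one checksum and would otherwise
        # look like a revert to the previous blanking.
        if not is_blank:
            items = list(history)
            # Search from the most recent revision backwards.
            for i in range(len(items) - 1, -1, -1):
                if items[i][1] == checksum:
                    reverted = [r for r, _ in items[i + 1:]]
                    # An empty list means the text is identical to the
                    # immediately preceding revision: no revert happened.
                    if reverted:
                        yield rev_id, items[i][0], reverted
                    break

        # Blank revisions still enter the history so that they can be
        # reported as reverted when someone restores the content.
        history.append((rev_id, checksum))


if __name__ == "__main__":
    revs = [(1, "A"), (2, "B"), (3, "A"), (4, ""), (5, "A"), (6, "")]
    for hit in detect_identity_reverts(revs):
        print(hit)
```

In the sample data, revision 6 blanks the page a second time. A plain checksum match would report it as a revert to revision 4, while this sketch skips it and still reports revision 5 as reverting the first blanking.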
On Sun, Aug 21, 2011 at 3:15 PM, Aaron Halfaker wrote:
>
I've updated my dump processing Python project to include code for quickly
detecting identity reverts from XML dumps. See
https://bitbucket.org/halfak/wikimedia-utilities for the project and the
process() function at the bottom of
https://bitbucket.org/halfak/wikimedia-utilities/src/f1c8fe7224f3/wmf/d
There have been a few publications on the subject:
1. "Us vs. them: Understanding social dynamics in Wikipedia with revert
graph visualizations", B Suh, EH Chi, BA Pendleton.
2. "He says, she says: Conflict and coordination in Wikipedia.", A Kittur, B
Suh, BA Pendleton.
From my experience I can t
Hi,
I'm trying to detect reverts in Wikipedia for my research, right now with a
self-built script using MD5 hashes and diffs between revisions. I always read
about people taking reverts into account in their data, but it's seldom
described HOW exactly a revert is determined or what tool they