I think a rough analysis of user / IP talk pages could give you a number
pretty quickly. You probably would want to do it by hand first and then
write a script that analyses the wikipedia dump file. It is doable by hand,
if you just sub-sample a few hundred pages randomly. And if normalized by a
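A minimal sketch of the random sub-sampling step mentioned above, assuming the page titles arrive as a stream of unknown length (for example while reading a dump). The function name `reservoir_sample`, the sample size of 300, and the example titles are illustrative assumptions, not something taken from the thread.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable.

    Uses reservoir sampling, so the iterable is read once and never
    has to fit in memory.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                sample[j] = item
    return sample


# Hypothetical talk page titles, standing in for titles read from a dump.
titles = ("User talk:Example%d" % n for n in range(10000))
picked = reservoir_sample(titles, 300, seed=42)
```

The sampled titles could then be reviewed by hand before writing a script for the full dump.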
Hi Aaron,
Neat LimitedQueue class. It looks like this revert code wouldn't handle
some corner cases;
for example, I don't see logic that would distinguish between blanking (which
produces duplicate checksums) and reverts.
-- Best, Dmitry
On Sun, Aug 21, 2011 at 3:15 PM, Aaron Halfaker
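To illustrate the corner case raised above, here is a rough sketch of checksum-based revert detection that labels blanked revisions separately. It is not the LimitedQueue code under discussion; the function names, the labels, and the window of 15 revisions are assumptions made for this example only.

```python
import hashlib
from collections import deque


def md5_of(text):
    """Return the hex MD5 checksum of a revision's text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def classify_revisions(texts, window=15):
    """Label each revision text as 'edit', 'blank', or 'revert'.

    A revision is a 'revert' when its checksum matches one of the last
    `window` revisions other than the immediately preceding one.
    Empty revisions are labelled 'blank' first, so repeated blanking
    (which yields identical checksums) is not counted as a revert.
    """
    recent = deque(maxlen=window)  # checksums, oldest first
    labels = []
    for text in texts:
        digest = md5_of(text)
        if not text.strip():
            labels.append("blank")
        elif digest in list(recent)[:-1]:
            labels.append("revert")
        else:
            labels.append("edit")
        recent.append(digest)
    return labels


# Hypothetical revision history of a single page.
history = ["A", "A vandal", "A", "", "A", ""]
labels = classify_revisions(history)
```

Here the second blanking shares a checksum with the first but is still labelled 'blank', while restoring "A" after it counts as a revert.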
There have been a few publications on the subject:
1. Us vs. them: Understanding social dynamics in Wikipedia with revert
graph visualizations, B Suh, EH Chi, BA Pendleton.
2. He says, she says: Conflict and coordination in Wikipedia., A Kittur, B
Suh, BA Pendleton.
From my experience I can tell
Just verified, it is back up. And actual changes are also coming through
[filtered by negative user ratings (calculated using some pretty old
wikipedia dump)].
-- Best, Dmitry
On Wed, Aug 17, 2011 at 2:33 AM, Dmitry Chichkov dchich...@gmail.com wrote:
Hmm... Somebody actually visited the site
Hello,
This is excellent news!
Have you tried running it on Amazon EC2? It would be really nice to know how
well WikiHadoop scales up with the number of nodes.
Also, this timing - '3 x Quad Core / 14 days / full wikipedia dump', on what
kind of task (xml parsing, diffs, md5, etc?) was it
than science.
Diederik
On Wed, Aug 17, 2011 at 5:28 PM, Dmitry Chichkov dchich...@gmail.com wrote:
Hello,
This is excellent news!
Have you tried running it on Amazon EC2? It would be really nice to know
how well WikiHadoop scales up with the number of nodes.
Also, this timing - '3 x Quad
I can recommend searching for "reverts wikipedia" on Google Scholar:
http://scholar.google.com/scholar?q=reverts+wikipedia
If you want to try running some analysis on the dump yourself, there's
Python code for revert analysis available here:
http://code.google.com/p/pymwdat/
-- Best, Dmitry
On
- excellent work.
-- Cheers, Dmitry
On Fri, Aug 20, 2010 at 12:02 AM, Daniel Kinzler dan...@brightbyte.de wrote:
Hi Dimitry:
Dmitry Chichkov wrote:
Some time ago, as a Python/Django/JQuery/pywikipedia exercise, I hacked
together a web-based recent changes patrol tool. An alpha version can be seen
/ )
* OrderedDict (available in Python 2.7 or
http://pypi.python.org/pypi/ordereddict/)
* 7-Zip (command line 7za)
-- Dmitry
On Thu, Aug 19, 2010 at 8:46 AM, John Vandenberg jay...@gmail.com wrote:
On Sat, Aug 14, 2010 at 6:12 AM, Dmitry Chichkov dchich...@gmail.com
wrote:
If anybody is interested
If anybody is interested, I've made a list of 'most reverted pages' in the
English Wikipedia based on the analysis of the enwiki-20100130 dump. Here is
the list:
http://wpcvn.com/enwiki-20100130.most.reverted.tar.bz
http://wpcvn.com/enwiki-20100130.most.reverted.txt
This list was calculated using