[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-08-01 Thread Halfak
Halfak added a comment. Oh! And to the point of reviewing this specific task, please limit your aggregate analysis to 12 months. This will help account for seasonality. E.g., December/January and September look weird and can appear twice in a 17-month sample.

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-07-30 Thread Halfak
Halfak added a comment. Make a report! Start from https://meta.wikimedia.org/wiki/Research:New_project :)
Task detail: https://phabricator.wikimedia.org/T189962

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-07-19 Thread Ladsgroup
Ladsgroup added a comment. This is the plot of the geometric mean of revert time in Wikidata:
F23900542: Figure_1.png
This is the median number of reverts made by users who made more than five reverts in that month:
F23900564: Figure_2.png
This is the number of users who made more than five reverts in the month:

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-07-16 Thread Halfak
Halfak added a comment. This is great. Please graph the results, write a report, and describe the weirdness in the Spanish Wikipedia dump files.

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-04-09 Thread Halfak
Halfak added a comment. We discussed getting the plots cleaned up and adding English/Spanish Wikipedia at our sync meeting.

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-03-28 Thread Ladsgroup
Ladsgroup added a comment. This is the new data:

Month    Number  Average (hours)     First quartile (hours)  Median (hours)  Last quartile (hours)  Geo mean (seconds)
2013-02  6821    2.1991468341233773  0.005278                0.02167         0.1519                 132.86284580032776

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-03-28 Thread Ladsgroup
Ladsgroup added a comment. More data:

Month    Number of users reverting  Average number of reverts per user  First quartile  Median  Last quartile
2013-03  1198                       10.473288814691152                  1.0             1.0     4.0
2013-04  1133                       19.50397175639894                   1.0             2.0     4.0
2013-05  1153                       15.333044232437121                  1.0             2.0     4.0
2013-06  1134                       12.961199294532628                  1.0             1.0     3.0
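A table like the one above could be produced along the following lines. This is only a sketch under stated assumptions: the input shape (one `(month, user)` pair per revert), the function name, and the choice of the "inclusive" quartile method are all hypothetical, since the thread does not show the script that was actually used.

```python
from collections import Counter, defaultdict
from statistics import mean, quantiles


def reverts_per_user_by_month(reverts):
    """Summarise, per month, how many reverts each reverting user made.

    `reverts` is an iterable of (month, user) pairs, one pair per revert.
    This input format is an assumption made for the example.
    """
    per_month = defaultdict(Counter)
    for month, user in reverts:
        per_month[month][user] += 1

    summary = {}
    for month, counts in sorted(per_month.items()):
        values = sorted(counts.values())
        if len(values) >= 2:
            # The quartile method is an assumption; the thread does not say
            # which one was used for the published numbers.
            q1, median, q3 = quantiles(values, n=4, method="inclusive")
        else:
            # A single reverting user: every quartile is that user's count.
            q1 = median = q3 = float(values[0])
        summary[month] = {
            "users": len(values),
            "mean": mean(values),
            "q1": q1,
            "median": median,
            "q3": q3,
        }
    return summary
```

Because a few very active users can dominate the mean, the quartiles are reported alongside it, which matches the columns in the table.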

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-03-28 Thread Halfak
Halfak added a comment. I think we need a plot of this data. I'd also suggest using the geometric mean for looking at time-to-revert. E.g.

    geometric_mean = function(x){ exp(mean(log(x))) }

In Python, I'd do:

    >>> from statistics import mean
    >>> from math import log, exp
    >>>
    >>> def
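The geometric-mean suggestion above can be sketched as a complete Python function. This is a minimal illustration, not necessarily the function the truncated snippet was defining; the input validation and the assumption that time-to-revert values are strictly positive are additions for the example.

```python
from math import exp, log
from statistics import mean


def geometric_mean(values):
    """Return exp(mean(log(x))) over strictly positive values."""
    values = list(values)
    if not values:
        raise ValueError("geometric mean requires at least one value")
    if any(v <= 0 for v in values):
        # log() is undefined for zero or negative inputs.
        raise ValueError("geometric mean is only defined for positive values")
    return exp(mean([log(v) for v in values]))
```

Averaging in log space keeps a handful of very slow reverts from dominating the summary, which is why it suits skewed time-to-revert data. Python 3.8 and later also ship `statistics.geometric_mean`, which could be used instead.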

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-03-28 Thread Ladsgroup
Ladsgroup added a comment. This is the result of the analysis (note that I omitted any revert that took more than 48 hours when computing the average):

Month    Number of reverts  Average revert time (seconds)  Average revert time (hours)
2013-02  6821               7916.928602844161              2.199146834123378
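The 48-hour cutoff and the seconds-to-hours conversion described above could look roughly like this. It is a sketch only: the function name, the input format (a list of revert times in seconds for one month), and treating exactly 48 hours as still included are assumptions, not details from the thread.

```python
from statistics import mean

# Cutoff taken from the comment above: reverts slower than 48 hours are omitted.
MAX_REVERT_SECONDS = 48 * 3600


def average_revert_time(seconds_to_revert, cutoff=MAX_REVERT_SECONDS):
    """Return (count, average in seconds, average in hours) for one month.

    Reverts that took longer than `cutoff` seconds are dropped first.
    """
    kept = [s for s in seconds_to_revert if s <= cutoff]
    if not kept:
        return 0, None, None
    avg_seconds = mean(kept)
    return len(kept), avg_seconds, avg_seconds / 3600
```

For example, `average_revert_time([3600, 7200, 200000])` drops the 200000-second revert and averages the remaining two.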

[Wikidata-bugs] [Maniphest] [Commented On] T189962: Run analysis of revert time and number changes over time for wikidata

2018-03-27 Thread Ladsgroup
Ladsgroup added a comment. Analysing the dump has finished, and all of the reverted/reverting edits (metadata only) come to a 2.7 GB file. I need to download it from stat1005 and run some scripts on it to produce histograms and summary data.