I want to echo what Nate said. We've been using this for more than a year within the Wikimedia Foundation, and it has made analyses of editing behavior much, much easier and faster, not to mention a lot less annoying.
This is the product of years of expert work by the Analytics team, and they deserve plenty of congratulations for it 😊

On Mon, 10 Feb 2020 at 10:42, Nate E TeBlunthuis <natha...@uw.edu> wrote:

> Thank you so much Joal! I've been happily using this data for some time
> and I'm optimistic that it can make doing thorough analyses of Wikimedia
> projects much more accessible to the community, students, and researchers.
>
> -- Nate
> ------------------------------
> *From:* Wiki-research-l <wiki-research-l-boun...@lists.wikimedia.org> on
> behalf of Joseph Allemandou <jalleman...@wikimedia.org>
> *Sent:* Monday, February 10, 2020 8:27 AM
> *To:* A mailing list for the Analytics Team at WMF and everybody who has
> an interest in Wikipedia and analytics. <analytics@lists.wikimedia.org>;
> Research into Wikimedia content and communities
> <wiki-researc...@lists.wikimedia.org>; Product Analytics
> <product-analyt...@wikimedia.org>
> *Subject:* [Wiki-research-l] Announcement - Mediawiki History Dumps
>
> Hi Analytics People,
>
> The Wikimedia Analytics Team is pleased to announce the release of the most
> complete dataset we have to date to analyze content and contributors
> metadata: Mediawiki History [1] [2].
>
> Data is in TSV format, released monthly around the 3rd of the month
> usually, and every new release contains the full history of metadata.
>
> The dataset contains an enhanced [3] and historified [4] version of user,
> page and revision metadata and serves as a base to Wikistats API on edits,
> users and pages [5] [6].
>
> We hope you will have as much fun playing with the data as we have building
> it, and we're eager to hear from you [7], whether for issues, ideas or
> usage of the data.
>
> Analytically yours,
>
> --
> Joseph Allemandou (joal) (he / him)
> Sr Data Engineer
> Wikimedia Foundation
>
> [1] https://dumps.wikimedia.org/other/mediawiki_history/readme.html
> [2] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history_dumps
> [3] Many pre-computed fields are present in the dataset, from edit-counts
> by user and page to reverts and reverted information, as well as time
> between events.
> [4] As accurate as possible historical usernames and page-titles (as well
> as user-groups and blocks) are available in addition to current values, and
> are provided in a denormalized way to every event of the dataset.
> [5] https://wikitech.wikimedia.org/wiki/Analytics/AQS/Wikistats_2
> [6] https://wikimedia.org/api/rest_v1/
> [7] https://phabricator.wikimedia.org/maniphest/task/edit/?title=Mediawiki%20History%20Dumps&projectPHIDs=Analytics-Wikistats,Analytics
> _______________________________________________
> Wiki-research-l mailing list
> wiki-researc...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics