Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-03 Thread Thomas Steiner
> I have created a phab task with your awesome use case. > https://phabricator.wikimedia.org/T134231 Thanks, subscribed. Looking forward to the feature :-) -- Dr. Thomas Steiner, Employee (http://blog.tomayac.com, https://twitter.com/tomayac) Google Germany GmbH, ABC-Str. 19, 20354 Hamburg, Germ

Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-03 Thread Joseph Allemandou
Hi Dario and Ellery :) As Nuria said there is a lot of work ongoing on Edit data, but I'll be very interested in discussing how to better productionize / serve the data. Cheers ! Joseph On Tue, May 3, 2016 at 6:17 AM, Nuria Ruiz wrote: > >Ellery and I have been talking about the idea of – at

Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-02 Thread Nuria Ruiz
> Ellery and I have been talking about the idea of – at the very least – scheduling the generation of new dumps, if not exposing the data programmatically. Right now, I am afraid this is not within my team's capacity and Analytics has a number of other high-priority areas to focus on. Right. Most

Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-02 Thread Dario Taraborelli
Hey Thomas, yes, I agree this dataset is really valuable (just looking at the sheer number of downloads [1] and requests for similar data we've received). I can see the value of making it more easily accessible via an API. Ellery and I have been talking about the idea of – at the very least – sc

Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-02 Thread reguyla
Hi Dario, This data is super interesting! How realistic is it that your team make it available through the

Re: [Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-05-02 Thread Thomas Steiner
Hi Dario, This data is super interesting! How realistic is it that your team make it available through the Wikimedia REST API [1]? I would then in turn love to add it to Wikipedia Tools [2], just imagine how amazing it would be to be able to ask a spreadsheet for… =WIKI{OUT|IN}BOUNDTRAFFIC("en:

[Analytics] Wikipedia Clickstream dataset refreshed (March 2016)

2016-04-28 Thread Dario Taraborelli
Hey all, heads up that a refreshed Wikipedia Clickstream dataset is now available for March 2016, containing 25 million (referer, resource) pairs extracted from about 7 billion webrequests. https://dx.doi.org/10.6084/m9.figshare.1305770.v16 Ellery (the author of the dataset) is cc'ed if you have
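
For readers wondering what the inbound/outbound traffic lookup discussed in this thread could look like against a local copy of the dump, here is a minimal Python sketch. It assumes the (referer, resource) pairs have already been reduced to three tab-separated columns (referer, resource, count); the actual column layout of the March 2016 release may differ, so check the dataset description first. The function names and the sample rows are hypothetical and only serve as an illustration.

```python
import csv
import io
from collections import Counter


def read_pairs(handle):
    """Yield (referer, resource, count) tuples from a tab-separated stream.

    Assumed layout: three columns, no header row. This is an illustrative
    assumption, not the documented format of the published dump.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for referer, resource, count in reader:
        yield referer, resource, int(count)


def traffic_totals(rows):
    """Sum counts per article.

    Inbound traffic is keyed by the resource (the page that was reached);
    outbound traffic is keyed by the referer (the page that was left).
    """
    inbound, outbound = Counter(), Counter()
    for referer, resource, count in rows:
        inbound[resource] += count
        outbound[referer] += count
    return inbound, outbound


# Made-up sample rows, not taken from the real dataset.
SAMPLE = "Berlin\tHamburg\t30\nGermany\tHamburg\t12\nHamburg\tElbe\t7\n"

inbound, outbound = traffic_totals(read_pairs(io.StringIO(SAMPLE)))
print(inbound["Hamburg"], outbound["Hamburg"])
```

With a real file, the same two functions could be fed from open(path, encoding="utf-8", newline="") instead of the in-memory sample. Keeping both counters in memory is a simplification that works for a sketch; a served API would more likely precompute these totals.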