Hirav, Bharath – I'd also like to hear from you whether there's a specific reason to ask for English Wikipedia only, or whether a dataset of aggregate pageviews across all Wikimedia properties would do the job.

Dario

On Wed, Apr 15, 2015 at 9:09 AM, Dario Taraborelli <dtarabore...@wikimedia.org> wrote:

> Oliver -- thanks for running a preliminary check. I'm fine releasing this
> data in aggregate under CC0; I believe it would be valuable for this and
> other research projects (copying Michelle from Legal).
>
> Before we do so, though, I want to confirm the specs: aggregate pageviews
> per second to English Wikipedia, excluding bot traffic, broken down by
> access method (mobile web vs desktop site, not apps) for a 60-day period.
> Oliver – are these the filters you used to identify the data point with the
> smallest number of observations?
>
> Obviously, we will need to take this release into account when we start
> working on projects such as
> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_edits
> and
> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>
> Dario
>
> On Mon, Apr 13, 2015 at 9:37 PM, Oliver Keyes <oke...@wikimedia.org> wrote:
>
>> Bumping for Dario, per Pine's excellent example :)
>>
>> On 13 April 2015 at 22:18, Hirav Gandhi <hirav.gan...@gmail.com> wrote:
>>> Oliver: Two months is fine. Thank you so much for your help!
>>>
>>>> On Apr 13, 2015, at 4:40 PM, analytics-requ...@lists.wikimedia.org wrote:
>>>>
>>>> Send Analytics mailing list submissions to
>>>>     analytics@lists.wikimedia.org
>>>>
>>>> To subscribe or unsubscribe via the World Wide Web, visit
>>>>     https://lists.wikimedia.org/mailman/listinfo/analytics
>>>> or, via email, send a message with subject or body 'help' to
>>>>     analytics-requ...@lists.wikimedia.org
>>>>
>>>> You can reach the person managing the list at
>>>>     analytics-ow...@lists.wikimedia.org
>>>>
>>>> When replying, please edit your Subject line so it is more specific
>>>> than "Re: Contents of Analytics digest..."
>>>>
>>>> Today's Topics:
>>>>
>>>>    1. Re: Page views on a more frequent than hourly basis (Pine W)
>>>>    2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>>>>    3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>>>
>>>> ----------------------------------------------------------------------
>>>>
>>>> Message: 1
>>>> Date: Mon, 13 Apr 2015 13:34:23 -0700
>>>> From: Pine W <wiki.p...@gmail.com>
>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>     has an interest in Wikipedia and analytics."
>>>>     <analytics@lists.wikimedia.org>
>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>>>> Message-ID:
>>>>     <CAF=dyjjzmdfthz+0+lwnhb9m8xuod4wetgcfuxyb9qyf7cy...@mail.gmail.com>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> Hi Oliver, re ccing people who are on list, this is the protocol we
>>>> followed in IEGCom to ping people who are subscribed and mentioned in
>>>> certain emails but, like many of us, may automatically move emails from
>>>> lists directly to folders where they may be unread for days. So there is a
>>>> reason to do this.
>>>>
>>>> Thanks,
>>>>
>>>> Pine
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 2
>>>> Date: Mon, 13 Apr 2015 16:30:43 -0700
>>>> From: Hirav Gandhi <hirav.gan...@gmail.com>
>>>> To: analytics@lists.wikimedia.org
>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>>>> Message-ID:
>>>>     <CANzC_EOvi4MP7G_SsxvW=uojpt2vxbnfmhcipqn1pumace-...@mail.gmail.com>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> Thanks Oliver!
>>>>
>>>> We would like this data for as broad a time period as you can muster.
>>>> The more days, months and years represented in the dataset, the better.
>>>>
>>>>> Okay, so:
>>>>>
>>>>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>>>>> enwiki (mobile and desktop both) by timestamp, down to one-second
>>>>> resolution levels. The lowest number of pageviews to enwiki per second
>>>>> was 2,981.
>>>>>
>>>>> So, I don't personally have a problem with generating a release of:
>>>>>
>>>>> 1. Pageviews per second;
>>>>> 2. To enwiki;
>>>>> 3. Over $TIME_PERIOD;
>>>>> 4. grouping the mobile and desktop site
>>>>>
>>>>> But Dario or someone should chip in before I touch anything ;p
>>>>>
>>>>> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
>>>>> given our biases towards north america and europe
>>>>>
>>>>> On 13 April 2015 at 11:54, Oliver Keyes <oke...@wikimedia.org> wrote:
>>>>>> Then that sounds much more viable. I'll run a quick test now to see
>>>>>> how much clustering we'd see at, say, the one-second resolution level,
>>>>>> and throw it out here so we can make more informed decisions about a
>>>>>> data release on this.
>>>>>>
>>>>>> On 13 April 2015 at 08:08, Hirav Gandhi <hirav.gan...@gmail.com> wrote:
>>>>>>> Hi Oliver,
>>>>>>>
>>>>>>> Re: Hirav: would you be looking for temporally /and/ contextually granular
>>>>>>> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
>>>>>>> so "a view to a page on enwiki at X time"? If the latter you've got more of
>>>>>>> a shot, I suspect.
>>>>>>>
>>>>>>> I only want the latter - I am not concerned with the context so much as just
>>>>>>> “a view to a page on enwiki at X time.”
>>>>>>>
>>>>>>> Hirav
>>>>>>>
>>>>>>> On Apr 13, 2015, at 5:00 AM, analytics-requ...@lists.wikimedia.org wrote:
>>>>>>>
>>>>>>> Today's Topics:
>>>>>>>
>>>>>>>    1. Re: Page views on a more frequent than hourly basis (Pine W)
>>>>>>>    2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>>
>>>>>>> Message: 1
>>>>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>>>>>>> From: Pine W <wiki.p...@gmail.com>
>>>>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>>>>     has an interest in Wikipedia and analytics."
>>>>>>>     <analytics@lists.wikimedia.org>
>>>>>>> Cc: Bharath Sitaraman <bharath1...@gmail.com>
>>>>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>>>>>>> Message-ID:
>>>>>>>     <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com>
>>>>>>> Content-Type: text/plain; charset="utf-8"
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This issue of pageview data granularity has been discussed before, and the
>>>>>>> answer has been that hourly is the smallest increment allowed to be
>>>>>>> revealed publicly, for privacy reasons.
>>>>>>>
>>>>>>> I believe that the person you will want to discuss your request with is
>>>>>>> Toby, who I have cc'd here.
>>>>>>>
>>>>>>> Pine
>>>>>>>
>>>>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <hirav.gan...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Wikimedia Analytics Team,
>>>>>>>
>>>>>>> My colleague Bharath and I are doing research on dynamic server allocation
>>>>>>> algorithms and we were looking for a suitable dataset to test our
>>>>>>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>>>>>>> of hourly page views, but we were looking for something a bit more
>>>>>>> granular, such as aggregated page requests to English Wikipedia on a
>>>>>>> minute-by-minute or, if possible, second-by-second basis.
>>>>>>>
>>>>>>> We are more than happy to pore through any raw data you might have that
>>>>>>> would help us calculate page requests at this granular level. Please let us
>>>>>>> know if it would be possible to get such data and if so how. Thank you in
>>>>>>> advance for your help.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Hirav Gandhi
>>>>>>> _______________________________________________
>>>>>>> Analytics mailing list
>>>>>>> Analytics@lists.wikimedia.org
>>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>>>
>>>>>>> ------------------------------
>>>>>>>
>>>>>>> Message: 2
>>>>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>>>>>>> From: Oliver Keyes <oke...@wikimedia.org>
>>>>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>>>>     has an interest in Wikipedia and analytics."
>>>>>>>     <analytics@lists.wikimedia.org>
>>>>>>> Cc: Bharath Sitaraman <bharath1...@gmail.com>
>>>>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>>>>>>> Message-ID:
>>>>>>>     <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com>
>>>>>>> Content-Type: text/plain; charset=UTF-8
>>>>>>>
>>>>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>>>>>>> director of analytics.
>>>>>>>
>>>>>>> Hirav: would you be looking for temporally /and/ contextually granular
>>>>>>> pageviews, i.e. "a view to X page at Y time", or just temporally
>>>>>>> granular, so "a view to a page on enwiki at X time"? If the latter
>>>>>>> you've got more of a shot, I suspect.
>>>>>>>
>>>>>>> --
>>>>>>> Oliver Keyes
>>>>>>> Research Analyst
>>>>>>> Wikimedia Foundation
>>>>>>>
>>>>>>> ------------------------------
>>>>>>>
>>>>>>> End of Analytics Digest, Vol 38, Issue 21
>>>>>>> *****************************************
>>>>>>
>>>>>> --
>>>>>> Oliver Keyes
>>>>>> Research Analyst
>>>>>> Wikimedia Foundation
>>>>>
>>>>> --
>>>>> Oliver Keyes
>>>>> Research Analyst
>>>>> Wikimedia Foundation
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 3
>>>> Date: Mon, 13 Apr 2015 19:40:04 -0400
>>>> From: Oliver Keyes <oke...@wikimedia.org>
>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>     has an interest in Wikipedia and analytics."
>>>>     <analytics@lists.wikimedia.org>
>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>>>> Message-ID:
>>>>     <caauqgdd6z5ussu11vw49fdmbsrhyejxku9yopyserib79j-...@mail.gmail.com>
>>>> Content-Type: text/plain; charset=UTF-8
>>>>
>>>> ....
>>>>
>>>> ...years?
>>>>
>>>> We have unsampled logs for, ah. 2 months.
>>>>
>>>> --
>>>> Oliver Keyes
>>>> Research Analyst
>>>> Wikimedia Foundation
>>>>
>>>> ------------------------------
>>>>
>>>> End of Analytics Digest, Vol 38, Issue 24
>>>> *****************************************
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>
> --
> Dario Taraborelli
> Senior Research Scientist, Research and Data Lead
> Wikimedia Foundation
> http://wikimediafoundation.org
> http://nitens.org/taraborelli

--
Dario Taraborelli
Senior Research Scientist, Research and Data Lead
Wikimedia Foundation
http://wikimediafoundation.org
http://nitens.org/taraborelli
_______________________________________________ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
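
For anyone following the thread: the check discussed above (aggregate pageviews per second, exclude bot traffic, break down by access method, then look at the data point with the fewest observations) might be sketched roughly as below. This is only an illustration under assumptions, not the query that was actually run. The record layout `(timestamp, access_method, is_bot)`, the function names and the sample rows are all hypothetical and are not taken from the Wikimedia logs.

```python
from collections import Counter
from datetime import datetime


def per_second_counts(records):
    """Count non-bot pageviews per (second, access_method) bucket.

    `records` is an iterable of (timestamp, access_method, is_bot) tuples.
    This layout is an assumption made for the example only.
    """
    counts = Counter()
    for timestamp, access_method, is_bot in records:
        if is_bot:
            continue  # the proposed release excludes bot traffic
        second = timestamp.replace(microsecond=0)  # truncate to one-second resolution
        counts[(second, access_method)] += 1
    return counts


def smallest_bucket(counts):
    """Return the (bucket, count) pair with the fewest observations."""
    return min(counts.items(), key=lambda item: item[1])


# Invented sample rows, purely for illustration.
sample = [
    (datetime(2015, 4, 12, 6, 0, 0, 120000), "desktop", False),
    (datetime(2015, 4, 12, 6, 0, 0, 450000), "desktop", False),
    (datetime(2015, 4, 12, 6, 0, 0, 900000), "mobile web", False),
    (datetime(2015, 4, 12, 6, 0, 1, 10000), "desktop", True),
    (datetime(2015, 4, 12, 6, 0, 1, 300000), "desktop", False),
    (datetime(2015, 4, 12, 6, 0, 1, 600000), "desktop", False),
    (datetime(2015, 4, 12, 6, 0, 1, 700000), "mobile web", False),
    (datetime(2015, 4, 12, 6, 0, 1, 800000), "mobile web", False),
]

counts = per_second_counts(sample)
bucket, views = smallest_bucket(counts)
print(bucket, views)
```

The smallest bucket is the number that matters for the privacy question: the fewer views in a single (second, access method) cell, the less aggregation protects individual readers.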