Hirav, Bharath – I'd also like to hear from you whether there's a specific
reason to ask for English Wikipedia only, or whether a dataset of aggregate
pageviews across all Wikimedia properties would do the job.

Dario

On Wed, Apr 15, 2015 at 9:09 AM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:

> Oliver -- thanks for running a preliminary check. I'm fine with releasing
> this data in aggregate under CC0; I believe it would be valuable for this
> and other research projects (copying Michelle from Legal).
>
> Before we do so, though, I want to confirm the specs: aggregate pageviews
> per second to English Wikipedia, excluding bot traffic, broken down by
> access method (mobile web vs desktop site, not apps) for a 60-day period.
> Oliver – are these the filters you used to identify the data point with the
> smallest number of observations?
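
For anyone following along, here is a rough sketch of the aggregation these specs describe: count pageviews per second for one project, drop bot traffic, split by access method, and report the smallest per-second bucket, which is the figure a privacy check would look at. The record layout, the field values ("en.wikipedia", "desktop", "mobile web"), the sample rows and the function names are all assumptions made up for illustration; this is not the actual Wikimedia log schema or the query Oliver ran.

```python
from collections import Counter
from datetime import datetime

# Hypothetical record layout: (ISO-8601 timestamp, project, access_method, is_bot)
ALLOWED_ACCESS = ("desktop", "mobile web")  # apps are excluded, per the specs


def per_second_counts(records, project="en.wikipedia"):
    """Count non-bot pageviews per (second, access method) for one project."""
    counts = Counter()
    for ts, proj, access, is_bot in records:
        if proj != project or is_bot or access not in ALLOWED_ACCESS:
            continue
        # Truncate the timestamp to one-second resolution.
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        counts[(second, access)] += 1
    return counts


def smallest_bucket(counts):
    """Return the lowest count across all (second, access method) buckets."""
    return min(counts.values())


# Made-up sample rows, only to show the shape of the input.
sample = [
    ("2015-04-12T06:00:00.120", "en.wikipedia", "desktop", False),
    ("2015-04-12T06:00:00.870", "en.wikipedia", "desktop", False),
    ("2015-04-12T06:00:00.300", "en.wikipedia", "mobile web", False),
    ("2015-04-12T06:00:01.010", "en.wikipedia", "desktop", True),      # bot: dropped
    ("2015-04-12T06:00:01.500", "de.wikipedia", "desktop", False),     # other project: dropped
    ("2015-04-12T06:00:01.900", "en.wikipedia", "mobile app", False),  # app: dropped
    ("2015-04-12T06:00:01.200", "en.wikipedia", "desktop", False),
]

counts = per_second_counts(sample)
for (second, access), n in sorted(counts.items()):
    print(second.isoformat(), access, n)
print("smallest bucket:", smallest_bucket(counts))
```

Splitting by access method makes each bucket smaller than a combined mobile-plus-desktop count, so the minimum should be recomputed with exactly the filters planned for the release.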
>
> Obviously, we will need to take this release into account when we start
> working on projects such as
> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_edits
> and
> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>
> Dario
>
> On Mon, Apr 13, 2015 at 9:37 PM, Oliver Keyes <oke...@wikimedia.org>
> wrote:
>
>> Bumping for Dario, per Pine's excellent example :)
>>
>> On 13 April 2015 at 22:18, Hirav Gandhi <hirav.gan...@gmail.com> wrote:
>> > Oliver: Two months is fine. Thank you so much for your help!
>> >
>> >> On Apr 13, 2015, at 4:40 PM, analytics-requ...@lists.wikimedia.org
>> wrote:
>> >>
>> >> Send Analytics mailing list submissions to
>> >>       analytics@lists.wikimedia.org
>> >>
>> >> To subscribe or unsubscribe via the World Wide Web, visit
>> >>       https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> or, via email, send a message with subject or body 'help' to
>> >>       analytics-requ...@lists.wikimedia.org
>> >>
>> >> You can reach the person managing the list at
>> >>       analytics-ow...@lists.wikimedia.org
>> >>
>> >> When replying, please edit your Subject line so it is more specific
>> >> than "Re: Contents of Analytics digest..."
>> >>
>> >>
>> >> Today's Topics:
>> >>
>> >>   1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >>   2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>> >>   3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >>
>> >>
>> >> ----------------------------------------------------------------------
>> >>
>> >> Message: 1
>> >> Date: Mon, 13 Apr 2015 13:34:23 -0700
>> >> From: Pine W <wiki.p...@gmail.com>
>> >> To: "A mailing list for the Analytics Team at WMF and everybody who
>> >>       has an  interest in Wikipedia and analytics."
>> >>       <analytics@lists.wikimedia.org>
>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>       basis
>> >> Message-ID:
>> >>       <CAF=
>> dyjjzmdfthz+0+lwnhb9m8xuod4wetgcfuxyb9qyf7cy...@mail.gmail.com>
>> >> Content-Type: text/plain; charset="utf-8"
>> >>
>> >> Hi Oliver, re: cc'ing people who are on the list, this is the protocol we
>> >> followed in IEGCom to ping people who are subscribed and mentioned in
>> >> certain emails but, like many of us, may automatically move emails from
>> >> lists directly to folders where they may be unread for days. So there
>> >> is a reason to do this.
>> >>
>> >> Thanks,
>> >>
>> >> Pine
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 2
>> >> Date: Mon, 13 Apr 2015 16:30:43 -0700
>> >> From: Hirav Gandhi <hirav.gan...@gmail.com>
>> >> To: analytics@lists.wikimedia.org
>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>       basis
>> >> Message-ID:
>> >>       <CANzC_EOvi4MP7G_SsxvW=
>> uojpt2vxbnfmhcipqn1pumace-...@mail.gmail.com>
>> >> Content-Type: text/plain; charset="utf-8"
>> >>
>> >> Thanks Oliver!
>> >>
>> >> We would like this data for as broad a time period as you can muster.
>> >> The more days, months and years represented in the dataset, the better.
>> >>
>> >>
>> >>> Okay, so:
>> >>>
>> >>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>> >>> enwiki (mobile and desktop both) by timestamp, down to one-second
>> >>> resolution levels. The lowest number of pageviews to enwiki per second
>> >>> was 2,981
>> >>>
>> >>> So, I don't personally have a problem with generating a release of:
>> >>>
>> >>> 1. Pageviews per second;
>> >>> 2. To enwiki;
>> >>> 3. Over $TIME_PERIOD;
>> >>> 4. grouping the mobile and desktop site
>> >>>
>> >>> But Dario or someone should chip in before I touch anything ;p
>> >>>
>> >>> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
>> >>> given our biases towards North America and Europe.
>> >>>
>> >>> On 13 April 2015 at 11:54, Oliver Keyes <oke...@wikimedia.org> wrote:
>> >>>> Then that sounds much more viable. I'll run a quick test now to see
>> >>>> how much clustering we'd see at, say, the one-second resolution level,
>> >>>> and throw it out here so we can make more informed decisions about a
>> >>>> data release on this.
>> >>>>
>> >>>> On 13 April 2015 at 08:08, Hirav Gandhi <hirav.gan...@gmail.com>
>> wrote:
>> >>>>> Hi Oliver,
>> >>>>>
>> >>>>> Re: Hirav: would you be looking for temporally /and/ contextually granular
>> >>>>> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
>> >>>>> so "a view to a page on enwiki at X time"? If the latter you've got more of
>> >>>>> a shot, I suspect.
>> >>>>>
>> >>>>> I only want the latter - I am not concerned with the context so much as just
>> >>>>> “a view to a page on enwiki at X time.”
>> >>>>>
>> >>>>> Hirav
>> >>>>>
>> >>>>>
>> >>>>> On Apr 13, 2015, at 5:00 AM, analytics-requ...@lists.wikimedia.org
>> >>> wrote:
>> >>>>>
>> >>>>>
>> >>>>> Today's Topics:
>> >>>>>
>> >>>>>  1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >>>>>  2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >>>>>
>> >>>>>
>> >>>>>
>> ----------------------------------------------------------------------
>> >>>>>
>> >>>>> Message: 1
>> >>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> >>>>> From: Pine W <wiki.p...@gmail.com>
>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> >>>>> has an interest in Wikipedia and analytics."
>> >>>>> <analytics@lists.wikimedia.org>
>> >>>>> Cc: Bharath Sitaraman <bharath1...@gmail.com>
>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>>>> basis
>> >>>>> Message-ID:
>> >>>>> <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com
>> >
>> >>>>> Content-Type: text/plain; charset="utf-8"
>> >>>>>
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> This issue of pageview data granularity has been discussed before, and
>> >>>>> the answer has been that hourly is the smallest increment allowed to be
>> >>>>> revealed publicly, for privacy reasons.
>> >>>>>
>> >>>>> I believe that the person you will want to discuss your request with is
>> >>>>> Toby, who I have cc'd here.
>> >>>>>
>> >>>>> Pine
>> >>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <hirav.gan...@gmail.com>
>> >>> wrote:
>> >>>>>
>> >>>>> Hi Wikimedia Analytics Team,
>> >>>>>
>> >>>>> My colleague Bharath and I are doing research on dynamic server allocation
>> >>>>> algorithms and we were looking for a suitable dataset to test our
>> >>>>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>> >>>>> of hourly page views, but we were looking for something a bit more
>> >>>>> granular, such as aggregated page requests to English Wikipedia on a
>> >>>>> minute-by-minute or, if possible, second-by-second basis.
>> >>>>>
>> >>>>> We are more than happy to pore through any raw data you might have that
>> >>>>> would help us calculate page requests at this granular level. Please let us
>> >>>>> know if it would be possible to get such data and, if so, how. Thank you in
>> >>>>> advance for your help.
>> >>>>>
>> >>>>> Best,
>> >>>>>
>> >>>>> Hirav Gandhi
>> >>>>> _______________________________________________
>> >>>>> Analytics mailing list
>> >>>>> Analytics@lists.wikimedia.org
>> >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>>>>
>> >>>>> ------------------------------
>> >>>>>
>> >>>>> Message: 2
>> >>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>> >>>>> From: Oliver Keyes <oke...@wikimedia.org>
>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> >>>>> has an interest in Wikipedia and analytics."
>> >>>>> <analytics@lists.wikimedia.org>
>> >>>>> Cc: Bharath Sitaraman <bharath1...@gmail.com>
>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>>>> basis
>> >>>>> Message-ID:
>> >>>>> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com
>> >
>> >>>>> Content-Type: text/plain; charset=UTF-8
>> >>>>>
>> >>>>>
>> >>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>> >>>>> director of analytics.
>> >>>>>
>> >>>>> Hirav: would you be looking for temporally /and/ contextually granular
>> >>>>> pageviews, i.e. "a view to X page at Y time", or just temporally
>> >>>>> granular, so "a view to a page on enwiki at X time"? If the latter
>> >>>>> you've got more of a shot, I suspect.
>> >>>>>
>> >>>>> --
>> >>>>> Oliver Keyes
>> >>>>> Research Analyst
>> >>>>> Wikimedia Foundation
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> ------------------------------
>> >>>>>
>> >>>>>
>> >>>>> End of Analytics Digest, Vol 38, Issue 21
>> >>>>> *****************************************
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Oliver Keyes
>> >>>> Research Analyst
>> >>>> Wikimedia Foundation
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Oliver Keyes
>> >>> Research Analyst
>> >>> Wikimedia Foundation
>> >>>
>> >>>
>> >>>
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 3
>> >> Date: Mon, 13 Apr 2015 19:40:04 -0400
>> >> From: Oliver Keyes <oke...@wikimedia.org>
>> >> To: "A mailing list for the Analytics Team at WMF and everybody who
>> >>       has an  interest in Wikipedia and analytics."
>> >>       <analytics@lists.wikimedia.org>
>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>       basis
>> >> Message-ID:
>> >>       <
>> caauqgdd6z5ussu11vw49fdmbsrhyejxku9yopyserib79j-...@mail.gmail.com>
>> >> Content-Type: text/plain; charset=UTF-8
>> >>
>> >> ....
>> >>
>> >>
>> >> ...years?
>> >>
>> >> We have unsampled logs for, ah. 2 months.
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Oliver Keyes
>> >> Research Analyst
>> >> Wikimedia Foundation
>> >>
>> >>
>> >>
>> >> ------------------------------
>> >>
>> >>
>> >> End of Analytics Digest, Vol 38, Issue 24
>> >> *****************************************
>> >
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>
>
>
> --
> Dario Taraborelli
> Senior Research Scientist, Research and Data Lead
> Wikimedia Foundation
> http://wikimediafoundation.org
> http://nitens.org/taraborelli
>



-- 
Dario Taraborelli
Senior Research Scientist, Research and Data Lead
Wikimedia Foundation
http://wikimediafoundation.org
http://nitens.org/taraborelli
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
