Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Dan Andreescu
> > I don't need, nor want, access to any data about unique readers/viewers. > Is there a way of sanitizing the data? There is public data about pageviews > so it seems to me that there should also be public data about media > playback. > > Alternatively, is there a way that I can file a request wi

Re: [Analytics] Announcing the pageview API

2015-12-14 Thread Nuria
Kudos to Dan for fighting for this project. So looking forward to seeing how people will use it! > On Dec 14, 2015, at 3:18 PM, Oliver Keyes wrote: > > Yay! Thank you to the AnEng team for this awesome work :) > >> On 14 December 2015 at 17:23, Rachel diCerbo wrote: >> Excellent news! Thank yo

Re: [Analytics] Python client for the new pageview API

2015-12-14 Thread Aaron Halfaker
Awesome! Welcome to the unix-like MediaWiki python libraries revolution. Now to set aside some time to contribute and play around :) On Mon, Dec 14, 2015 at 8:32 AM, Dan Andreescu wrote: > I wasn't aware of some conventions that came before me, so I moved the > project from milimetric/wmf to m

Re: [Analytics] Page view API questions regarding user agent

2015-12-14 Thread Madhumitha Viswanathan
+1 Oliver - User agents tagged with WikimediaBot are tagged as bot - I do agree that our documentation on this can be improved; I'll update the Webrequest and Pageview tables docs to reflect this. The backfilling jobs for May-July are paused at the moment; the plan is to resume backfilling i

Re: [Analytics] Page view API questions regarding user agent

2015-12-14 Thread Felix J. Scholz
Thanks for the quick answers, Oliver! On Mon, Dec 14, 2015 at 6:45 PM, Oliver Keyes wrote: > Hey Felix, > > To answer some questions in order: > > 1. Bots are automated systems with a Wikimedia specific tag > (WikimediaBot, iirc) in their user agent. We don't expect this to be > widely adopted y

Re: [Analytics] Page view API questions regarding user agent

2015-12-14 Thread Oliver Keyes
Hey Felix, To answer some questions in order: 1. Bots are automated systems with a Wikimedia specific tag (WikimediaBot, iirc) in their user agent. We don't expect this to be widely adopted yet because it hasn't been widely advertised. The standard itself is very new, which is probably why you do
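
The tagging rule described above can be sketched in a few lines of Python. This is only an illustration, not the Analytics team's actual classifier: the function name `is_tagged_bot` is hypothetical, and a plain case-sensitive substring match on the tag is an assumption (the message itself is unsure of the exact tag, "WikimediaBot, iirc").

```python
def is_tagged_bot(user_agent: str, tag: str = "WikimediaBot") -> bool:
    """Return True if the user agent string carries the self-declared bot tag.

    Assumption: the tag is matched as a plain, case-sensitive substring.
    The real pipeline may use a different rule.
    """
    return tag in user_agent


# A crawler that declares itself would be counted as "bot";
# an ordinary browser user agent would not.
print(is_tagged_bot("ExampleCrawler/1.0 (WikimediaBot; ops@example.org)"))
print(is_tagged_bot("Mozilla/5.0 (X11; Linux x86_64) Firefox/43.0"))
```

Under this rule, automated traffic that does not add the tag is never labeled "bot", which is consistent with seeing no bot traffic at all for some articles.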

[Analytics] Page view API questions regarding user agent

2015-12-14 Thread Felix J. Scholz
Dear All: Maybe this question is a little bit too simple, but I did not immediately find the answer in the docs. How does the API differentiate between the two user agents spider and bot? I'm asking because for some articles, there seems to be no bot traffic at all, including the main page in Au

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Madhumitha Viswanathan
There's a task in our backlog to publish this data as part of the API - https://phabricator.wikimedia.org/T88775. On Mon, Dec 14, 2015 at 2:58 PM, Pine W wrote: > I don't need, nor want, access to any data about unique readers/viewers. > Is there a way of sanitizing the data? There is public dat

Re: [Analytics] Announcing the pageview API

2015-12-14 Thread Oliver Keyes
Yay! Thank you to the AnEng team for this awesome work :) On 14 December 2015 at 17:23, Rachel diCerbo wrote: > Excellent news! Thank you team! > > > On Mon, Dec 14, 2015 at 2:10 PM, Dan Andreescu > wrote: >> >> \o/ :) >> >> On Mon, Dec 14, 2015 at 4:54 PM, Kevin Leduc wrote: >>> >>> Hi All, >>

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Pine W
I don't need, nor want, access to any data about unique readers/viewers. Is there a way of sanitizing the data? There is public data about pageviews so it seems to me that there should also be public data about media playback. Alternatively, is there a way that I can file a request with someone wh

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Oliver Keyes
I should caution that for idle questions I sincerely doubt cluster access will be given; there's no way of partitioning it so that you can't access, say, random readers' IP addresses ;p On 14 December 2015 at 17:54, Madhumitha Viswanathan wrote: > Hi Pine, > > Yes, you need stat1002 access to run

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Madhumitha Viswanathan
Hi Pine, Yes, you need stat1002 access to run Hive queries. It's not the same as Labs. There's plenty of documentation here, on how to request access, and how to query data - https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive. On Mon, Dec 14, 2015 at 2:42 PM, Pine W wrote: > Hi Dan, > >

Re: [Analytics] Data collection

2015-12-14 Thread Oliver Keyes
.se or https://dumps.wikimedia.org/other/pagecounts-raw/ > . There must be a good method to go through and pick out page views by name > rather than by hand (which obviously isn’t feasible)? I’d also need to be > able to find the total number of page views for each period in order to > standardize

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Pine W
Hi Dan, I have a Labs account which I've barely used. Is access to the cluster a separate step from having access to Labs? Also, is there a "how to" guide somewhere for how to query the cluster? Thanks, Pine On Mon, Dec 14, 2015 at 2:11 PM, Dan Andreescu wrote: > Pine, right now you can eithe

Re: [Analytics] Data collection

2015-12-14 Thread Caitlin.Gardner

Re: [Analytics] Announcing the pageview API

2015-12-14 Thread Rachel diCerbo
Excellent news! Thank you team! On Mon, Dec 14, 2015 at 2:10 PM, Dan Andreescu wrote: > \o/ :) > > On Mon, Dec 14, 2015 at 4:54 PM, Kevin Leduc wrote: > >> Hi All, >> >> It's official: we have a pageview API. You can read more about it on >> *Wikipedia's >> blog* >> http://blog.wikimedia.org

Re: [Analytics] How many times has a video been played?

2015-12-14 Thread Dan Andreescu
Pine, right now you can either query Hive if you have access to the cluster, or you can download the days you're interested in from here: http://dumps.wikimedia.org/other/mediacounts/daily/2015/ and crunch the numbers for the articles you're interested in (not too bad). On Mon, Dec 14, 2015 at 5:01 PM
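
A minimal sketch of the "download and crunch" route, in Python. Several things here are assumptions rather than facts from this thread: that the daily files are bzip2-compressed, tab-separated text, that the first column holds the media file's path, and that the column at index 2 holds the transfer count. Verify these against whatever format description accompanies the dumps. The function names are hypothetical. Note also that a transfer is not necessarily a complete playback.

```python
import bz2
from typing import Iterable


def count_transfers(lines: Iterable[str], media_path: str,
                    count_column: int = 2) -> int:
    """Sum the count column over all rows whose first field is media_path.

    Assumption: rows are tab-separated and count_column (0-based) holds
    an integer transfer count.
    """
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == media_path and len(fields) > count_column:
            total += int(fields[count_column])
    return total


def count_transfers_in_files(paths: Iterable[str], media_path: str,
                             count_column: int = 2) -> int:
    """Apply count_transfers to each downloaded daily file and add up the results."""
    total = 0
    for path in paths:
        # Assumption: each daily file is bzip2-compressed UTF-8 text.
        with bz2.open(path, "rt", encoding="utf-8") as handle:
            total += count_transfers(handle, media_path, count_column)
    return total


# Tiny in-memory example with made-up rows (not real data).
sample = [
    "/example/a/video.ogv\t1000\t3\n",
    "/example/b/other.ogv\t500\t1\n",
    "/example/a/video.ogv\t200\t2\n",
]
print(count_transfers(sample, "/example/a/video.ogv"))
```

For a 90-day window, `count_transfers_in_files` would be given the 90 downloaded daily files and the path of the video as it appears in the first column.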

Re: [Analytics] Announcing the pageview API

2015-12-14 Thread Dan Andreescu
\o/ :) On Mon, Dec 14, 2015 at 4:54 PM, Kevin Leduc wrote: > Hi All, > > It's official: we have a pageview API. You can read more about it on > *Wikipedia's > blog* > http://blog.wikimedia.org/2015/12/14/pageview-data-easily-accessible/ > > You can help us spread the word via > *Twitter* https

[Analytics] How many times has a video been played?

2015-12-14 Thread Pine W
Hi Analytics, How do I determine how many times this video has been played in the last 90 days? Thanks, Pine

[Analytics] Announcing the pageview API

2015-12-14 Thread Kevin Leduc
Hi All, It's official: we have a pageview API. You can read more about it on *Wikipedia's blog* http://blog.wikimedia.org/2015/12/14/pageview-data-easily-accessible/ You can help us spread the word via *Twitter* https://twitter.com/Wikipedia/status/676511422902218752 or *Facebook* https://www.fa
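
For readers who want to try the API, here is a hedged Python sketch that only builds a per-article request URL and makes no network call. The URL layout and the default values (`all-access`, `all-agents`, `daily`) are not given in this announcement; they reflect the REST API's published per-article route as best understood, so check them against the API documentation. The function name `per_article_url` is hypothetical.

```python
from urllib.parse import quote

# Assumption: base path of the per-article pageview endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project: str, article: str, start: str, end: str,
                    access: str = "all-access", agent: str = "all-agents",
                    granularity: str = "daily") -> str:
    """Build a request URL for one article's pageviews.

    start and end are date strings such as "20151201" (YYYYMMDD).
    Spaces in the title become underscores, then the title is
    percent-encoded so characters like "/" do not break the path.
    """
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, title,
                     granularity, start, end])


print(per_article_url("en.wikipedia", "Main Page", "20151201", "20151214"))
```

Passing `agent="spider"` or `agent="bot"` instead of the default would, under the same assumptions, restrict the counts to the agent types discussed elsewhere in this list.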

[Analytics] FYI: No Wikimedia Research Showcase for December

2015-12-14 Thread Aaron Halfaker
Hey folks, I'm emailing to let you know that we won't be holding a Wikimedia Research Showcase for this month due to the holidays. We'll kick it back in gear for January 2016 though. Watch this page for updates: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase Happy holidays! -Aaron

Re: [Analytics] mobile and zero legacy tsvs on stat1002

2015-12-14 Thread Andrew Otto
If we don’t hear any objections by Dec 30th, we will move forward with the plan to no longer generate this data. > On Dec 11, 2015, at 12:40, Andrew Otto wrote: > > Hi all, > > Soon, we will be merging the mobile web cache requests with the text cache > requests. text caches will now serve

Re: [Analytics] mobile and zero legacy tsvs on stat1002

2015-12-14 Thread Oliver Keyes
Gotcha! Long as it's set for every request, perfect :) On 14 December 2015 at 04:50, Joseph Allemandou wrote: > @Oliver: I think the closest we'll have is the access-method field, that can > take values desktop, mobile-web, mobile-app. > > On Sun, Dec 13, 2015 at 8:37 PM, Oliver Keyes wrote: >>

Re: [Analytics] Python client for the new pageview API

2015-12-14 Thread Dan Andreescu
I wasn't aware of some conventions that came before me, so I moved the project from milimetric/wmf to mediawiki-utilities/python-mwviews. I promise it'll stay there, sorry for the inconvenience. Updated links: PyPI: https://pypi.python.org/pypi/mwviews/0.0.2 code: https://github.com/mediawiki-ut

Re: [Analytics] Data collection

2015-12-14 Thread Federico Leva (Nemo)
Erik Zachte, 14/12/2015 14:14: I can run similar reports for earlier months. Thanks for publishing that code too! https://github.com/wikimedia/analytics-wikistats/tree/master/dammit.lt/bash Nemo

Re: [Analytics] Data collection

2015-12-14 Thread Erik Zachte
Hi Caitlin, Here is a breakdown of categories within Phytopathology on English wikipedia: http://ow.ly/VQNVL and the articles within those categories ranked by page view for Oct 2015 : http://ow.ly/VQNCv I can run similar reports for earlier months. Cheers, Erik From: Analyti

Re: [Analytics] Readership metrics for the fortnight until December 6, 2015

2015-12-14 Thread Federico Leva (Nemo)
Interesting country breakdown! Tilman Bayer, 14/12/2015 12:32: For the top three, I looked at how pageviews developed on a daily basis during the last three months, including the week after this large change (until Dec 6): In Greece, the +21.6% rise was the result of an isolated spike from Nove

[Analytics] Readership metrics for the fortnight until December 6, 2015

2015-12-14 Thread Tilman Bayer
Hi all, here is the usual look at our most important readership metrics. This time examining, among other things, pageview changes in various countries, such as the effect of a brief block in China and of the sudden popularity of a Greek expression derived from Latin. The Android has been seeing a

Re: [Analytics] mobile and zero legacy tsvs on stat1002

2015-12-14 Thread Joseph Allemandou
@Oliver: I think the closest we'll have is the access-method field, that can take values desktop, mobile-web, mobile-app. On Sun, Dec 13, 2015 at 8:37 PM, Oliver Keyes wrote: > Not an answer to the question, but a question of my own; will the > nature of the content being served still be present

Re: [Analytics] Data collection

2015-12-14 Thread Alex Druk
Hi Caitlin, If you have a list of relevant articles and know what time period you would like to research, contact me off the list and I can probably help you. Also, my advice: have a look at wikipediatrends.com or stats.grok.se and try some of your queries to get a better understanding of po