Re: [Analytics] pageviews_hourly table

2015-08-23 Thread Oliver Keyes
Indeed. For transparency, Joseph, Andrew and myself had a meeting late last week to talk about how we handle these issues. The resolution was to go for positive, as well as negative, checking, probably using Christian's "guard" framework. So, for example, suppose we want to make sure projects are

Re: [Analytics] pageviews_hourly table

2015-08-23 Thread Federico Leva (Nemo)
Tilman Bayer, 22/08/2015 19:33: And I know that other issues were caught by ErikZ's proactive vigilance, which will need to find an equivalent in the upcoming replacement for Wikistats. +1 Nemo

Re: [Analytics] pageviews_hourly table

2015-08-22 Thread Tilman Bayer
To add a bit: First, regarding the initial technical discussion about the pageview definition used for pageview_hourly: It now seems that apart from Outreach wiki, it also differs from the earlier Cube v0.5 data in its inclusion of mediawiki.org and wikimediafoundation.org, see https://

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Kevin Leduc
Tilman, to answer your question, the presentation of analytics at Monthly Metrics Meetings will change month to month. Next month I am on vacation so I have asked Jon to present something. I'm assuming it will have Pageviews and be readership focused - it's up to Jon. On Mon, Aug 17, 2015 at 4:

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
This seems perfect. Is it currently used?
On 17 August 2015 at 18:03, Andrew Otto wrote:
> BTW, Christian foresaw this issue and wrote this:
> https://github.com/wikimedia/analytics-refinery-source/tree/master/guard
>
> It should be useable for pageviews too, I think. For this issue, a guard

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Andrew Otto
BTW, Christian foresaw this issue and wrote this: https://github.com/wikimedia/analytics-refinery-source/tree/master/guard It should be useable for pageviews too, I think. For this issue, a guard that made sure that outreach.wikimedia.org never appeared would have raised an error. > On Aug 17
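The guard idea discussed in this thread (a negative check that an excluded host such as outreach.wikimedia.org never appears, plus a positive check that expected hosts do appear) might be sketched roughly as follows. This is only an illustration in Python and not the API of Christian's guard framework; the function name `check_hosts` and the two host sets are hypothetical and do not reflect the real pageview definition.

```python
from typing import Iterable, List, Set

# Hypothetical host lists for illustration only; NOT the real pageview definition.
EXCLUDED_HOSTS: Set[str] = {"outreach.wikimedia.org", "donate.wikimedia.org"}
REQUIRED_HOSTS: Set[str] = {"en.wikipedia.org"}


def check_hosts(hosts: Iterable[str]) -> List[str]:
    """Return human-readable violations; an empty list means all checks pass."""
    seen = set(hosts)
    violations = []
    # Negative check: hosts that must never appear in the aggregated output.
    for host in sorted(seen & EXCLUDED_HOSTS):
        violations.append(f"unexpected host present: {host}")
    # Positive check: hosts that must always appear in the aggregated output.
    for host in sorted(REQUIRED_HOSTS - seen):
        violations.append(f"expected host missing: {host}")
    return violations


if __name__ == "__main__":
    # Example: an hour of data that wrongly includes the outreach wiki.
    for problem in check_hosts(["en.wikipedia.org", "outreach.wikimedia.org"]):
        print(problem)
```

A check like this would presumably run against each hour of output and fail the job (or alert someone) when the returned list is non-empty.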

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Adam Baso
> Yeah, I wasn't talking about review in the sense of using it, I was talking about review in the sense of actively looking for issues.
Makes sense. Thanks!

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
On 17 August 2015 at 16:20, Adam Baso wrote:
>> Whose job is it to review pageviews and update the definition when issues are found?
>
> I see the thread evolved a bit today. But I'll note this for people going through the archives:
>
> There seem to be a few levels of review of pageviews.

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Adam Baso
> Whose job is it to review pageviews and update the definition when issues are found?
I see the thread evolved a bit today. But I'll note this for people going through the archives: There seem to be a few levels of review of pageviews. There's been stuff for the monthly metrics meetings (e.

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
On 17 August 2015 at 13:48, Joseph Allemandou wrote:
> Hey Oliver,
>
> The analytics team is responsible for the pageview definition.
> When finding issues, sending an email to the analytics mailing list is the right thing to do :)
>
Indeed; my point is not about issues reported upstream. My po

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Joseph Allemandou
Hey Oliver, The analytics team is responsible for the pageview definition. When finding issues, sending an email to the analytics mailing list is the right thing to do :) On our end, we could surely do a better job to communicate changes in the pageview definition code for anybody interested to r

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
You should also note that donate-wiki pageviews are making it into the counts (again, the definition was designed to exclude these). Whose job is it to review pageviews and update the definition when issues are found? On 17 August 2015 at 10:32, Oliver Keyes wrote: > Just to clarify; there is no

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
Just to clarify; there is no need to ask me before making changes (obviously I find my approval for pageviews changes being sought incredibly flattering, but I am not the only person involved in this project ;p). What I'm more driving towards is directly informing customers when the definition is a

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Oliver Keyes
Excellent; thank you. On 17 August 2015 at 04:42, Joseph Allemandou wrote: > Oliver, > > It was a mistake from me to add the 'outreach' subdomain without asking you. > > From a documentation perspective, the analytics team uses that place to > document changes: > https://wikitech.wikimedia.org/wi

Re: [Analytics] pageviews_hourly table

2015-08-17 Thread Joseph Allemandou
Oliver, It was a mistake from me to add the 'outreach' subdomain without asking you. From a documentation perspective, the analytics team uses that place to document changes: https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest and I didn't know about the up-to-date documentation you sent.

Re: [Analytics] pageviews_hourly table

2015-08-16 Thread Oliver Keyes
Ah, I see the problem; someone patched it and never documented it. We have documentation at https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters of the generalised filters. There is also a log, on https://meta.wikimedia.org/wiki/Research:Page_view, of changes to the pageview defi

Re: [Analytics] pageviews_hourly table

2015-08-16 Thread Madhumitha Viswanathan
The new one. The code that generates it:
- https://github.com/wikimedia/analytics-refinery/blob/master/hive/pageview/hourly/create_pageview_hourly_table.hql
- https://github.com/wikimedia/analytics-refinery/tree/master/oozie/pageview/hourly
On Sun, Aug 16, 2015 at 11:01 AM, Oliver Keyes wrot

[Analytics] pageviews_hourly table

2015-08-16 Thread Oliver Keyes
Is the pageviews_hourly table meant to contain pageviews according to the new or old definition? If old, where can I find aggregates for the new one? -- Oliver Keyes Count Logula Wikimedia Foundation