Re: [Analytics] WikiGrok and EventLogging

2015-01-08 Thread Ori Livneh
On Thu, Jan 8, 2015 at 2:52 PM, Ryan Kaldari rkald...@wikimedia.org wrote: After further discussion, we've decided to just show WikiGrok to a fraction of users during the test. I currently have it set to show WikiGrok to 10 out of every 62 users or ~16% (the userToken is a base 62 number).
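For readers following along, the 10-in-62 bucketing described above could be sketched roughly as below. This is only an illustration under assumptions: the thread does not say which digit of the userToken is inspected or how the base-62 digits are ordered, so the alphabet, the use of the last character, and the names `BASE62_ALPHABET`, `in_sample` and `sample_rate` are all hypothetical.

```python
import string

# Assumed digit ordering (0-9, A-Z, a-z); the real userToken alphabet
# is not specified in the thread.
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def in_sample(user_token: str, buckets: int = 10) -> bool:
    """Return True if the token's last base-62 digit is one of the first `buckets` values.

    Using the last character is an assumption made for this sketch.
    """
    if not user_token:
        return False
    value = BASE62_ALPHABET.find(user_token[-1])  # -1 if not a base-62 digit
    return 0 <= value < buckets


def sample_rate(buckets: int = 10) -> float:
    """Fraction of tokens selected when the inspected digit is uniformly distributed."""
    return buckets / len(BASE62_ALPHABET)


if __name__ == "__main__":
    print(f"expected share of users: {sample_rate():.1%}")
```

With 10 of the 62 possible digit values selected, the expected share is 10/62, which is where the ~16% figure comes from, provided the inspected digit is uniformly distributed across users.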

Re: [Analytics] WikiGrok and EventLogging

2015-01-08 Thread Nuria Ruiz
That should be fine. Please give us a heads-up when you deploy the instrumentation. On Thu, Jan 8, 2015 at 2:52 PM, Ryan Kaldari rkald...@wikimedia.org wrote: After further discussion, we've decided to just show WikiGrok to a fraction of users during the test. I currently have it set to show

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Leila Zia
Thanks everyone for chiming in. Your comments were very helpful. :-) Nuria, I checked the per-second pageview count for the pages WikiGrok will be live on, for 3 hours on 2015-01-07 (as a sample). We're talking about a total of ~170 events per sec for these pages. Of course major events can affect

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Aaron Halfaker
Leila, It might be worthwhile to merge that article set with the webrequest data we have in order to get a sense for how many pageloads/second to expect. -Aaron On Tue, Jan 6, 2015 at 7:50 PM, Ryan Kaldari rkald...@wikimedia.org wrote: The highest volume events we are going to log will be:

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Nuria Ruiz
Sorry, I sent it too soon, trying again: We're talking about a total of ~170 events per sec for these pages. This is too high to log at a 1:1 rate; we would need to do 1:10. At this time most events in EventLogging are logged at a much lower rate. Events over 1 per sec are the following, as you can see

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Ryan Kaldari
Thanks everyone for the research on this! I'll go ahead and create a card for implementing sampling on the high-throughput WikiGrok events. Kaldari On Wed, Jan 7, 2015 at 5:20 PM, Nuria Ruiz nu...@wikimedia.org wrote: Sorry, I sent it too soon, trying again: We're talking about a total of

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Dario Taraborelli
Agreed. Many of these articles will see spikes in traffic during the test (as the sample includes many celebrity articles), but the historical volume of traffic for the whole sample should give us a decent estimate of the throughput. I also wouldn’t worry about any events other than

Re: [Analytics] WikiGrok and EventLogging

2015-01-06 Thread Nuria Ruiz
(cc-ing mobile-tech) Since we do not know the details of how WikiGrok is used or its request throughput, we cannot estimate sampling ourselves. I imagine WikiGrok is being deployed to a number of users, and it is with that usage the mobile team could estimate the total throughput expected, with

[Analytics] WikiGrok and EventLogging

2015-01-06 Thread Leila Zia
Hi, The mobile team is planning to switch WikiGrok on for non-logged-in users next week (2015-01-12). The widget will be enabled on 166,029 article pages in enwiki. There are two EventLogging schemas that may collect data heavily, and we want to make sure EL can handle the influx of data. The two

Re: [Analytics] WikiGrok and EventLogging

2015-01-06 Thread Ryan Kaldari
I can elaborate on this after I finish the SWAT deployment. Gimme 30 minutes or so. On Tue, Jan 6, 2015 at 4:51 PM, Leila Zia le...@wikimedia.org wrote: Hi, The mobile team is planning to switch WikiGrok on for non-logged-in users next week (2015-01-12). The widget will be enabled on

Re: [Analytics] WikiGrok and EventLogging

2015-01-06 Thread Ryan Kaldari
The highest volume events we are going to log will be:
1. For each of the 166,000 articles, one event when the page loads
2. For each of the 166,000 articles, one event when the WikiGrok widget enters the viewport (about half as often as #1)
These will be active for all mobile users, logged in