[Analytics] Interpreting opt-out numbers

2014-06-09 Thread Gergo Tisza
Hi all, after rolling out MediaViewer to the English and German Wikipedias, we have gotten quite a few complaints; to understand how representative they are, I have looked at the number of users who have opted out (there is a user preference for that; it is linked from the MediaViewer interface, a

Re: [Analytics] Varnishkafka Delivery Errors

2014-06-09 Thread Sean Pringle
Hi! A bit more info: Being the earliest Opsen around I poked around on analytics1021 and 1022 (the brokers) and found a disk failure for /dev/sdf on analytics1021, along with corresponding java call stack in the log when the broker died due to the fs remounting as read-only. I unmounted the disk

Re: [Analytics] Varnishkafka Delivery Errors

2014-06-09 Thread Andrew Otto
We do care! I generally don’t check my email on weekends, and I didn’t get any SMS texts about this, h! This is a known occasional problem, see the last reply in this ticket: https://rt.wikimedia.org/Ticket/Display.html?id=6877 This problem is hard to reproduce. I am waiting for a pro

Re: [Analytics] Varnishkafka Delivery Errors

2014-06-09 Thread Andrew Otto
Thanks Sean! On Jun 9, 2014, at 3:56 AM, Sean Pringle wrote: > Hi! > > A bit more info: > > Being the earliest Opsen around I poked around on analytics1021 and 1022 (the > brokers) and found a disk failure for /dev/sdf on analytics1021, along with > corresponding java call stack in the lo

Re: [Analytics] Data quality issues with account creation log

2014-06-09 Thread Nuria Ruiz
>Just to narrow this down a little further from the DB server-side: the eventlogging tables do use utf-8, so the fix probably doesn't require laborious schema changes (if that's what you meant by changing database types). To follow the structure on mediawiki I think the easiest is to change db type

Re: [Analytics] Interpreting opt-out numbers

2014-06-09 Thread Aaron Halfaker
Hmm... It's hard to evaluate your strategy without more context. Why are you limiting your query to users with more than 10k lifetime edits? Are you trying to generate a proportion of a subset of users? If so, what's the denominator? Also, opt-out rates tend to be low no matter how obvious and

Re: [Analytics] [Multimedia] Interpreting opt-out numbers

2014-06-09 Thread Gergo Tisza
On Mon, Jun 9, 2014 at 11:20 AM, Aaron Halfaker wrote: > Hmm... It's hard to evaluate your strategy without more context. Why are > you limiting your query to users with more than 10k lifetime edits? Are > you trying to generate a proportion of a subset of users? If so, what's > the denominat

Re: [Analytics] [Multimedia] Interpreting opt-out numbers

2014-06-09 Thread Gergo Tisza
On Mon, Jun 9, 2014 at 11:20 AM, Aaron Halfaker wrote: > Also, opt-out rates tend to be low no matter how obvious and desired they > are. If the goal of this analysis is to find out if opt-out rates are high > (or low), then I'd recommend comparing them with opt-out rates for another > feature.

Re: [Analytics] [Multimedia] Interpreting opt-out numbers

2014-06-09 Thread Oliver Keyes
Re user counts; we have, I think, 1 editor who has 1M+ edits. I imagine we don't have many with 100K edits. How big are those user groups? It's useful to know that power users are more likely to opt out, great, but if you only have 30 users in your definition of 'power users' it's going to be throw

Re: [Analytics] [Multimedia] Interpreting opt-out numbers

2014-06-09 Thread Andrew Gray
I was just about to suggest this and, indeed, say that Oliver had done some similar work before :-) If you look at (# of users who have opted out) / (# of users who have set preferences) you may get a more meaningful result - you'll be screening out all the ones who don't know how to or aren't com
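
The ratio suggested above, (# of users who have opted out) / (# of users who have set preferences), can be sketched as follows. This is only an illustration: the function name and every count are hypothetical placeholders, not figures from the thread.

```python
def opt_out_rate(opted_out: int, denominator: int) -> float:
    """Return opted_out / denominator as a proportion between 0 and 1."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= opted_out <= denominator:
        raise ValueError("opted_out must be between 0 and denominator")
    return opted_out / denominator


# Hypothetical counts, for illustration only (not real data).
opted_out = 50            # users who disabled the feature
all_users = 10_000        # every user in the group being studied
changed_prefs = 500       # users who have changed any preference at all

# Same numerator, two denominators: the second screens out users who
# never touch their preferences.
rate_over_all = opt_out_rate(opted_out, all_users)
rate_over_pref_setters = opt_out_rate(opted_out, changed_prefs)

print(f"opt-out rate over all users:         {rate_over_all:.2%}")
print(f"opt-out rate over preference setters: {rate_over_pref_setters:.2%}")
```

The same numerator gives very different rates depending on the denominator, which is why the choice of denominator needs to be stated alongside any opt-out figure.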

[Analytics] Event Logging Handover

2014-06-09 Thread Toby Negrin
Hi All -- This email is a long time coming, but I wanted to confirm that the Analytics team is going to take over maintenance/development of Event Logging from Ori. Nuria will be coordinating the details of the transition (based on the plan we've all put together [1]). Please look for further emails fro

Re: [Analytics] Data quality issues with account creation log

2014-06-09 Thread Sean Pringle
On Tue, Jun 10, 2014 at 1:04 AM, Nuria Ruiz wrote: > >Just to narrow this down a little further from the DB server-side: the > eventlogging tables do use utf-8, so the fix probably doesn't require > laborious schema changes (if that's what you meant by changing database > types). > To follow the

Re: [Analytics] Data quality issues with account creation log

2014-06-09 Thread Ori Livneh
On Mon, Jun 9, 2014 at 8:00 PM, Sean Pringle wrote: > On Tue, Jun 10, 2014 at 1:04 AM, Nuria Ruiz wrote: > >> >Just to narrow this down a little further from the DB server-side: the >> eventlogging tables do use utf-8, so the fix probably doesn't require >> laborious schema changes (if that's wh

Re: [Analytics] Data quality issues with account creation log

2014-06-09 Thread Sean Pringle
On Tue, Jun 10, 2014 at 1:12 PM, Ori Livneh wrote: > > ...and back to utf8 as default charset > > The version of MySQLdb that is packaged for Precise does not know about > utf8mb4. I (inexcusably) tested against the dev branch of MySQLdb. > Bet that was a fun day :) Somewhat like to