Re: [Analytics] response to Magnus' post about page views

2014-02-20 Thread Ryan Kaldari
API would open up a world of new feature possibilities, many of which would be useful at promoting editor engagement. I'm sure you have a lot of other concerns to balance with such requests, but just wanted to throw in 2 cents from the mobile team. Cheers, Ryan Kaldari On Wed, Feb 19, 2014 a

[Analytics] pitching the Gender Edit Dashboard

2014-08-25 Thread Ryan Kaldari
ender-edit-dashboard Ryan Kaldari ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-25 Thread Ryan Kaldari
On Mon, Aug 25, 2014 at 11:41 AM, Steven Walling wrote: > > On Mon, Aug 25, 2014 at 11:05 AM, Ryan Kaldari > wrote: > >> There is nothing stopping us, however, from analysing *relative* trends >> using existing data. For example, we could generate graphs showing the >

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-25 Thread Ryan Kaldari
On Mon, Aug 25, 2014 at 4:54 PM, Steven Walling wrote: > > On Mon, Aug 25, 2014 at 12:21 PM, Ryan Kaldari > wrote: > >> You can get accurate information from bad or incomplete data. > > > The issue is not merely that data are incomplete like your tides example, >

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-25 Thread Ryan Kaldari
On Mon, Aug 25, 2014 at 7:08 PM, Steven Walling wrote: > > On Mon, Aug 25, 2014 at 6:10 PM, Ryan Kaldari > wrote: > >> Yes, it's biased, but do we have any reason to think that this bias has >> changed significantly over time? > > > Yes. For instance, the

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-25 Thread Ryan Kaldari
On Mon, Aug 25, 2014 at 6:22 PM, Dan Garry wrote: > Honestly, I disagree with pretty much everything you just said. Even if we > assume the bias has remained the same, we still don't understand how it > transforms the underlying data, and without that understanding any > conclusions you draw will

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-27 Thread Ryan Kaldari
On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia wrote: > 1. We look at the self-reported gender data and do some simple > observations. > Pros: >+ we will have an updated view of the gender gap problem. >+ we may spread seeds for further internal and/or external research > about it. > Cons: >

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-28 Thread Ryan Kaldari
et their gender > preference and there's nothing for us to do? Or do we then decide that it > is important for us to gather good data so that we can actually know what's > going on? > > -Aaron > > > On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari > wrote: >

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-28 Thread Ryan Kaldari
least help to control for overall changes in the rate, for example, due to the change in the interface that Steven mentioned. Kaldari On Aug 28, 2014, at 9:50 AM, Ryan Kaldari wrote: > We could restrict the query to only look at editors who had explicitly set > their gender preference

Re: [Analytics] pitching the Gender Edit Dashboard

2014-08-28 Thread Ryan Kaldari
for them. > > > > Given that it would be useful to have some data on gendered editing > patterns > > (whether we share it publicly or not), what are our options? > > > > - Jonathan > > > > > > On Thu, Aug 28, 2014 at 10:03 AM, Ryan Kaldari > >

Re: [Analytics] datasets.wikimedia.org

2014-09-12 Thread Ryan Kaldari
I created a documentation page for it on Wikitech: https://wikitech.wikimedia.org/wiki/Datasets.wikimedia.org Feel free to expand. On Tue, Aug 5, 2014 at 10:41 AM, Andrew Otto wrote: > Hi all! > > For a while now, we’ve been hosting some public datasets at > http://stat1001.wikimedia.org/public

Re: [Analytics] eventlogging largest tables

2014-09-30 Thread Ryan Kaldari
Maryana, would it be OK if we delete the MobileWebClickTracking records from before 2014? Would we still need those for any reason? On Tue, Sep 30, 2014 at 10:32 AM, Maryana Pinchuk wrote: > On Mon, Sep 29, 2014 at 3:10 PM, Dario Taraborelli < > dtarabore...@wikimedia.org> wrote: > >> On Sep 27,

Re: [Analytics] [WikimediaMobile] Reportcards

2014-10-15 Thread Ryan Kaldari
There are other dashboards too, like http://ee-dashboard.wmflabs.org/dashboards/enwiki-features# and http://ee-dashboard.wmflabs.org/dashboards/enwiki-metrics# On Wed, Oct 15, 2014 at 2:46 PM, Dan Andreescu wrote: > Just curious... I just discovered http://reportcard.wmflabs.org/ which >> someho

Re: [Analytics] [WikimediaMobile] Reportcards

2014-10-15 Thread Ryan Kaldari
There's a nice directory of the various dashboards and stats pages at https://meta.wikimedia.org/wiki/Statistics.

Re: [Analytics] Writing EventLogging events to database failed on 2014-11-18 for ~50 minutes between 14:14 and 15:02

2014-11-18 Thread Ryan Kaldari
It looks like the same problem is happening now. No new events have been written to the log tables on analytics-store for about the past hour and a half. And it looks like the slave db stopped replicating about 6 hours ago. Ryan Kaldari On Tue, Nov 18, 2014 at 9:31 AM, Christian Aistleitner

Re: [Analytics] EventLogging and Adblock on Linux/Firefox

2014-12-11 Thread Ryan Kaldari
If Adblock is blocking EventLogging requests, this could have a serious effect on the accuracy of EventLogging data. Adblock has 10M+ users. It would be good to troubleshoot this and find out exactly what Adblock settings trigger this and escalate the issue if needed. Kaldari On Thu, Dec 11, 2014

Re: [Analytics] EventLogging and Adblock on Linux/Firefox

2014-12-11 Thread Ryan Kaldari
Leila, can you give us more details about your Adblock settings and what filter subscription(s) you have set up? On Thu, Dec 11, 2014 at 3:00 PM, Leila Zia wrote: > > On my machine, Linux/Chrome works fine with Adblock 2.14.4 and Chrome > 35.0.1914.114 > > On Thu, Dec 11, 2014 at 2:57 PM, Toby Ne

Re: [Analytics] EventLogging data QA

2014-12-15 Thread Ryan Kaldari
I filed a bug about the difficulty of debugging Schema failure back in November, but no one ever responded to it: https://phabricator.wikimedia.org/T75678 On Mon, Dec 15, 2014 at 10:06 AM, Toby Negrin wrote: > > I share Christian's concerns - > > Dario/Leila - can you comment based on your recen

Re: [Analytics] EventLogging data QA

2014-12-16 Thread Ryan Kaldari
I added a comment to the ticket requesting a simple error log for validation errors. I think that would solve about 50% of the problem and should be easy to implement. Kaldari On Mon, Dec 15, 2014 at 5:58 PM, Leila Zia wrote: > > > > On Monday, December 15, 2014, Kevin Leduc wrote: > >> >> I'd

Re: [Analytics] WikiGrok and EventLogging

2015-01-06 Thread Ryan Kaldari
I can elaborate on this after I've finished the SWAT deployment. Gimme 30 minutes or so. On Tue, Jan 6, 2015 at 4:51 PM, Leila Zia wrote: > Hi, > > The mobile team is planning to switch WikiGrok on for non-logged in > users next week (2014-01-12). The widget will be on on 166,029 article > pag

Re: [Analytics] WikiGrok and EventLogging

2015-01-06 Thread Ryan Kaldari
number of users and it is with that usage the mobile > team could estimate the total throughput expected, with this throughput we > can recommend sampling ratios. > > > Thanks for asking about this before deploying! > > > On Tue, Jan 6, 2015 at 4:55 PM, Ryan Kalda

Re: [Analytics] Only parts of EventLogging events getting written to the database since 2015-01-07 ~1:55

2015-01-07 Thread Ryan Kaldari
Who is actually maintaining the EventLogging Extension now? As far as I can tell, none of the members of the Analytics-EventLogging project in Phabricator are developers. This makes it hard to know who to ping when there is a problem. For example, this EL bug that I filed a month ago was never tria

[Analytics] Beta Labs EventLogging logs

2015-01-07 Thread Ryan Kaldari
It seems the EventLogging logs have disappeared from /var/log/upstart/ on Beta Labs (deployment-bastion). Does anyone know where they are now? Kaldari

Re: [Analytics] Beta Labs EventLogging logs

2015-01-07 Thread Ryan Kaldari
e-events.log > eventlogging_processor-server-side-events.log > > On Wed, Jan 7, 2015 at 12:57 PM, Ryan Kaldari > wrote: > >> It seems the EventLogging logs have disappeared from /var/log/upstart/ on >> Beta Labs (deployment-bastion). Does anyone

Re: [Analytics] WikiGrok and EventLogging

2015-01-07 Thread Ryan Kaldari
270 events per sec) I would think that sending 10 events per sec on your >>>> case would be pretty safe. That would be sampling about 1/200 for a load >>>> event per every pageview. This seems like a good upper bound. >>>> >>>> Now, since there are
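The sampling arithmetic quoted in this snippet can be sketched in a few lines. This is only an illustration, not the Analytics team's tooling: the function name is hypothetical, and the figure of roughly 2000 pageviews per second is an assumption inferred from the quoted numbers (10 events per second at about 1/200 sampling), not a value stated in the thread.

```python
import math


def sampling_denominator(source_rate: float, target_rate: float) -> int:
    """Return N such that keeping 1 event in N brings source_rate
    down to at most target_rate (both in events per second)."""
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("rates must be positive")
    # Round up so the sampled rate never exceeds the target.
    return max(1, math.ceil(source_rate / target_rate))


# Assumed input: ~2000 pageviews/sec (hypothetical, implied by the quote).
# Target: 10 events/sec, the rate described as "pretty safe".
n = sampling_denominator(2000, 10)
print(f"sample 1 in {n}")
```

Rounding up is a deliberate choice here: it errs on the side of sending fewer events rather than exceeding the target throughput.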

Re: [Analytics] WikiGrok and EventLogging

2015-01-08 Thread Ryan Kaldari
where around 60 events per second in that case. Would that be acceptable or should we sample the widget-impression event as well? Kaldari On Wed, Jan 7, 2015 at 5:33 PM, Leila Zia wrote: > Thanks, Nuria! > > On Wed, Jan 7, 2015 at 5:30 PM, Ryan Kaldari > wrote: > >> Thanks ev

Re: [Analytics] WikiGrok and EventLogging

2015-01-08 Thread Ryan Kaldari
> > See that now we go beyond 300 events per sec here and there: > http://ibin.co/1nTsNYc1bekd > > I recommend sampling those events 1:10. > > Thanks, > > > Nuria > > On Thu, Jan 8, 2015 at 12:06 PM, Ryan Kaldari > wrote: > >> After talking with Dario a

Re: [Analytics] Beta Labs EventLogging logs

2015-01-09 Thread Ryan Kaldari
ases where server-side EventLogging was failing on en.wiki, but working on Beta Labs and locally. It would also be useful for catching obscure failures that only happen for edge cases. Kaldari On Wed, Jan 7, 2015 at 1:27 PM, Ryan Kaldari wrote: > Ah, sorry, I was looking on the wrong serve

Re: [Analytics] Beta Labs EventLogging logs

2015-01-11 Thread Ryan Kaldari
n EventLogging error log for the cluster. Kaldari On Fri, Jan 9, 2015 at 5:49 PM, Ryan Kaldari wrote: > >> Looks like I've lost permission to view those logs on Beta Labs again. >> Any chance you could fix them? Also, was any progress ever made on piping >> the live clu

Re: [Analytics] Beta Labs EventLogging logs

2015-01-13 Thread Ryan Kaldari
On Tue, Jan 13, 2015 at 2:50 PM, Kevin Leduc wrote: > Access permissions > What's the machine and the log you're trying to access (please be > specific, I am not a developer so assume I know very little). I'll pass > this on to Ops so they can have a look at why permissions changed. This > shoul

Re: [Analytics] Virtual file view hack for Media Viewer views

2015-02-05 Thread Ryan Kaldari
I have to admit that I haven't read all of this rather lengthy thread, but why wouldn't we just track this with EventLogging? That would avoid all the pitfalls of other possible solutions: dealing with caching, creating bogus extra file requests, etc. On Thu, Feb 5, 2015 at 8:51 AM, Toby Negrin w

[Analytics] Data on Wikidata references coverage

2015-03-19 Thread Ryan Kaldari
Hi, I'm looking to generate some data around what percentage of claims in Wikidata have references. What's the best way for me to get this data? As a bonus, I would like to find out what percentage of claims in Wikidata have references other than "XX Wikipedia". Thanks, Kaldari

Re: [Analytics] noccokie tag on X-analytics

2015-10-21 Thread Ryan Kaldari
I was under the impression that most of the MediaWiki bot frameworks do accept cookies, but I imagine many of the home-made bots don't. Seems like using an unknown useragent string might be a better proxy. On Wed, Oct 21, 2015 at 1:49 PM, Nuria Ruiz wrote: > >What was the motivation for this cha

Re: [Analytics] On toxic communities

2015-11-13 Thread Ryan Kaldari
I was skeptical of even reading this article, but it actually seems pretty insightful. It also seems more relevant to Wikipedia than I was expecting: "The answer had to be community-wide reform of cultural norms. We had to change how people thought about online society and change their expectations

Re: [Analytics] top articles script

2016-01-22 Thread Ryan Kaldari
Any idea why the most popular article in India is "-"? CCing Dan Garry of Discovery team. On Fri, Jan 22, 2016 at 5:13 PM, Tilman Bayer wrote: > Below is an example Hive query yielding the 50 most viewed pages in > India during December 2015. It took less than 10 minutes of wall clock > time to

Re: [Analytics] WikimediaBot convention

2016-01-27 Thread Ryan Kaldari
Yeah, I don't see anything at [[Manual:Bots]] that mentions a user-agent convention for bots that edit Wikipedia. As far as I know, there isn't any. Most editing bots either use a user-agent string set by the framework they are using or have a completely unique user-agent string. It seems like it w

[Analytics] looking for stats on edit conflicts

2016-02-10 Thread Ryan Kaldari
The Community Tech team is trying to find out stats about edit conflicts. It looks like there was a patch merged back in January to collect stats on this (https://gerrit.wikimedia.org/r/#/c/266760/2/includes/EditPage.php) but I can't figure out where this is actually collecting the stats. It loo

Re: [Analytics] looking for stats on edit conflicts

2016-02-10 Thread Ryan Kaldari
On Wed, Feb 10, 2016 at 12:08 PM, Dan Andreescu wrote: > What you're looking at now is the percent of edits that ended in an edit > conflict since last April. > So when it says the average edit conflict rate for VisualEditor on 2015-10-07 was "0.01", does that mean 1% or 0.01%? I'm guessing 1%,

Re: [Analytics] Demographics survey

2016-04-14 Thread Ryan Kaldari
The discussion on Jimbo's talk page centers around the fact that the demographic information at https://en.wikipedia.org/wiki/Wikipedia#Diversity is from 2008. Even if survey results are always biased, surely it would be better to have biased data from 2016 than biased data from 2008. On Thu, Apr

Re: [Analytics] Demographics survey

2016-04-14 Thread Ryan Kaldari
This request is not about English Wikipedia; it's about demographics for Wikipedia editors in general. Apparently, the last time we did a cross-wiki demographic survey was 2008. Let me know if that isn't correct. The specific questions that people would like answers to are: * What is the gender of

Re: [Analytics] Demographics survey

2016-04-14 Thread Ryan Kaldari
egular English Wikipedia editors?"* > > Either way, I have no objections. Just opinions :) > > Jonathan > > On Thu, Apr 14, 2016 at 9:00 AM, Ryan Kaldari > wrote: > >> This request is not about English Wikipedia; it's about demographics for >> Wikipedia editors i

[Analytics] The WikiLove research project

2016-05-19 Thread Ryan Kaldari
The folks on Meta are considering whether or not to enable WikiLove and they were hoping to find some data about it. There is a research project on Meta about WikiLove (https://meta.wikimedia.org/wiki/Research:WikiLove), but it seems to have been "in progress" since 2011. Could someone in Analytics

Re: [Analytics] Pagecount Datasets to be Deprecated at the end of May

2016-05-26 Thread Ryan Kaldari
Hey Dan, thanks for the reminder! I'm worried there are a lot of community and GLAM tools that rely on these datasets and are not yet transitioned to the new data sources (like WikiProject popular pages). Is there any chance we could get a 1 month reprieve

Re: [Analytics] Pagecount Datasets to be Deprecated at the end of May

2016-05-27 Thread Ryan Kaldari
> matter of changing the url you download from, the rest is meant to be > compatible and we can help if you have questions. > > From: Ryan Kaldari > Sent: Thursday, May 26, 2016 17:27 > To: A mailing list for the Analytics Team at WMF and everybody who has an > interest in W

Re: [Analytics] Pageview analysis graphs not loading

2016-06-23 Thread Ryan Kaldari
Musikanimal fixed it. > On Jun 23, 2016, at 1:09 AM, Pine W wrote: > > Thanks Toby. > > Pine > >> On Jun 22, 2016 16:02, "Toby Negrin" wrote: >> ok -- I'm getting a spinning wheel of doom where the graphs used to be. I >> suspect there's something amiss with the underlying service. >> >> h

Re: [Analytics] Pagecount Datasets to be Deprecated at the end of May

2016-07-01 Thread Ryan Kaldari
> rush getting back to us. > > On Fri, May 27, 2016 at 12:45 PM, Ryan Kaldari > wrote: > >> Cool. WikiProject Popular Pages is fixed now, BTW. We'll try to make sure >> everyone is switched over ASAP. Thanks for the extra time! >> >> >> On May 26, 201

Re: [Analytics] browser dashboards again!

2016-08-30 Thread Ryan Kaldari
Very cool! At first I was confused by Ubuntu being the 3rd most popular operating system.[1] But then I realized it was actually iOS, which for some reason is missing from the key. 1. https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-os On Tue, Aug 30, 2016 at 12:00 PM, Toby Negri

Re: [Analytics] Tool to identify the articles versions across Wikipedia language editions?

2016-10-31 Thread Ryan Kaldari
You can do such queries through the Wikidata Query Service (https://query.wikidata.org/). For example, if you wanted to get a list of 100 paintings by women that have articles in both the French and English Wikipedias, you would do something like: SELECT DISTINCT ?painting ?paintingLabel ?artist

[Analytics] fishy browser stats

2017-07-21 Thread Ryan Kaldari
According to... https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser/browser-family-and-major-hierarchical-view ... IE7 accounts for 2.5% of all pageviews in the last month. According to... https://analytics.wikimedia.org/dashboards/browsers/#desktop-site-by-browser/browser-fa

[Analytics] abnormal traffic to https://en.wikivoyage.org/wiki/Zimbabwe

2018-02-13 Thread Ryan Kaldari
Since the beginning of February, English Wikivoyage has seen its daily pageviews double: http://tools.wmflabs.org/siteviews/?platform=all-access&source=pageviews&agent=user&range=latest-20&sites=en.wikivoyage.org This seems to be caused by a sustained spike in desktop-only views to the Zimbabwe p

[Analytics] need metric definition clarification

2019-03-27 Thread Ryan Kaldari
I need clarification about whether or not this graph reflects deletions. It seems to be unclear: https://meta.wikimedia.org/w/index.php?title=Research%3AWikistats_metrics%2FNew_pag

Re: [Analytics] need metric definition clarification

2019-03-27 Thread Ryan Kaldari
ow 3. The reason I was > confused and the code does not show an explicit filter for deleted pages is > because we exclude deleted pages from the entire dataset right now. > > On Wed, Mar 27, 2019 at 12:41 PM Ryan Kaldari > wrote: > >> I need clarification

Re: [Analytics] Wikipedia Early Page View Data Set Inquiry

2020-01-16 Thread Ryan Kaldari
Note that the definition of pageviews has changed several times over the years. Only the data from 2015 to present is strictly comparable. I'm sure some data analysts will chime in with more details. Good luck with your project! > On Jan 15, 2020, at 6:59 PM, Emily Chen wrote: > > Hi, > > My

Re: [Analytics] Community health metrics kit: Input needed!

2020-02-20 Thread Ryan Kaldari
Hey Joe, whatever happened with this? Is it still being worked on? On Fri, Oct 5, 2018 at 3:29 PM Joe Sutherland wrote: > Hello everyone - apologies for cross-posting! *TL;DR*: We would like your > feedback on our Metrics Kit project. Please have a look and comment on > Meta-Wiki: > https://meta

Re: [Analytics] Community health metrics kit: Input needed!

2020-02-25 Thread Ryan Kaldari
Anyone else know anything about the fate of the community health metrics kit? The wiki page <https://meta.wikimedia.org/wiki/Community_health_initiative/Metrics_kit> still says it's expected to be launched in 2019. On Thu, Feb 20, 2020 at 1:19 PM Ryan Kaldari wrote: > Hey J

Re: [Analytics] Community health metrics kit: Input needed!

2020-02-27 Thread Ryan Kaldari
A quick update: I talked to Patrick Earley and he informed me that the Community Health Metrics Kit project was put on ice last year due to lack of resources. On Wed, Feb 26, 2020 at 4:57 AM Cristian Consonni wrote: > Hi, > > On 25/2/20 18:35, Dan Andreescu wrote: > > In my last meeting with Joe

[Analytics] EventLogging blocked by ad blockers

2020-09-22 Thread Ryan Kaldari
say that ad blockers should not be blocking EventLogging (since it's just an internal logging system)? 2. If the answer to #1 is "yes", could we change the URL that EventLogging uses so that it is no longer blacklisted by ad blockers? -- *Ryan Kaldari* (they/them) Dire

Re: [Analytics] EventLogging blocked by ad blockers

2020-09-22 Thread Ryan Kaldari
> > In most instances what we look when deriving insights are ratios. For > example: "of the people that saw the red link how many clicked it". In this > scenario, with an adequate sample sizes, insights can be extracted without > any issues. > Yes, I agree that in most cases this doesn't signific
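The quoted point about ratios can be illustrated with a small sketch. It assumes, and this is the key caveat, that an ad blocker drops events independently of what the user did, so impressions and clicks are lost in the same proportion. All counts and the 90% kept fraction below are made up for illustration; they are not measurements from the thread.

```python
from fractions import Fraction


def click_through(impressions: int, clicks: int) -> Fraction:
    """Exact ratio of clicks to impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return Fraction(clicks, impressions)


def apply_uniform_loss(count: int, kept_fraction: Fraction) -> Fraction:
    """Expected number of events that survive when each event is kept
    with the same probability, regardless of its type."""
    return count * kept_fraction


# Hypothetical numbers: 10,000 impressions, 500 clicks, 10% of events blocked.
true_ratio = click_through(10_000, 500)
kept = Fraction(9, 10)
observed_ratio = apply_uniform_loss(500, kept) / apply_uniform_loss(10_000, kept)

# Under uniform loss the kept fraction cancels out of the ratio.
print(true_ratio == observed_ratio)
```

If users with ad blockers behave differently from other users, the loss is no longer uniform and the observed ratio can drift away from the true one, and absolute counts are undercounted either way.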