Re: [Analytics] Pivot is now Turnilo!

2018-05-23 Thread Toby Negrin
Thanks Andrew! This data is super useful. On Wed, May 23, 2018 at 07:28 Andrew Otto wrote: > :) > > This is done! pivot.wikimedia.org now redirects to turnilo.wikimedia.org. > > https://phabricator.wikimedia.org/T194427 > > On Mon, May 21, 2018 at 2:36 PM, Jon Katz

Re: [Analytics] Migrated Reportcard with Updated Data

2017-04-07 Thread Toby Negrin
Congrats Nuria and team! This looks great and I'm super excited for Wikistats 2.0. -Toby On Fri, Apr 7, 2017 at 11:30 AM, Nuria Ruiz wrote: > Hello! > > The Analytics team would like to announce that we have migrated the > reportcard to a new domain: > >

Re: [Analytics] Readership metrics for the timespan until December 4, 2016

2016-12-14 Thread Toby Negrin
Thank you so much for this Zareen! It's really great to see this report -- so much interesting data to think about! -Toby On Mon, Dec 12, 2016 at 5:45 PM, Zareen Farooqui wrote: > Hi all, > > This resumes the usual look >

Re: [Analytics] [Wikidata] SPARQL power users and developers

2016-10-04 Thread Toby Negrin
We already track use of the action API. Combine with this? https://www.mediawiki.org/wiki/Wikimedia_Reading_Infrastructure_team/Action_API_request_analytics -Toby On Tue, Oct 4, 2016 at 7:56 AM, Nuria Ruiz wrote: > mmm...There are several things here that are already

Re: [Analytics] Upcoming reboots of stat and Hadoop hosts due to Kernel upgrades

2016-09-21 Thread Toby Negrin
engineering is probably a better list -- I think hadoop is sufficiently mainstream now :) On Wed, Sep 21, 2016 at 10:05 AM, Dan Andreescu wrote: > + research > > (btw, if we cc analytics and research does that reach everyone using these > boxes? Like discovery,

[Analytics] browser dashboards again!

2016-08-30 Thread Toby Negrin
The browser dashboards made Ben Evans[1] _and_ Product Hunt[2] :) Congrats! -Toby [1] http://us6.campaign-archive2.com/?u=b98e2de85f03865f1d38de74f=73f838f55a [2] https://www.producthunt.com/tech/data-dashboard-by-wikimedia-foundation ___ Analytics

Re: [Analytics] [Pageview API] Data Retention Question

2016-07-29 Thread Toby Negrin
Just curious -- how much would it cost to make all of the data available at a daily granularity for a year? On Fri, Jul 29, 2016 at 4:30 PM, Jonathan Morgan wrote: > Hi Dan, > > Making dumps much easier to use would definitely help. We Wikipedia > researchers are kind of

Re: [Analytics] [Pageview API] Data Retention Question

2016-07-29 Thread Toby Negrin
My personal use cases which are primarily using the visualization tools would appreciate more dimensionality in daily and weekly views (which increases storage). I think you should definitely degrade the resolution, possibly more aggressively than you propose. RRDTool has been doing this for
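The idea of degrading resolution as data ages can be sketched as follows. This is only an illustration under stated assumptions, not the Pageview API's retention scheme and not RRDTool's implementation: the function name `degrade_resolution`, the 30-day window, and the 7-day bucket are all hypothetical, and averaging is just one possible way to consolidate old points.

```python
def degrade_resolution(daily, keep_recent=30, bucket=7):
    """Keep the newest `keep_recent` daily values as-is and average older ones.

    `daily` is a list of numbers ordered oldest first (hypothetical input shape).
    Returns (coarse, recent): `coarse` holds one average per `bucket` old days,
    `recent` holds the untouched full-resolution tail.
    """
    split = max(len(daily) - keep_recent, 0)
    old, recent = daily[:split], daily[split:]
    coarse = []
    for start in range(0, len(old), bucket):
        chunk = old[start:start + bucket]
        # Averaging is an assumption; a sum, min, or max would work the same way.
        coarse.append(sum(chunk) / len(chunk))
    return coarse, recent
```

Storage then grows with `keep_recent` plus one value per bucket of older data, which is the trade-off the message describes: more room for extra dimensions in recent views, at the cost of coarser history.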

Re: [Analytics] Issues with clickstream data

2016-07-08 Thread Toby Negrin
Another approach we discussed back in the day was setting up a canary script to send known good messages whose delivery is monitored. This might be a bit easier to set up. It's been effective on other systems I've worked on; also a good way to measure delivery latency. -Toby On Friday, July 8,
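A minimal sketch of the canary approach described above, assuming an in-process tracker. The class `CanaryMonitor` and its methods are hypothetical names for illustration; nothing here reflects how EventLogging or the clickstream pipeline is actually monitored, and a real canary would send its message through the pipeline under test.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class CanaryMonitor:
    """Tracks canary messages from send to delivery (hypothetical helper)."""
    timeout_s: float = 60.0
    pending: dict = field(default_factory=dict)    # canary id -> send time
    latencies: list = field(default_factory=list)  # seconds, delivered canaries

    def send(self, now=None):
        """Register a new canary and return its id; the caller emits the message."""
        canary_id = str(uuid.uuid4())
        self.pending[canary_id] = time.time() if now is None else now
        return canary_id

    def delivered(self, canary_id, now=None):
        """Record delivery and return the latency, or None for an unknown id."""
        sent = self.pending.pop(canary_id, None)
        if sent is None:
            return None
        latency = (time.time() if now is None else now) - sent
        self.latencies.append(latency)
        return latency

    def overdue(self, now=None):
        """Return ids of canaries still undelivered after the timeout."""
        now = time.time() if now is None else now
        return [cid for cid, sent in self.pending.items()
                if now - sent > self.timeout_s]
```

An alert would fire whenever `overdue()` returns anything, and the `latencies` list gives the delivery-latency measurement mentioned in the message.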

Re: [Analytics] Pageview analysis graphs not loading

2016-06-22 Thread Toby Negrin
Turn off your ad-blocker and it should work. At least this solved my issues. -Toby On Wed, Jun 22, 2016 at 4:52 PM, Pine W wrote: > Hi folks, > > I can't get pageview analysis graphs to load on 2 wikis that I've tested, > and I've tried desktop and mobile on multiple

Re: [Analytics] Spotify Kafka -> Google Pub/Sub article

2016-03-07 Thread Toby Negrin
It's the same order of magnitude. It looks like their primary problems were with Mirror Maker; everything else seemed to work. Also, I don't know why they use google to queue but ingest the data back into their own datacenters. Why not keep it in Google and use big query? -Toby On Mon, Mar 7,

Re: [Analytics] WikimediaBot convention

2016-01-28 Thread Toby Negrin
On Thu, Jan 28, 2016 at 4:27 AM, Marcel Ruiz Forns wrote: > I wonder if Marcel means "crawlers". > > Toby, do you mean when referring to spiders? Yes, I think they are > equivalent terms. Do you think we should change the naming there? > Hi Marcel -- Here's some

Re: [Analytics] WikimediaBot convention

2016-01-27 Thread Toby Negrin
I wonder if Marcel means "crawlers". On Wednesday, January 27, 2016, Bryan Davis wrote: > On Wed, Jan 27, 2016 at 5:15 PM, Marcel Ruiz Forns > wrote: > > (*) There is already another convention[2] for bots that EDIT Wikimedia > >

Re: [Analytics] Edits per month for Mathematics articles

2016-01-07 Thread Toby Negrin
Hi Paul -- Quarry[1] is a good place to start for this kind of data. There are other people on this list who have a much better understanding of the database schemas but there's enough documentation to definitely get you started. -Toby [1] https://meta.wikimedia.org/wiki/Research:Quarry On Wed,

Re: [Analytics] Page view API questions regarding user agent

2015-12-22 Thread Toby Negrin
Thanks for the information Oliver. Hi John -- I just wanted to point out in a friendly way that your original email would have been just as effective if you had omitted the last line about a waste of effort to build. We always like to get feedback and questions from the community but the

Re: [Analytics] Wikistats upgraded to new page view definition

2015-12-03 Thread Toby Negrin
Hi Erik -- This is fantastic work. It's great to see the familiar tools updated with the new definitions. Thank you for this and everything you've done for the movement. -Toby On Thu, Dec 3, 2015 at 2:09 PM, Erik Zachte wrote: > Hi all, > > > > I just released a major

Re: [Analytics] Commons Alexa rank drop in May 2015

2015-11-18 Thread Toby Negrin
In general, we struggle with these external sources of information. (And this extends to the industry in general). The Research team expended a lot of energy to correct the way another service interpreted our traffic and I believe this feedback was not incorporated. We are public with our page

Re: [Analytics] Transitioning wikistats pageview reports to use new pageview definition

2015-11-10 Thread Toby Negrin
Congrats all -- this is a big achievement. I'm looking forward to the rest of the reports. -Toby On Tue, Nov 10, 2015 at 1:59 PM, Nuria Ruiz wrote: > Hello! > > The analytics team wishes to announce that we have finally transitioned > several of the pageview reports in

[Analytics] Kudu

2015-09-29 Thread Toby Negrin
From the intertubes: @tlipcon: Super excited to finally talk about what I've been working on the last 3 years: Kudu! http://t.co/1W4sqFBcyH http://t.co/1mZCwgdOO5 Might be useful for the media wiki tables. -Toby

Re: [Analytics] Kudu

2015-09-29 Thread Toby Negrin
: > >> Thanks Toby, >> >> If anyone's really passionate about discussing the tons of big data tools >> available, this would be the place: >> https://wikitech.wikimedia.org/wiki/Analytics/DataStore/Evaluation#Candidates >> (just added Kudu) >> >>

Re: [Analytics] [Survey] Pageview API

2015-09-16 Thread Toby Negrin
Hadoop was originally built for indexing the web by processing the web map and exporting indexes to serving systems. I think integration with Elastic Search would work well. -Toby On Wed, Sep 16, 2015 at 7:03 AM, Joseph Allemandou < jalleman...@wikimedia.org> wrote: > @Erik: > Reading this

Re: [Analytics] [Survey] Pageview API

2015-09-11 Thread Toby Negrin
This seems like a weird way to use restful URLs. Why not parameters? -Toby On Fri, Sep 11, 2015 at 4:27 PM, Gabriel Wicke wrote: > > > On Fri, Sep 11, 2015 at 4:26 PM, Gabriel Wicke > wrote: > >> Another option would be a single entry point >> >>

[Analytics] Reddit comment corpus

2015-07-21 Thread Toby Negrin
All of the comments since 2007: https://archive.org/details/2015_reddit_comments_corpus -Toby ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] [Technical] Pick storage for pageview cubes

2015-06-08 Thread Toby Negrin
As always, I'd recommend that we go with tech we are familiar with -- mysql or cassandra. We have a cassandra committer on staff who would be able to answer these questions in detail. -Toby On Mon, Jun 8, 2015 at 4:46 PM, Marcel Ruiz Forns mfo...@wikimedia.org wrote: *This discussion is

[Analytics] Fwd: [WikimediaMobile] Share a Fact Initial Analysis

2015-05-21 Thread Toby Negrin
Hi all - some interesting analysis on the share-a-fact feature from the mobile team. -Toby Begin forwarded message: From: Adam Baso ab...@wikimedia.org Date: May 21, 2015 at 12:05:29 PDT To: mobile-l mobil...@lists.wikimedia.org Subject: [WikimediaMobile] Share a Fact Initial Analysis

Re: [Analytics] stats.grok.se: Missing data for May 9

2015-05-11 Thread Toby Negrin
AFAIK Henrik is using the original source. -Toby On Mon, May 11, 2015 at 7:38 AM, Dan Andreescu dandree...@wikimedia.org wrote: Right now it looks like both May 9 and 10 are missing from here: http://dumps.wikimedia.org/other/pagecounts-ez/merged/2015/2015-05/ but both May 9 and May 10 are

Re: [Analytics] [Technical] WMF-Last-Access

2015-04-29 Thread Toby Negrin
Yes - thanks Brandon for the detailed explanation. Not urgent but I'd love to see a list of the low hanging fruit for where our pages are inefficient. On Apr 29, 2015, at 07:55, Dan Andreescu dandree...@wikimedia.org wrote: Thanks Brandon, that works for me. The cookie has been great btw,

Re: [Analytics] New fields in wmf.webrequest hive table

2015-04-12 Thread Toby Negrin
Hi Yuri -- In general, I do not think this table will change a lot moving forward. We're migrating to a more complete definition right now so some changes are to be expected but things should settle down. Thanks for the new fields! -Toby On Sun, Apr 12, 2015 at 9:55 AM, Andrew Otto

Re: [Analytics] [Technical] final pageviews QA

2015-03-12 Thread Toby Negrin
I'm also confused. As I understand it, stats.wikimedia.org is consuming the data that is represented by the green line in your graph. Therefore we would see this drop in the wikistats data that Erik referred to, but we don't. I think we need to understand why this is so. -Toby On Thu, Mar 12,

Re: [Analytics] [Technical] final pageviews QA

2015-03-12 Thread Toby Negrin
transition that only kicks in on the day-by-day. On 12 March 2015 at 18:21, Toby Negrin tneg...@wikimedia.org wrote: I'm also confused. As I understand it, stats.wikimedia.org is consuming the data that is represented by the green line in your graph. Therefore we would see this drop

Re: [Analytics] page views by location

2015-03-02 Thread Toby Negrin
Hi Seth -- we're currently working to provide geo-located page views with a privacy acceptable level of aggregation. We don't currently have an ETA. I'm cc'ing the public analytics list for more information. Best, -Toby On Mon, Mar 2, 2015 at 9:41 AM, Seth Stephens-Davidowitz

Re: [Analytics] [Data][Outage] Statistics per wikipedia for 2015

2015-02-27 Thread Toby Negrin
Thank you Erik! On Feb 27, 2015, at 17:05, Erik Zachte ezac...@wikimedia.org wrote: The dump generation process was restarted Feb 12, after a long outage. It takes several weeks for all dumps to refresh. You can follow the progress at http://dumps.wikimedia.org/backup-index.html All

Re: [Analytics] Cluster issues. Refining suspended. Hence a few datasets start to lag.

2015-02-26 Thread Toby Negrin
Thank you Christian! On Wed, Feb 25, 2015 at 5:18 PM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, just a quick heads up that the Analytics cluster got stuck today. And jobs deadlocked themselves waiting for other jobs to free resources. For the time being, to allow the

Re: [Analytics] [Offline-l] Fwd: Reasons you use the XML dumps or want to, but can't?

2015-02-25 Thread Toby Negrin
Thanks for doing that Andrew! On Tue, Feb 24, 2015 at 1:41 PM, Andrew Otto ao...@wikimedia.org wrote: I also added some Hadoop based use cases to that document. https://www.mediawiki.org/w/index.php?title=Wikimedia_MediaWiki_Core_Team%2FBacklog%2FImprove_dumps&diff=1422073&oldid=1421455

Re: [Analytics] Confluent, whoa

2015-02-25 Thread Toby Negrin
The schema manager is _really_ interesting. Can we take it for a spin? -Toby On Wed, Feb 25, 2015 at 12:07 PM, Andrew Otto ao...@wikimedia.org wrote: Whoa, Confluent (Kafka folks) just packaged up everything we've been building over the last two years:

Re: [Analytics] eventlogging master

2015-02-21 Thread Toby Negrin
Sorry - this is my bad for not tying these threads together. I saw that Dan suggested we replace vanadium at the same time we move the master. I've been concerned about EL capacity for a while now and it seemed like a good chance to take some downtime and fix both issues. At the very least

[Analytics] Welcome Joseph

2015-02-18 Thread Toby Negrin
Hi Everyone, I'd like to welcome Joseph Allemandou to the Analytics team! We are really excited to get someone of Joseph's calibre to help take our analytics work to the next level. In his own words: Joseph's experiences were mostly with private companies and almost always involved open source

[Analytics] stats.grok.se not updating

2015-02-11 Thread Toby Negrin
It looks like stats.grok.se has not updated in a few days. Kevin -- could you ping Henrik and see if he can restart the service? thanks, -Toby

Re: [Analytics] Virtual file view hack for Media Viewer views

2015-02-05 Thread Toby Negrin
views, add an X-Analytics header value of real-view=true to the request itself? If that's not feasible, we should look into using statsv for this (not sure how that works) or having this be a different kafka topic and not consumed into HDFS. On Thu, Feb 5, 2015 at 11:59 AM, Toby Negrin tneg

Re: [Analytics] Virtual file view hack for Media Viewer views

2015-02-05 Thread Toby Negrin
I created a card -- modify as desired: https://trello.com/c/HMgVD4mz -Toby On Thu, Feb 5, 2015 at 8:51 AM, Toby Negrin tneg...@wikimedia.org wrote: It turns out that the media viewer (on desktop; don't know about mobile) does a lot of caching so just because an image is loaded from swift

Re: [Analytics] Virtual file view hack for Media Viewer views

2015-02-05 Thread Toby Negrin
Hi Gergo -- I like this idea. As far as capacity, any EL-Hadoop based solution would be basically doing the same thing as you propose. Can you please run it past ops (especially the 404 v 204) part? Oliver -- the issue is that we'd like to figure out a way to provide accurate views of the media

Re: [Analytics] Virtual file view hack for Media Viewer views

2015-02-05 Thread Toby Negrin
; it's artificial work for both users and us. If this is the only way of doing things that's totally fine. On 5 February 2015 at 11:38, Toby Negrin tneg...@wikimedia.org wrote: Hi Gergo -- I like this idea. As far as capacity, any EL-Hadoop based solution would be basically doing the same thing

Re: [Analytics] most clicked links in articles

2015-01-12 Thread Toby Negrin
Hi Amir -- Would you like to see these datasets released publicly or was there a specific project you were interested in using them for? thanks, -Toby On Mon, Jan 12, 2015 at 5:44 AM, Amir E. Aharoni amir.ahar...@mail.huji.ac.il wrote: Hi, Are there metrics about which links in each

Re: [Analytics] Making EventLogging output to a log file instead of the DB

2015-01-07 Thread Toby Negrin
was that -regardless of collection method- we might not need every single data point to calculate uniques. On Wed, Jan 7, 2015 at 10:38 AM, Toby Negrin tneg...@wikimedia.org wrote: Yes -- we disabled it because there wasn't a use case. We have one now :) On Wed, Jan 7, 2015 at 10:32 AM, Nuria Ruiz nu

Re: [Analytics] Only parts of EventLogging events getting written to the database since 2015-01-07 ~1:55

2015-01-07 Thread Toby Negrin
Folks -- thanks for owning this. One concern -- this is the second deployment-related problem in the last couple of months. I'm concerned that we need to invest more resources in a testing environment as well as a deployment checklist. I'm also considering having EL added to Greg's deployment

Re: [Analytics] Making EventLogging output to a log file instead of the DB

2015-01-07 Thread Toby Negrin
. I'm not sure how much we would miss then. iirc Gilles said this browsing feature was used quite a long, but I'm not sure. *From:* analytics-boun...@lists.wikimedia.org [ mailto:analytics-boun...@lists.wikimedia.org analytics-boun...@lists.wikimedia.org] *On Behalf Of *Toby Negrin *Sent

Re: [Analytics] Making EventLogging output to a log file instead of the DB

2015-01-06 Thread Toby Negrin
cases never actually requested. https://www.mediawiki.org/wiki/Requests_for_comment/Media_file_request_counts#Prefetched_images - Erik *From:* analytics-boun...@lists.wikimedia.org [mailto: analytics-boun...@lists.wikimedia.org] *On Behalf Of *Toby Negrin *Sent:* Tuesday, January 06

Re: [Analytics] analytics-store replag s1 and s5

2014-12-22 Thread Toby Negrin
Jon -- we made some changes to stat1003 logins; I believe you need to use the internal address now. I _thought_ we sent updated instructions to this list but I can't find it. What specific issues are you having? -Toby On Mon, Dec 22, 2014 at 9:37 AM, Jon Robson jrob...@wikimedia.org wrote:

Re: [Analytics] analytics-store replag s1 and s5

2014-12-22 Thread Toby Negrin
Great - thanks all. On Dec 22, 2014, at 9:58 AM, Jon Robson jrob...@wikimedia.org wrote: It's fine. Andrew fixed issues for me. I just had to switch stat1003.wikimedia.org for stat1003.eqiad.wmnet :-) On Mon, Dec 22, 2014 at 9:49 AM, Toby Negrin tneg...@wikimedia.org wrote: Jon -- we

Re: [Analytics] Page view generalized filter draft (due Friday, Dec 12th)

2014-12-16 Thread Toby Negrin
demands we end up implementing in C, or something. On 15 December 2014 at 13:46, Toby Negrin tneg...@wikimedia.org wrote: I think the hive code is representative in that it's an implementation. It's certainly not the only permitted one. On Dec 15, 2014, at 10:34 AM, Andrew Otto ao

Re: [Analytics] Is VisualEditor good for preserving new editors?

2014-12-15 Thread Toby Negrin
Hi Amir -- Because VE is not widely rolled out and is controversial, we haven't spent a lot of time studying it after the initial rollout in 2013. AFAIK, that team is working on performance and functionality issues and I haven't heard anything about additional rollouts. Until we have a plan for

Re: [Analytics] Page view generalized filter draft (due Friday, Dec 12th)

2014-12-15 Thread Toby Negrin
Hi Aaron, all -- I haven't seen any discussion on this which is a sign that we can forward with turning over the draft. Thoughts? thanks, -Toby On Tue, Dec 9, 2014 at 5:15 PM, Aaron Halfaker ahalfa...@wikimedia.org wrote: Hey folks, As discussions on the new page view definition have been

Re: [Analytics] Page view generalized filter draft (due Friday, Dec 12th)

2014-12-15 Thread Toby Negrin
on Hadoop heapsize issues, but I'm sure we'll work it through :). On 15 December 2014 at 12:10, Toby Negrin tneg...@wikimedia.org wrote: Hi Aaron, all -- I haven't seen any discussion on this which is a sign that we can forward with turning over the draft. Thoughts? thanks, -Toby On Tue

Re: [Analytics] EventLogging data QA

2014-12-15 Thread Toby Negrin
I share Christian's concerns - Dario/Leila - can you comment based on your recent experiences with WikiGrok? Thanks -Toby On Dec 15, 2014, at 9:42 AM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, On Mon, Dec 15, 2014 at 08:34:39AM -0800, Kevin Leduc wrote: I closed

Re: [Analytics] Page view generalized filter draft (due Friday, Dec 12th)

2014-12-15 Thread Toby Negrin
:10, Toby Negrin tneg...@wikimedia.org wrote: Hi Aaron, all -- I haven't seen any discussion on this which is a sign that we can forward with turning over the draft. Thoughts? thanks, -Toby On Tue, Dec 9, 2014 at 5:15 PM, Aaron Halfaker ahalfa...@wikimedia.org wrote: Hey folks

Re: [Analytics] Switching the RD team to Phabricator

2014-12-15 Thread Toby Negrin
To be clear - I do not want to move to Phabricator without reviewing our prioritization process. Shall we make this a Q3 goal since people seem really into it? On Dec 15, 2014, at 10:44 AM, Leila Zia le...@wikimedia.org wrote: Hi Oliver, I'd like to give Phabricator a try. I suggest

Re: [Analytics] Switching the RD team to Phabricator

2014-12-15 Thread Toby Negrin
to using Phab, but haste can sometimes, ya know, make waste. So let's talk about this more... On Mon, Dec 15, 2014 at 10:48 AM, Toby Negrin tneg...@wikimedia.org wrote: To be clear - I do not want to move to Phabricator without reviewing our prioritization process. Shall we make this a Q3 goal

Re: [Analytics] The state of field names in MediaWiki data

2014-12-11 Thread Toby Negrin
Bikeshed indeed -- this seems to be a project that could soak up a lot of time. I'm with Aaron -- let's be consistent with the principle of least surprise and use an existing identifier. The database seems as good a place to start as any. On Thu, Dec 11, 2014 at 11:00 AM, Aaron Halfaker

Re: [Analytics] EventLogging data QA

2014-12-11 Thread Toby Negrin
Thanks Dario, et al. A +1 from me -- this will make integration a lot easier. Let's see if we can address this in the Q3 project about dashboarding. -Toby On Thu, Dec 11, 2014 at 4:11 PM, Dario Taraborelli dtarabore...@wikimedia.org wrote: I am kicking off this thread after a good

Re: [Analytics] [wmfresearch] stat* box VLAN move

2014-12-05 Thread Toby Negrin
: A whole day of downtime will be rough. I don't really do much without the stats machines. If it's necessary, it's necessary. On Fri, Dec 5, 2014 at 10:11 AM, Toby Negrin tneg...@wikimedia.org wrote: Why? On Fri, Dec 5, 2014 at 8:10 AM, Andrew Otto ao...@wikimedia.org wrote: Hi all! Ops

Re: [Analytics] Contribute

2014-12-03 Thread Toby Negrin
Thanks Andre! Hi Ron -- thanks for reaching out. We have a project called Wikimetrics[1] that helps the community understand what specific groups of editors are doing on the site. It's generally the first project one works on in Analytics. The readme[2] has information on how to set up a dev

Re: [Analytics] Round-up of network outage from 2014-11-30 [was: Re: Fwd: [Ops] Network outage for rack C4 in eqiad]

2014-12-01 Thread Toby Negrin
Thanks Christian. I do not believe that we need to backfill the TSVs that are filled from the udp2log stream. Oliver -- GLEE uses the geo-edit data. -Toby On Mon, Dec 1, 2014 at 4:57 AM, Oliver Keyes oke...@wikimedia.org wrote: Thanks, Christian! :) What do we use geowiki for, out of

Re: [Analytics] My transition to Hadoop streaming (thanks to gage ottomata!)

2014-12-01 Thread Toby Negrin
This is indeed awesome -- thanks for trying out the new tools (and all the pain involved) and also for documenting your work. -Toby On Mon, Dec 1, 2014 at 8:07 AM, Andrew Otto ao...@wikimedia.org wrote: AwesoOOME! On Nov 30, 2014, at 16:06, Aaron Halfaker ahalfa...@wikimedia.org wrote:

Re: [Analytics] Backfilling done [was: Re: Round-up of recent EventLogging issues]

2014-11-24 Thread Toby Negrin
Awesome -- nice work all. Thanks for the updates Christian. -Toby On Mon, Nov 24, 2014 at 6:45 AM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, On Mon, Nov 24, 2014 at 03:32:48PM +0100, Christian Aistleitner wrote: On Fri, Nov 21, 2014 at 02:22:47PM +0100, Christian

Re: [Analytics] Writing EventLogging events to database failed on 2014-11-22 for between ~19:30 and ~21:00

2014-11-23 Thread Toby Negrin
Thanks as usual for dealing with this Christian. Things have been very unstable lately. We're going to have a post mortem on Monday morning of the recent event logging issues and we'll publish results to this list. -Toby On Nov 23, 2014, at 3:12 PM, Christian Aistleitner

Re: [Analytics] data in Vital Signs

2014-11-04 Thread Toby Negrin
Created tracking bug -- please add yourselves to the cc if desired. https://bugzilla.wikimedia.org/show_bug.cgi?id=72973 -Toby On Tue, Nov 4, 2014 at 12:07 PM, James Forrester jforres...@wikimedia.org wrote: On 4 November 2014 12:00, Aaron Halfaker ahalfa...@wikimedia.org wrote: Understood

Re: [Analytics] Records of article access

2014-10-17 Thread Toby Negrin
to do unethical activities with WMF's data could also access the logs without being noticed. Thanks, Pine On Thu, Oct 16, 2014 at 9:31 PM, Toby Negrin tneg...@wikimedia.org wrote: Hi Pine -- Thanks for this -- it's a challenging topic but one that the Analytics team takes very seriously

Re: [Analytics] Analytics dev points

2014-10-16 Thread Toby Negrin
Thank you Dan -- explaining points is difficult and you did it well. It's a very powerful estimation technique. -Toby On Thu, Oct 16, 2014 at 5:22 PM, Dan Garry dga...@wikimedia.org wrote: In Agile methodologies, story points are an arbitrary unit [1] of measurement for the difficulty of

Re: [Analytics] Records of article access

2014-10-16 Thread Toby Negrin
Hi Pine -- Thanks for this -- it's a challenging topic but one that the Analytics team takes very seriously. I'm not familiar with the IP address review that's referenced in the link. I don't know who the staffer might be. We don't currently calculate unique visitors to anything in Analytics and

[Analytics] Mobile page views available

2014-10-14 Thread Toby Negrin
Hi all -- I have some good news to share. At the beginning of the month, we announced that mobile page views were available on our servers. Somewhat belatedly, I have follow up information available here: https://wikitech.wikimedia.org/wiki/Analytics/Pagecounts-all-sites Thanks

[Analytics] Welcome Marcel Ruiz Forns to the Analytics Development team

2014-10-07 Thread Toby Negrin
Hi Everyone, I'd like to welcome Marcel to the Analytics team. We're super excited to have someone with Marcel's skills and experience on the team. In his own words: Marcel is a Spanish computer science engineer, currently living in Brazil. He has worked lately with recommender systems and

Re: [Analytics] stats.wikimedia.org and datasets.wikimedia.org unavailable

2014-10-05 Thread Toby Negrin
Hi Pine -- We'll have more information tomorrow, US Pacific Morning. -Toby On Sun, Oct 5, 2014 at 4:53 PM, Pine W wiki.p...@gmail.com wrote: Thanks. It would also be interesting to know why the Icinga alarm was muted. Pine On Oct 5, 2014 4:48 PM, Christian Aistleitner

Re: [Analytics] Errors on stats.grok.se

2014-09-26 Thread Toby Negrin
Yes -- thanks all. -Toby On Fri, Sep 26, 2014 at 12:40 AM, Pine W wiki.p...@gmail.com wrote: Thanks, Christian, Henrik and Alex. Pine

Re: [Analytics] LinkedIn Samza Use

2014-09-18 Thread Toby Negrin
Not a troll - but we are getting there with the simple stuff. On Sep 18, 2014, at 11:47 AM, Andrew Otto ao...@wikimedia.org wrote: Haha, troll accepted. On Sep 18, 2014, at 2:41 PM, Steven Walling swall...@wikimedia.org wrote: On Thu, Sep 18, 2014 at 11:37 AM, Andrew Otto

Re: [Analytics] [Wiki-research-l] [Wikimedia-l] wikipedia access traces ?

2014-09-17 Thread Toby Negrin
Hi all, I can't figure out the use case from the thread but it's unlikely we would release unaggregated page views as this would have privacy implications that we would need to consider very carefully. Hourly is likely the smallest granularity we will release. Best, -Toby On Wed, Sep 17,

Re: [Analytics] Anonymizing and releasing 'edits per country' data for Wiki Projects

2014-08-29 Thread Toby Negrin
Hi Folks -- sorry for the delay in responding. While this data is awesome, we need to review the anonymization carefully. We once shared this data in dashboards and found some privacy issues so we needed to take it down. We have an action item to review this issue with legal. I will follow up

Re: [Analytics] datasets.wikimedia.org

2014-08-07 Thread Toby Negrin
Hi Andrew -- should we redirect / to /public-datasets? There's no content there, just a default web page which is a bit confusing. -Toby On Tue, Aug 5, 2014 at 11:13 AM, Andrew Otto ao...@wikimedia.org wrote: Everything is the same as before, the public-datasets that are mostly mentioned are

Re: [Analytics] Reportcard instructions

2014-08-06 Thread Toby Negrin
Thanks Nuria! On Aug 5, 2014, at 8:07 PM, Nuria Ruiz nu...@wikimedia.org wrote: Team, I have updated the reportcard instructions on how to generate the reportcard from the files Erik Z sends. https://www.mediawiki.org/wiki/Analytics/ReportCard Thanks,

Re: [Analytics] [Wikimetrics] Wikimetrics unresponsive

2014-07-28 Thread Toby Negrin
Thanks for the quick response folks. On Mon, Jul 28, 2014 at 9:18 AM, Dan Andreescu dandree...@wikimedia.org wrote: Bug fixed, we were pushing the limits of what wikimetrics could do to help us backfill a lot of data. It failed, but we learned and we'll improve it. Sorry for any

Re: [Analytics] Analytics Dev Team Commitments 2014-07-24 -- 2014-08-05

2014-07-24 Thread Toby Negrin
Thanks Kevin and team! On Jul 24, 2014, at 12:22 PM, Kevin Leduc ke...@wikimedia.org wrote: Hi, the dev team has committed to the following user stories for the sprint starting today, ending August 5. Bug ID Component Summary Points 68516 Wikimetrics Story: Researcher has

Re: [Analytics] Varnishkafka Delivery Errors

2014-06-08 Thread Toby Negrin
Hi Greg -- Sometimes we have connectivity issues with Amsterdam that may cause these errors. We care but they are intermittent and it's been tricky to debug. Andrew will take a look tomorrow. -Toby On Sun, Jun 8, 2014 at 8:52 PM, Greg Grossmeier g...@wikimedia.org wrote: quote name=Greg

Re: [Analytics] s[23467]-analytics-slave getting pointed to analytics-store

2014-06-03 Thread Toby Negrin
Thanks Christian. On Tue, Jun 3, 2014 at 4:50 AM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, No action required. Just a heads up. Since there were some problems around machine capacities, slow queries and subsequent slave lag and alarms in the past months, springle

Re: [Analytics] Analytics maintainers

2014-05-03 Thread Toby Negrin
I took a look at this and realized we needed to talk about it. I'm not really sure what primary and secondary means. Let's discuss in CH. Thanks for the ping. On May 3, 2014, at 8:14 AM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi Toby, Hi Kevin, the below Email hasn't

Re: [Analytics] db1047 one box to rule them all

2014-04-30 Thread Toby Negrin
I think we'll put everything on Hadoop at some point but we're focusing on the page views now. Regarding the bug - if you're ready to use it I can see if Andrew can install the java package. -Toby On Apr 30, 2014, at 9:34 AM, Oliver Keyes oke...@wikimedia.org wrote: On 30 April

Re: [Analytics] [WikimediaMobile] Eventlogging for editing broken

2014-04-30 Thread Toby Negrin
Let's give the database the time it needs to replicate and perform needed validation before we start troubleshooting other issues. I'm concerned that too many things are going on here. Thanks to everyone who is working on this right now. -Toby On Wed, Apr 30, 2014 at 3:17 PM, Jon Robson

Re: [Analytics] Survey tool for features

2014-04-29 Thread Toby Negrin
We reviewed building a survey tool a couple of quarterly reviews ago and like Stephen said it wasn't prioritized highly compared to many other requests. This, combined with the availability of sub-optimal but workable solutions like SurveyMonkey makes it unlikely we'll look into building one in

Re: [Analytics] Analytics' s1 slave not working

2014-04-28 Thread Toby Negrin
Thanks Christian. I'm speaking with Jeff Gage and he has been working with Chase on the db1047. They will update the ops list shortly. On Mon, Apr 28, 2014 at 3:21 PM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, it seems that since some hours, the analytics s1 slave

Re: [Analytics] Social Machines Event

2014-04-25 Thread Toby Negrin
Thanks for posting Edward. I too will be off the grid (and don't have much to contribute anyway) but this is a really interesting topic. -Toby On Fri, Apr 25, 2014 at 10:17 AM, Jonathan Morgan jmor...@wikimedia.org wrote: No, sorry. Memorial Day weekend camping trip. I'll be off the grid

Re: [Analytics] s7 slave issues [was: Re: Pmpta going away and taking some analytics slaves with it :-)]

2014-04-22 Thread Toby Negrin
Thanks for managing this. Is there any action required from us? Should we fix those CNAMEs or will that be addressed when we move/replace the slaves? On Tue, Apr 22, 2014 at 1:20 AM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, all analytics slaves are working again. On

Re: [Analytics] Pmpta going away and taking some analytics slaves with it :-)

2014-04-21 Thread Toby Negrin
Hi Christian -- Thanks for the heads-up. I've verbally notified Dario and the Research and Data team. They will follow up with tech-ops. -Toby On Mon, Apr 21, 2014 at 3:33 PM, Christian Aistleitner christ...@quelltextlich.at wrote: Hi, one might be tempted to think that the pmtpa data

Re: [Analytics] Filtering out outliers in data used to generate tsvs

2014-04-16 Thread Toby Negrin
Nice Aaron! On Apr 16, 2014, at 7:26 AM, Aaron Halfaker ahalfa...@wikimedia.org wrote: The SVGs plots I made don't show up well in gmail, so here's some PNGs On Wed, Apr 16, 2014 at 9:24 AM, Aaron Halfaker ahalfa...@wikimedia.org wrote: Hi Gilles, I think I know just the thing

Re: [Analytics] Header for IRC

2014-04-14 Thread Toby Negrin
, Toby Negrin tneg...@wikimedia.org wrote: Proposed links Analytics Wiki: https://www.mediawiki.org/wiki/Analytics (or shortened) Channel logs http://bit.ly/1fjJnZo I'd keep the batcave and shorten the wiki link, but sure I'd also keep the batcave link

[Analytics] Analytics quarterly review updates

2014-04-09 Thread Toby Negrin
Hi Everyone, We had our quarterly review with WMF management last week. The minutes[1] are posted up on meta along with the deck we presented. (Thank you to Tilman for taking the minutes and helping post the slides) Please take a look at the deck and let me know if you have any questions. In

[Analytics] Update on Analytics Development Epics

2014-03-02 Thread Toby Negrin
Hi Everyone, The development team is in the middle of the sprint that I sent you an update on last week; At the end of next week I'll update this list with what we accomplished during the sprint. For this week, here is the current status of the

[Analytics] Analytics Development Sprint Planning Update

2014-02-21 Thread Toby Negrin
Hi all, We're going to start being more transparent with the community today about the work being done by the analytics development team. Today's update is a summary of our sprint planning that we did yesterday. Short background -- the development team works in 2 week sprints which usually start