[Analytics] Re: WikimediaAQS Pageview API Data Integrity

2023-08-24 Thread Marcel Ruiz Forns
www.predata.com/> > > ___ > Analytics mailing list -- analytics@lists.wikimedia.org > To unsubscribe send an email to analytics-le...@lists.wikimedia.org > -- *Marcel Ruiz Forns** (he/him)* Senior Software Engineer ___

[Analytics] Re: Wikimedia AQS Pageviews API - 2023-06-19

2023-06-20 Thread Marcel Ruiz Forns
_ > Analytics mailing list -- analytics@lists.wikimedia.org > To unsubscribe send an email to analytics-le...@lists.wikimedia.org > -- *Marcel Ruiz Forns** (he/him)* Senior Software Engineer ___ Analytics mailing list -- ana

[Analytics] Re: API Outages

2023-03-03 Thread Marcel Ruiz Forns
end an email to analytics-le...@lists.wikimedia.org >>> >> ___ >> Analytics mailing list -- analytics@lists.wikimedia.org >> To unsubscribe send an email to analytics-le...@lists.wikimedia.org >> > -- > Joshua Haecker > CEO, Co-F

[Analytics] Re: Access Wikipedia Metadata - API/Dumps/Query Replicas?

2021-09-17 Thread Marcel Ruiz Forns
wiki/Event_Platform/EventStreams > ___ > Analytics mailing list -- analytics@lists.wikimedia.org > To unsubscribe send an email to analytics-le...@lists.wikimedia.org > -- *Marcel Ruiz Forns** (he/him)* Senior Software Engineer ___ Analytics mailing list -- analytics@lists.wikimedia.org To unsubscribe send an email to analytics-le...@lists.wikimedia.org

[Analytics] [Data Release] Editors by Country in AQS

2020-09-22 Thread Marcel Ruiz Forns
/Public> . As a next step, we'll add the corresponding visualization to Wikistats2 <http://stats.wikimedia.org>. Cheers! On behalf of the Analytics team, -- *Marcel Ruiz Forns** (he/him)* Senior Software Engineer ___ Analytics mailing list An

Re: [Analytics] Computed Edit Counts vs Wikistats Edit Counts

2020-09-11 Thread Marcel Ruiz Forns
> thanks, thorsten > > [1] https://meta.wikimedia.org/wiki/Research:Wikistats_metrics/Edits > > -- > Thorsten Ruprechter > > Institute of Interactive Systems and Data Science (ISDS) > Graz University of Technology, Austria > > _______

Re: [Analytics] [Research-Internal] Tutorials on disk space usage for notebook/stat boxes

2020-02-18 Thread Marcel Ruiz Forns
> > Thanks! > > Luca (on behalf of the Analytics team) > > > ___ > Research-Internal mailing list > research-inter...@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/research-internal

Re: [Analytics] Pageviews API missing data for some pages and dates?

2020-01-02 Thread Marcel Ruiz Forns
s able to get results. > I'll share more information on this in a separate email if I'm able to > reproduce. > > Thank you, > > Vipul > ___________ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Mar

[Analytics] Hive and Oozie unavailable due to maintenance on Tue Jul 30th 10am CEST

2019-07-29 Thread Marcel Ruiz Forns
interrupted (we'll let the outstanding ones finish, if possible). If this will break some important job that you have running, please let us know in the Phabricator task above or via IRC (#wikimedia-analytics). Cheers! Marcel (on behalf of the Analytics team) -- *Marcel Ruiz Forns** (he/him)* Anal

Re: [Analytics] WMF API update

2019-05-06 Thread Marcel Ruiz Forns
___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns** (he/him)* Analytics Developer @ Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] Please add Chinese Wikiversity into the WikiStats database

2019-01-11 Thread Marcel Ruiz Forns
provements (ex. 2018-08) also be visible? > > Thanks for your help! > > On Fri, Jan 11, 2019 at 02:36, Marcel Ruiz Forns wrote: > >> If the analytics team add the data of Chinese Wikiversity into the >>> database (base source), will WikiStats and WikiStats 2 both get updated? If

Re: [Analytics] Please add Chinese Wikiversity into the WikiStats database

2019-01-10 Thread Marcel Ruiz Forns
e to navigate and collect raw data. Yet, I prefer to > use WikiStats 1 than WikiStats 2 as the reference for statistics.) > > Again, thanks for your precious answer! It’s really helpful for both me > and the Chinese Wikiversity community. > > On Thu, Jan 10, 2019 at 23:09, Marcel Ruiz Forns

Re: [Analytics] Please add Chinese Wikiversity into the WikiStats database

2019-01-10 Thread Marcel Ruiz Forns
/wiki/Analytics/AQS/Unique_Devices On Thu, Jan 10, 2019 at 10:34 AM Eric Liu wrote: > Are WikiStats 1 and WikiStats 2’s database the same? And, is WikiScan a > part of WikiStats? > > Thanks for your help! > > On Wed, Jan 9, 2019 at 23:34, Marcel Ruiz Forns wrote: > >> [Adding Eric Li

Re: [Analytics] Please add Chinese Wikiversity into the WikiStats database

2019-01-09 Thread Marcel Ruiz Forns
alytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns** (he/him)* Analytics Developer @ Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] EventLogging Hive Refine currently stalled for some Schemas

2018-11-15 Thread Marcel Ruiz Forns
ences >>>>>>> MobileWikiAppOfflineLibrary >>>>>>> MobileWikiAppOnboarding >>>>>>> MobileWikiAppOnThisDay >>>>>>> MobileWikiAppPageScroll >>>>>>> MobileWikiAppProtectedEditAttempt >>>>>>> Mobi

Re: [Analytics] Question about the "Page Views" tool

2018-03-13 Thread Marcel Ruiz Forns
imedia.org/wiki/Analytics>), or write something from >> scratch that would expose data with the same API format. >> >> Federico >> > > > _______ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lis

Re: [Analytics] Question about the "Page Views" tool

2018-03-07 Thread Marcel Ruiz Forns
__ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] PageView

2018-03-02 Thread Marcel Ruiz Forns
be grabbing stats from: https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews https://dumps.wikimedia.org/ Cheers! [1] https://wikimediafoundation.org/wiki/Privacy_policy On Fri, Mar 2, 2018 at 5:16 PM, Marcel Ruiz Forns wrote: > Hi Angelina, > > I don't think there's any

Re: [Analytics] Pageview dumps lagging behind

2018-02-20 Thread Marcel Ruiz Forns
man/listinfo/analytics >>>> >>> >>> >> >> ___ >> Analytics mailing list >> Analytics@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >

Re: [Analytics] Undocumented project code in pagecounts-ez

2017-11-14 Thread Marcel Ruiz Forns
_____ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] Anybody know about stats.grok.se going down?

2017-08-21 Thread Marcel Ruiz Forns
>>>>>> -- >>>>>> Dan Garry >>>>>> Senior Product Manager, Editing >>>>>> Wikimedia Foundation > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] Request for analytics data

2017-03-06 Thread Marcel Ruiz Forns
> > ___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] On Wikipedia edits archive per county.

2017-01-25 Thread Marcel Ruiz Forns
>> >> >> If you only consider the numbers above 3 % or so, these tend to be rather >> >> >> stable, so the 2013 data is still a good approximation. >> >> >> >> >> >> > In >> >> >> > addition not all qu

Re: [Analytics] private learning (collaboration) project

2016-12-20 Thread Marcel Ruiz Forns
l produce some > policy-relevant research. > > > Best regards, > Alexander Ugarov, > Ph. D. Candidate. > Sam M. Walton College of Business > Department of Economics > University of Arkansas > Office: ECOB260 > E-mail: auga...@uark.edu. > > ___

Re: [Analytics] Making Charts More Interactive

2016-11-16 Thread Marcel Ruiz Forns
list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] High number of pageviews on page with single hyphen as title

2016-11-16 Thread Marcel Ruiz Forns
Issa >>>> >>>> >>>> ___ >>>> Analytics mailing list >>>> Analytics@lists.wikimedia.org >>>> https://lists.wikimedia.org/mailman/listinfo/analytics >>>> >>>> >>

Re: [Analytics] ensuring reader anonymity

2016-11-11 Thread Marcel Ruiz Forns
lve any anticipated need? > > _______ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] Parsing user agents in EventLogging data

2016-09-15 Thread Marcel Ruiz Forns
ogging/UserAgentSanitiz >>> ation >>> >>> Nemo >>> >>> >>> ___ >>> Analytics mailing list >>> Analytics@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/a

[Analytics] EventLogging new auto-purging strategies are about to be activated

2016-07-27 Thread Marcel Ruiz Forns
tor.wikimedia.org/T108850 -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] [Wiki-research-l] question about Pageviews dumps

2016-07-01 Thread Marcel Ruiz Forns
> Thanks for the answers, Nuria and Marcel! :) > Cheers, > > Marc > > On Thu, 30 June 2016 at 14:16, Marcel Ruiz Forns () > wrote: > >> Marc, I also see what Nuria says. Also please consider that the majority >> of Wikipedia sessions have only one pageview.

Re: [Analytics] [Wiki-research-l] question about Pageviews dumps

2016-06-30 Thread Marcel Ruiz Forns
>>>>>>>> data and >>>>>>>> if there would be any implication because of privacy concerns. >>>>>>>> >>>>>>>> Thank you very much! >>>>>>>> >>>>>>>> Best,

Re: [Analytics] analytics-store unscheduled maintenance

2016-05-27 Thread Marcel Ruiz Forns
> > ___________ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] statistics about user agents per page or per namespace

2016-04-29 Thread Marcel Ruiz Forns
ch. > > -- > Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי > http://aharoni.wordpress.com > “We're living in pieces, > I want to live in peace.” – T. Moore > > ___ > Analytics mailing list > Analytics@lists.wikimedia.org >

Re: [Analytics] Analytics Digest, Vol 50, Issue 21

2016-04-26 Thread Marcel Ruiz Forns
> need to know is that >> > is a good proxy metric to measure Unique Users, more info below. >> > >> > Since 2009, the Wikimedia Foundation used comScore to report data about >> > unique web visitors. In January 2016, however, we decided to sto

Re: [Analytics] Edit-Analysis Dashboard back on track

2016-03-22 Thread Marcel Ruiz Forns
ce file is big (23M), and filtering it just takes > forever. We'll try to think of ways to split that up into maybe the last > three months and "all". > > On Tue, Mar 22, 2016 at 9:35 AM, Marcel Ruiz Forns > wrote: > >> Hi editing, >> >> Jus

[Analytics] Edit-Analysis Dashboard back on track

2016-03-22 Thread Marcel Ruiz Forns
Hi editing, Just to let you know that after the modifications to the Edit table in EL database, the reports have been able to catch up and back-fill until today. So https://edit-analysis.wmflabs.org/compare/ is working again. Cheers! -- *Marcel Ruiz Forns* Analytics Developer Wikimedia

Re: [Analytics] WikimediaBot convention

2016-02-03 Thread Marcel Ruiz Forns
d probably raise technical issues, but seems that we can benefit from it. https://phabricator.wikimedia.org/T125731 Cheers! On Wed, Feb 3, 2016 at 11:43 PM, Marcel Ruiz Forns wrote: > John, thank you a lot for taking the time to answer my question. My > responses inline (I rearran

Re: [Analytics] WikimediaBot convention

2016-02-03 Thread Marcel Ruiz Forns
d insight. Thanks. Currently, the User-Agent policy is not implemented in our regular expressions, meaning: it does not match emails, nor user pages or other mediawiki urls. It could also, as you suggest, implement matching github accounts, or tools.wmflabs.org. We Analytics should tackle that. I will

Re: [Analytics] WikimediaBot convention

2016-02-02 Thread Marcel Ruiz Forns
)]] Sr Software EngineerBoise, ID USA > irc: bd808 v:415.839.6885 x6855 > > ___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] WikimediaBot convention

2016-02-01 Thread Marcel Ruiz Forns
ator.wikimedia.org/T99373#1859170). >>> >>> There are a lot of clients that need to be upgraded or be >>> decommissioned for this 'add bot' strategy to be effective in the near >>> future. see https://www.mediawiki.org/wiki/API:Client_code >

Re: [Analytics] WikimediaBot convention

2016-02-01 Thread Marcel Ruiz Forns
amended) > User-Agent policy. > > Without that plan, successfully implemented, you will not get quality > data (i.e. using 'Netscape' in the U-A to guess 'human' would perform > better). > > On Tue, Feb 2, 2016 at 1:24 AM, Marcel Ruiz Forns > wrote: > > S

Re: [Analytics] WikimediaBot convention

2016-02-01 Thread Marcel Ruiz Forns
nks all for the feedback! On Mon, Feb 1, 2016 at 3:16 PM, Marcel Ruiz Forns wrote: > Clearly Wikipedia et al. uses bot to refer to automated software that >> edits the site but it seems like you are using the term bot to refer to all >> automated software and it might be goo

Re: [Analytics] WikimediaBot convention

2016-02-01 Thread Marcel Ruiz Forns
" > at all? I don't think we need to differentiate between "spiders" and "bots". The most important question we want to answer is: how much of the traffic we consider "human" today is actually "bot". So, +1 "bot" (case-insensitive).
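A minimal sketch of what the case-insensitive "bot" check proposed above could look like. The pattern and the function name `looks_like_bot` are illustrative assumptions, not the expression actually used in the Analytics refinery code.

```python
import re

# Hypothetical pattern (an assumption for illustration): flag any
# User-Agent that contains the substring "bot", ignoring case.
BOT_PATTERN = re.compile(r"bot", re.IGNORECASE)


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains 'bot' in any casing."""
    return BOT_PATTERN.search(user_agent) is not None
```

A plain substring match like this treats "spiders" and "bots" the same way only if the client actually puts "bot" in its User-Agent, which is the point of agreeing on a convention.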

Re: [Analytics] [Engineering] Eventlogging Mysql consumers downtime

2016-01-28 Thread Marcel Ruiz Forns
;>> >>> > >>> >>> > _______ >>> >>> > Engineering mailing list >>> >>> > engineer...@lists.wikimedia.org >>> >>> > https://lists.wikimedia.org/mailman/listinfo/engineering >>> >>> > >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Oliver Keyes >>> >>> Count Logula >>> >>> Wikimedia Foundation >>> >>> >>> >>> ___ >>> >>> Engineering mailing list >>> >>> engineer...@lists.wikimedia.org >>> >>> https://lists.wikimedia.org/mailman/listinfo/engineering >>> >> >>> >> >>> > >>> >>> >>> >>> -- >>> Oliver Keyes >>> Count Logula >>> Wikimedia Foundation >>> >>> ___ >>> Analytics mailing list >>> Analytics@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/analytics >>> >> >> >> ___ >> Engineering mailing list >> engineer...@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/engineering >> >> > > > -- > Amir Elisha Aharoni‏ ። אָמִיר אֱלִישָׁע אַהֲרוֹנִי > Language Engineering‏ ። הַנְדָּסָה לְשׁוֹנִית > Wikimedia Foundation‏ ። קֶרֶן וִיקִימֶדְיָה > > ___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] WikimediaBot convention

2016-01-28 Thread Marcel Ruiz Forns
esn't include crawler operators. No, it's not. It is to reach consensus on the convention and identify things that we can do to improve its application. Thanks for pointing that out, it was unclear in the initial email. On Thu, Jan 28, 2016 at 9:18 AM, Federico Leva (Nemo) wrote: >

[Analytics] WikimediaBot convention

2016-01-27 Thread Marcel Ruiz Forns
ion[2] for bots that EDIT Wikimedia content. [1] https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L49 [2] https://www.mediawiki.org/wiki/Manual:Bots -- *Marcel Ruiz Forns* Analytics Devel

Re: [Analytics] [link] Why Big Data Needs Thick Data

2016-01-27 Thread Marcel Ruiz Forns
gine a world in which every single human being can freely share in the > sum of all knowledge. Help us make it a reality! > https://donate.wikimedia.org > > ___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists

Re: [Analytics] MobileWikiAppShareAFact event stream was: [WikimediaMobile] Stopping eventlogging events into MobileWikiAppShareAFact table

2016-01-04 Thread Marcel Ruiz Forns
imedia.org/T120292#1854136 ; this got lost a > bit among the other schema changes, cf. > https://phabricator.wikimedia.org/T120292#1864549 ). > > On Sun, Jan 3, 2016 at 12:30 PM, Marcel Ruiz Forns > wrote: > > BTW, MobileWebSectionUsage schema is sending a lot of events since De

Re: [Analytics] MobileWikiAppShareAFact event stream was: [WikimediaMobile] Stopping eventlogging events into MobileWikiAppShareAFact table

2016-01-03 Thread Marcel Ruiz Forns
>>> take >>>>> in order to plan for a larger outage window. >>>>> >>>>> >>>>> Let us know if data should be backfilled as it can be, we anticipate >>>>> events will not flow into table for the better part of one day. >>>>>

Re: [Analytics] Top Edits/Views in 2015 per project?

2016-01-03 Thread Marcel Ruiz Forns
tp://www.wikimedia.org.il > >> Imagine a world in which every single human being can freely share in > the > >> sum of all knowledge. That's our commitment! > >> > >> > >> > > > > -- > Tilman Bayer > Senior Analyst > Wikimedia Foundation > IRC (Freenode): HaeB > > ___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

[Analytics] [Outage] Small data loss in raw_webrequest on 2015-12-15

2015-12-16 Thread Marcel Ruiz Forns
/Data/Webrequest#Changes_and_known_problems_since_2015-03-04 Sorry for the inconvenience. -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/lis

Re: [Analytics] Echo schema eventlogging

2015-12-16 Thread Marcel Ruiz Forns
should > be if we're querying data on a frequent period basis and taking actions > based on the results of those queries. Otherwise it's a waste of resources > and we should allocate that disk space to something else. > > _______ > Ana

Re: [Analytics] Echo schema eventlogging

2015-12-16 Thread Marcel Ruiz Forns
T. Morgan >>>> Senior Design Researcher >>>> Wikimedia Foundation >>>> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)> >> -- >> --Madhu :) > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] EventLogging database outage next Tuesday 2015-12-15 10:00 UTC (2 hours)

2015-12-15 Thread Marcel Ruiz Forns
The maintenance has finished now. We will follow up the system to ensure events get backfilled. On Fri, Dec 11, 2015 at 6:19 PM, Marcel Ruiz Forns wrote: > Hi Analytics, > > Next Tuesday, Dec 15, between 10:00 and 12:00 UTC > EventLogging's database m4-master will be dow

[Analytics] EventLogging database outage next Tuesday 2015-12-15 10:00 UTC (2 hours)

2015-12-11 Thread Marcel Ruiz Forns
- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] EventLogging outage in progress?

2015-11-30 Thread Marcel Ruiz Forns
e an acceptable ETA. I would say even 48. > > > > On Fri, Nov 27, 2015 at 2:31 AM, Marcel Ruiz Forns > > > wrote: > >> > >> Thanks, Ori, for having a look at this and restarting EL. > >> > >> I understand it was 01:30 UTC on Friday (today), no

Re: [Analytics] EventLogging outage in progress?

2015-11-27 Thread Marcel Ruiz Forns
ng any events from the Kafka brokers. I ran > eventloggingctl stop / eventloggingctl start and they recovered. Needs to > be investigated more thoroughly. Otto, can you follow up? > > > _______ > Analytics mailing list

Re: [Analytics] [Engineering] Pageview API

2015-11-17 Thread Marcel Ruiz Forns
> Got something (wikipage, doc, something...) a curious being like me > could read ? > > ___ > Engineering mailing list > engineer...@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/engineering > -- *Marcel Ruiz Forns* Analyti

Re: [Analytics] Changes on eventlogging

2015-11-03 Thread Marcel Ruiz Forns
n the short future. >> >> Of course, any migration could have regressions, so please monitor any >> issue you may find (as I am currently doing, and I have not yet found). >> This will hopefully prevent the issue to happen again. >> >> R

Re: [Analytics] Event Logging incident

2015-10-27 Thread Marcel Ruiz Forns
ncident report. >> >> If you had reports run on October 14th, between 06:00 UTC and 21:00 UTC, >> you should re-run them. >> > > > _______ > Analytics mailing list > Analytics@

Re: [Analytics] Article Traffic Statistics down again

2015-10-16 Thread Marcel Ruiz Forns
___ > Analytics mailing list > Analytics@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikime

Re: [Analytics] Canonical location for metrics documentation

2015-10-14 Thread Marcel Ruiz Forns
>>> -- >>> Jonathan T. Morgan >>> Senior Design Researcher >>> Wikimedia Foundation >>> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)> > -- *Marcel Ruiz Forns* Analytics Developer Wikimedia Foundation ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Re: [Analytics] Canonical location for metrics documentation

2015-10-14 Thread Marcel Ruiz Forns
>>> ___ >>> Analytics mailing list >>> Analytics@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/analytics >>> >> >> >> _______

Re: [Analytics] General framework for updating database reports

2015-10-06 Thread Marcel Ruiz Forns
Also, of course, Aaron you can ask me any question on this and I'll try to help! On Tue, Oct 6, 2015 at 8:44 PM, Marcel Ruiz Forns wrote: > Dan, thanks for the careful explanation. > > I wanted to add that there is a small documentation on Wikitech for the > reportupda

Re: [Analytics] General framework for updating database reports

2015-10-06 Thread Marcel Ruiz Forns
___ >> Analytics mailing list >> Analytics@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> > > ___ > Analytics mailing list > Analytics@lists.wikimedia.

Re: [Analytics] [Survey] Pageview API

2015-09-11 Thread Marcel Ruiz Forns
>>>>> ___ >>>>> Analytics mailing list >>>>> Analytics@lists.wikimedia.org >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics >>>>> >>>>> >>>> >>>> ___

Re: [Analytics] [Technical] Pick storage for pageview cubes

2015-06-16 Thread Marcel Ruiz Forns
k we could add Impala in storage technologies to assess. >>> It allows reading / computing straight from HDFS and should be fast >>> enough for not too bad UEx. >>> Maybe ? >>> >>> >>> On Thu, Jun 11, 2015 at 11:11 PM, Marcel Ruiz Forns < >

Re: [Analytics] [Technical] Pick storage for pageview cubes

2015-06-11 Thread Marcel Ruiz Forns
what about choosing 2 stores instead of 3, one of each type, say PostgreSQL and Cassandra? Or, anyone with more thoughts or suggestions? On Wed, Jun 10, 2015 at 1:24 PM, Marcel Ruiz Forns wrote: > If we are going to completely denormalize the data sets for anonymization, > and we expect jus

Re: [Analytics] [Technical] Pick storage for pageview cubes

2015-06-10 Thread Marcel Ruiz Forns
If we are going to completely denormalize the data sets for anonymization, and we expect just slice and dice queries to the database, I think we wouldn't take much advantage of a relational DB, because it wouldn't need to aggregate values, slice or dice, all slices and dices would be precomputed, r

[Analytics] [Technical] Pick storage for pageview cubes

2015-06-08 Thread Marcel Ruiz Forns
*This discussion is intended to be a branch of the thread: "[Analytics] Pageview API Status update".* Hi all, We Analytics are trying to *choose a storage technology to keep the pageview data* for analysis. We don't want to get to a final system that covers all our needs yet (there are still thi

Re: [Analytics] [Analytics-internal] Parsing the app version into the user agent map

2015-06-08 Thread Marcel Ruiz Forns
use access_method >> = 'mobile app' :) >> >> On Mon, Jun 8, 2015 at 12:44 PM, Marcel Ruiz Forns >> wrote: >> >>> + analytics internal >>> >>> Hi Jon and Adam, >>> >>> Yes, this totally helps. It confirms the

Re: [Analytics] EventLogging issues 2015-05-06

2015-05-19 Thread Marcel Ruiz Forns
The data for this period has been back-filled with success. Cheers! On Fri, May 8, 2015 at 4:23 PM, Aaron Halfaker wrote: > Thank you! > > On Fri, May 8, 2015 at 5:12 AM, Marcel Ruiz Forns > wrote: > >> EventLogging suffered from performance problems and data loss from

[Analytics] EventLogging issues 2015-05-06

2015-05-08 Thread Marcel Ruiz Forns
EventLogging suffered from performance problems and data loss from Tuesday 2015-05-05 22:00 UTC to Wednesday 2015-05-06 20:00 UTC (22 hours). During that period, an exceptionally large number of events was sent to the EL server for a given schema. The system could not handle them properly, and this caused da

Re: [Analytics] analytics-store heads up

2015-04-30 Thread Marcel Ruiz Forns
Thanks Sean! On Thu, Apr 30, 2015 at 1:21 AM, Sean Pringle wrote: > Hi > > analytics-store tmp space filled up today with many large temporary > tables (it was ~32G) from many slow research queries. Those had to be > killed, the database process restarted, and tmp space expanded. > > It's back u

Re: [Analytics] [Technical] WMF-Last-Access

2015-04-27 Thread Marcel Ruiz Forns
+1 'last' ___ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

[Analytics] [Technical] EventLogging issues

2015-04-20 Thread Marcel Ruiz Forns
Hi Analytics, We have found a problem that has been affecting the EventLogging data for one month. Since March 22, 2015 there have been several gaps (of around 1-2 hours length each) without data in all schema tables. You can see the details in the following links: Phabricator task: https://phab

Re: [Analytics] [Technical] Strange behavior of EL m4-master

2015-04-17 Thread Marcel Ruiz Forns
Sean and list, I think we found the problem: The data loss is happening within EL consumer code. The error was skillfully dodging the logs, sorry for that. The root cause is that the db insertion takes too long to keep up with the rate of incoming events, and the events buffer gets big. When big
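The root cause described above (insertion slower than the incoming event rate, so the buffer keeps growing) can be illustrated with a toy simulation. This is only a sketch under assumed numbers: the rates and the function name `buffer_size_over_time` are hypothetical and do not come from the EventLogging consumer code.

```python
def buffer_size_over_time(incoming_per_sec, inserted_per_sec, seconds):
    """Return the buffer length after each second of a toy simulation.

    Each second, `incoming_per_sec` events arrive and at most
    `inserted_per_sec` events are written to the database.
    """
    size = 0
    sizes = []
    for _ in range(seconds):
        # The buffer can never go below zero: we cannot insert
        # more events than are waiting.
        size = max(0, size + incoming_per_sec - inserted_per_sec)
        sizes.append(size)
    return sizes
```

Whenever the incoming rate exceeds the insertion rate, the buffer grows linearly and never drains on its own.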

Re: [Analytics] [Technical] Strange behavior of EL m4-master

2015-04-15 Thread Marcel Ruiz Forns
> > Thanks. If possible, can we have: > > - The exact INSERT statements issued by the MySQL consumer - The UUID values generated for those records > I'll try to get them, sure. > > I followed the master-slave replication lag for some hours, and > perceived a > > pattern in the lag: It gets prog

Re: [Analytics] [Technical] Strange behavior of EL m4-master

2015-04-15 Thread Marcel Ruiz Forns
ive" 2 times, so that's definitely not a conclusive statement. But, there's this hypothesis that the two problems are related. Sean, I hope that helps answering your questions. Let us know if you have any idea on this. Thank you! Marcel On Tue, Apr 14, 2015 at 9:15 PM, Marcel Ruiz For

Re: [Analytics] [Technical] Strange behavior of EL m4-master

2015-04-14 Thread Marcel Ruiz Forns
Sean, thanks for the quick response: > We have a binary log on the EL master that holds the last week of > INSERT statements. It can be dumped and grepped, eg looking at > 10-minute blocks around 2015-04-13 16:30: > Good to know! Zero(!) during 10min after 16:30 doesn't look good. This means th

[Analytics] [Technical] Strange behavior of EL m4-master

2015-04-14 Thread Marcel Ruiz Forns
Hi Sean, Here's Marcel from Analytics. I'd like to discuss with you some strange behaviors that we've observed on the EventLogging database (m4-master.equiad.wmnet). 1) There are some time spans where there is no data in any table. Examples follow: - 2015-04-09 17:20 -> 18:35 - 2015-04-11 03:
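Time spans without data, like the ones listed above, could be found with a sketch along these lines. It assumes the event timestamps have already been pulled out of the tables into a Python list; the function name `find_gaps` and the 30-minute threshold are illustrative assumptions.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(minutes=30)):
    """Return (start, end) pairs where consecutive events are
    further apart than `threshold`."""
    ordered = sorted(timestamps)
    # Compare each event with the next one and keep the large jumps.
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]
```

For example, events at 17:00, 17:20, 18:35 and 18:40 on the same day would yield a single gap from 17:20 to 18:35.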

Re: [Analytics] Eventlogging outage

2015-04-08 Thread Marcel Ruiz Forns
Dario, All kinds of event logs were affected. I updated the documentation. On Thu, Apr 9, 2015 at 12:42 AM, Dario Taraborelli < dtarabore...@wikimedia.org> wrote: > to clarify: does this affect all logs or client-side logs only? > > On Apr 8, 2015, at 11:13 AM, Aaron Halfaker > wrote: > > Thank

Re: [Analytics] [Technical] pageviews definition undercounting app requests

2015-03-23 Thread Marcel Ruiz Forns
Awesome Oliver, thanks! Having a look. On Sat, Mar 21, 2015 at 4:32 PM, Oliver Keyes wrote: > Alrighty; existing implementation updated with > https://gerrit.wikimedia.org/r/#/c/198489/ which also exposes the > isAppPageview method and associates a UDF with it (Marcel, this means > you'll be abl

Re: [Analytics] Welcome Joseph

2015-02-19 Thread Marcel Ruiz Forns
Welcome! \o/ On Thu, Feb 19, 2015 at 6:40 PM, Jonathan Morgan wrote: > Welcome, Joseph! Glad to have you onboard. > > Cheers, > Jonathan > > On Thu, Feb 19, 2015 at 8:57 AM, Aaron Halfaker > wrote: > >> Glad to have you, sir. >> >> On Thu, Feb 19, 2015 at 10:08 AM, Dan Andreescu > > wrote: >> >

[Analytics] Udp2log TSVs missing 1 hour of data

2015-01-15 Thread Marcel Ruiz Forns
Hi, On Jan 13th 2015 between 22:20 and 23:18 UTC (~1 hour) stat1002 ceased receiving TSV data from udp2log for the following data streams: - Mobile requests stream - Pagecounts-raw - Requests stream - Zero requests stream The cause was routing problems in the firewall introduced by a