www.predata.com/>
>
> ___
> Analytics mailing list -- analytics@lists.wikimedia.org
> To unsubscribe send an email to analytics-le...@lists.wikimedia.org
>
--
*Marcel Ruiz Forns** (he/him)*
Senior Software Engineer
> --
> Joshua Haecker
> CEO, Co-F
wiki/Event_Platform/EventStreams
/Public>
.
As a next step, we'll add the corresponding visualization to Wikistats2
<http://stats.wikimedia.org>.
Cheers!
On behalf of the Analytics team,
> thanks, thorsten
>
> [1] https://meta.wikimedia.org/wiki/Research:Wikistats_metrics/Edits
>
> --
> Thorsten Ruprechter
>
> Institute of Interactive Systems and Data Science (ISDS)
> Graz University of Technology, Austria
>
> Thanks!
>
> Luca (on behalf of the Analytics team)
>
> ___
> Research-Internal mailing list
> research-inter...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/research-internal
s able to get results.
> I'll share more information on this in a separate email if I'm able to
> reproduce.
>
> Thank you,
>
> Vipul
> ___________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
interrupted (we'll let the
outstanding ones finish, if possible).
If this will break some important job that you have running, please let us
know in the Phabricator task above or via IRC (#wikimedia-analytics).
Cheers!
Marcel (on behalf of the Analytics team)
--
*Marcel Ruiz Forns** (he/him)*
Analytics Developer @ Wikimedia Foundation
provements (e.g. 2018-08) also be visible?
>
> Thanks for your help!
>
> On Fri, Jan 11, 2019 at 02:36, Marcel Ruiz Forns wrote:
>
>> If the analytics team adds the data of Chinese Wikiversity into the
>>> database (base source), will WikiStats and WikiStats 2 both get updated? If
e to navigate and collect raw data. Yet, I prefer to
> use WikiStats 1 rather than WikiStats 2 as the reference for statistics.)
>
> Again, thanks for your precious answer! It’s really helpful for both me
> and the Chinese Wikiversity community.
>
> On Thu, Jan 10, 2019 at 23:09, Marcel Ruiz Forns
/wiki/Analytics/AQS/Unique_Devices
On Thu, Jan 10, 2019 at 10:34 AM Eric Liu wrote:
> Do WikiStats 1 and WikiStats 2 use the same database? And is WikiScan a
> part of WikiStats?
>
> Thanks for your help!
>
> On Wed, Jan 9, 2019 at 23:34, Marcel Ruiz Forns wrote:
>
>> [Adding Eric Li
ences
>>>>>>> MobileWikiAppOfflineLibrary
>>>>>>> MobileWikiAppOnboarding
>>>>>>> MobileWikiAppOnThisDay
>>>>>>> MobileWikiAppPageScroll
>>>>>>> MobileWikiAppProtectedEditAttempt
>>>>>>> Mobi
imedia.org/wiki/Analytics>), or write something from
>> scratch that would expose data with the same API format.
>>
>> Federico
--
*Marcel Ruiz Forns*
Analytics Developer
Wikimedia Foundation
be grabbing stats from:
https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
https://dumps.wikimedia.org/
Cheers!
[1] https://wikimediafoundation.org/wiki/Privacy_policy
On Fri, Mar 2, 2018 at 5:16 PM, Marcel Ruiz Forns
wrote:
> Hi Angelina,
>
> I don't think there's any
>>>>>> --
>>>>>> Dan Garry
>>>>>> Senior Product Manager, Editing
>>>>>> Wikimedia Foundation
>> If you only consider the numbers above 3% or so, these tend to be rather
>> stable, so the 2013 data is still a good approximation.
>>
>> > In
>> > addition not all qu
l produce some
> policy-relevant research.
>
> Best regards,
> Alexander Ugarov,
> Ph. D. Candidate.
> Sam M. Walton College of Business
> Department of Economics
> University of Arkansas
> Office: ECOB260
> E-mail: auga...@uark.edu.
Issa
lve any anticipated need?
ogging/UserAgentSanitization
>>>
>>> Nemo
tor.wikimedia.org/T108850
> Thanks for the answers, Nuria and Marcel! :)
> Cheers,
>
> Marc
>
> On Thu, 30 June 2016 at 14:16, Marcel Ruiz Forns wrote:
>
>> Marc, I also see what Nuria says. Also please consider that the majority
>> of Wikipedia sessions have only one pageview.
>>>>>>>> data and
>>>>>>>> if there would be any implication because of privacy concerns.
>>>>>>>>
>>>>>>>> Thank you very much!
>>>>>>>>
>>>>>>>> Best,
ch.
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> “We're living in pieces,
> I want to live in peace.” – T. Moore
> need to know is that
>> > is a good proxy metric to measure Unique Users, more info below.
>> >
>> > Since 2009, the Wikimedia Foundation used comScore to report data about
>> > unique web visitors. In January 2016, however, we decided to sto
ce file is big (23M), and filtering it just takes
> forever. We'll try to think of ways to split that up into maybe the last
> three months and "all".
>
> On Tue, Mar 22, 2016 at 9:35 AM, Marcel Ruiz Forns
> wrote:
>
>> Hi editing,
>>
>> Jus
Hi editing,
Just to let you know that after the modifications to the Edit table in EL
database, the reports have been able to catch up and back-fill until today.
So https://edit-analysis.wmflabs.org/compare/ is working again.
Cheers!
d probably raise technical issues, but seems
that we can benefit from it. https://phabricator.wikimedia.org/T125731
Cheers!
On Wed, Feb 3, 2016 at 11:43 PM, Marcel Ruiz Forns
wrote:
> John, thank you a lot for taking the time to answer my question. My
> responses inline (I rearran
d insight. Thanks. Currently, the User-Agent policy is
not implemented in our regular expressions, meaning they do not match
emails, user pages, or other MediaWiki URLs. They could also, as you
suggest, match GitHub accounts or tools.wmflabs.org. We
in Analytics should tackle that. I will
)]] Sr Software Engineer, Boise, ID USA
> irc: bd808 v:415.839.6885 x6855
ator.wikimedia.org/T99373#1859170).
>>>
>>> There are a lot of clients that need to be upgraded or be
>>> decommissioned for this 'add bot' strategy to be effective in the near
>>> future. See https://www.mediawiki.org/wiki/API:Client_code
amended)
> User-Agent policy.
>
> Without that plan, successfully implemented, you will not get quality
> data (i.e. using 'Netscape' in the U-A to guess 'human' would perform
> better).
>
> On Tue, Feb 2, 2016 at 1:24 AM, Marcel Ruiz Forns
> wrote:
> > S
nks all for the feedback!
On Mon, Feb 1, 2016 at 3:16 PM, Marcel Ruiz Forns
wrote:
> Clearly Wikipedia et al. uses bot to refer to automated software that
>> edits the site but it seems like you are using the term bot to refer to all
>> automated software and it might be goo
"
> at all?
I don't think we need to differentiate between "spiders" and "bots". The
most important question we want to answer is: how much of the traffic we
consider "human" today is actually "bot"? So, +1 "bot" (case-insensitive).
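
As a rough illustration of the convention being agreed on here, the sketch below flags a request as automated when its User-Agent contains "bot" in any letter case. The function name `looks_like_bot` and the sample User-Agent strings are hypothetical, and this is not the actual refinery code, which may use a different or broader pattern.

```python
import re

# Assumption: a plain case-insensitive substring match on "bot".
# The real webrequest tagging logic may use a broader pattern.
BOT_PATTERN = re.compile(r"bot", re.IGNORECASE)


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains "bot" in any letter case."""
    return bool(BOT_PATTERN.search(user_agent or ""))


if __name__ == "__main__":
    samples = [
        # Hypothetical User-Agent strings, for illustration only.
        "ExampleBot/1.0 (https://example.org/bot; ops@example.org)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/44.0",
    ]
    for ua in samples:
        print(looks_like_bot(ua), ua)
```

A substring match like this is cheap, but it only catches clients that follow the convention; anything that omits "bot" from its User-Agent would still be counted as human.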
>>> >>> >
>>> >>> > _______
>>> >>> > Engineering mailing list
>>> >>> > engineer...@lists.wikimedia.org
>>> >>> > https://lists.wikimedia.org/mailman/listinfo/engineering
>>> >>>
>>> >>> --
>>> >>> Oliver Keyes
>>> >>> Count Logula
>>> >>> Wikimedia Foundation
> --
> Amir Elisha Aharoni ። אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> Language Engineering ። הַנְדָּסָה לְשׁוֹנִית
> Wikimedia Foundation ። קֶרֶן וִיקִימֶדְיָה
esn't include crawler operators.
No, it's not. It is to reach consensus on the convention and identify
things that we can do to improve its application. Thanks for pointing that
out, it was unclear in the initial email.
On Thu, Jan 28, 2016 at 9:18 AM, Federico Leva (Nemo)
wrote:
>
ion[2] for bots that EDIT Wikimedia
content.
[1]
https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L49
[2] https://www.mediawiki.org/wiki/Manual:Bots
gine a world in which every single human being can freely share in the
> sum of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org
imedia.org/T120292#1854136 ; this got lost a
> bit among the other schema changes, cf.
> https://phabricator.wikimedia.org/T120292#1864549 ).
>
> On Sun, Jan 3, 2016 at 12:30 PM, Marcel Ruiz Forns
> wrote:
> > BTW, MobileWebSectionUsage schema is sending a lot of events since De
>>> take
>>>>> in order to plan for a larger outage window.
>>>>>
>>>>> Let us know if data should be backfilled as it can be, we anticipate
>>>>> events will not flow into table for the better part of one day.
tp://www.wikimedia.org.il
> >> Imagine a world in which every single human being can freely share in the
> >> sum of all knowledge. That's our commitment!
>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB
/Data/Webrequest#Changes_and_known_problems_since_2015-03-04
Sorry for the inconvenience.
should
> be if we're querying data on a frequent period basis and taking actions
> based on the results of those queries. Otherwise it's a waste of resources
> and we should allocate that disk space to something else.
T. Morgan
>>>> Senior Design Researcher
>>>> Wikimedia Foundation
>>>> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>> --
>> --Madhu :)
The maintenance has finished now.
We will monitor the system to ensure events get backfilled.
On Fri, Dec 11, 2015 at 6:19 PM, Marcel Ruiz Forns
wrote:
> Hi Analytics,
>
> Next Tuesday, Dec 15, between 10:00 and 12:00 UTC
> EventLogging's database m4-master will be dow
e an acceptable ETA. I would say even 48.
> >
> > On Fri, Nov 27, 2015 at 2:31 AM, Marcel Ruiz Forns
> > wrote:
> >>
> >> Thanks, Ori, for having a look at this and restarting EL.
> >>
> >> I understand it was 01:30 UTC on Friday (today), no
ng any events from the Kafka brokers. I ran
> eventloggingctl stop / eventloggingctl start and they recovered. Needs to
> be investigated more thoroughly. Otto, can you follow up?
> Got something (wikipage, doc, something...) a curious being like me
> could read?
n the short future.
>>
>> Of course, any migration could have regressions, so please watch for any
>> issues (as I am currently doing; I have not found any yet).
>> This will hopefully prevent the issue from happening again.
>>
>> R
ncident report.
>>
>> If you had reports run on October 14th, between 06:00 UTC and 21:00 UTC,
>> you should re-run them.
>>> --
>>> Jonathan T. Morgan
>>> Senior Design Researcher
>>> Wikimedia Foundation
>>> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
Also, of course, Aaron, you can ask me any question on this and I'll try to
help!
On Tue, Oct 6, 2015 at 8:44 PM, Marcel Ruiz Forns
wrote:
> Dan, thanks for the careful explanation.
>
> I wanted to add that there is some brief documentation on Wikitech for the
> reportupda
k we could add Impala in storage technologies to assess.
>>> It allows reading / computing straight from HDFS and should be fast
>>> enough for not too bad UEx.
>>> Maybe?
>>>
>>> On Thu, Jun 11, 2015 at 11:11 PM, Marcel Ruiz Forns <
what about choosing 2 stores instead of 3, one of each
type, say PostgreSQL and Cassandra?
Or, anyone with more thoughts or suggestions?
On Wed, Jun 10, 2015 at 1:24 PM, Marcel Ruiz Forns
wrote:
> If we are going to completely denormalize the data sets for anonymization,
> and we expect jus
If we are going to completely denormalize the data sets for anonymization,
and we expect just slice and dice queries to the database,
I think we wouldn't take much advantage of a relational DB,
because it wouldn't need to aggregate values, slice or dice;
all slices and dices would be precomputed, r
*This discussion is intended to be a branch of the thread: "[Analytics]
Pageview API Status update".*
Hi all,
We in Analytics are trying to *choose a storage technology to keep the
pageview data* for analysis.
We don't want to get to a final system that covers all our needs yet (there
are still thi
use access_method
>> = 'mobile app' :)
>>
>> On Mon, Jun 8, 2015 at 12:44 PM, Marcel Ruiz Forns
>> wrote:
>>
>>> + analytics internal
>>>
>>> Hi Jon and Adam,
>>>
>>> Yes, this totally helps. It confirms the
The data for this period has been successfully back-filled.
Cheers!
On Fri, May 8, 2015 at 4:23 PM, Aaron Halfaker
wrote:
> Thank you!
>
> On Fri, May 8, 2015 at 5:12 AM, Marcel Ruiz Forns
> wrote:
>
>> EventLogging suffered from performance problems and data loss from
EventLogging suffered from performance problems and data loss from Tuesday
2015-05-05 22:00 UTC to Wednesday 2015-05-06 20:00 UTC (22 hours).
During that period, an exceptionally large number of events was sent to the
EL server for a given schema. The system could not handle them properly, and
this caused da
Thanks Sean!
On Thu, Apr 30, 2015 at 1:21 AM, Sean Pringle
wrote:
> Hi
>
> analytics-store tmp space filled up today with many large temporary
> tables (it was ~32G) from many slow research queries. Those had to be
> killed, the database process restarted, and tmp space expanded.
>
> It's back u
+1 'last'
Hi Analytics,
We have found a problem that has been affecting the EventLogging data for
one month. Since March 22, 2015 there have been several gaps (of around 1-2
hours each) without data in all schema tables.
You can see the details in the following links:
Phabricator task:
https://phab
Sean and list,
I think we found the problem:
The data loss is happening within EL consumer code.
The error was skillfully dodging the logs, sorry for that.
The root cause is that the db insertion takes too long to keep up
with the rate of incoming events, and the events buffer gets big.
When big
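
The backlog mechanism described in this message can be sketched with a toy queue model: when events arrive faster than the database can insert them, the consumer's buffer keeps growing. The function `simulate_buffer` and all rates below are made up for illustration; this is not EventLogging code and says nothing about what the real consumer does once the buffer is large.

```python
from typing import List


def simulate_buffer(arrival_per_sec: int, insert_per_sec: int, seconds: int) -> List[int]:
    """Return the buffer size after each second of a simple queue model."""
    buffered = 0
    sizes = []
    for _ in range(seconds):
        buffered += arrival_per_sec                # new events enter the buffer
        buffered -= min(buffered, insert_per_sec)  # the consumer drains what it can
        sizes.append(buffered)
    return sizes


if __name__ == "__main__":
    # Hypothetical rates: inserts slower than arrivals, so the backlog grows.
    print(simulate_buffer(arrival_per_sec=100, insert_per_sec=80, seconds=5))
    # Hypothetical rates: inserts keep up, so the buffer stays empty.
    print(simulate_buffer(arrival_per_sec=50, insert_per_sec=80, seconds=5))
```

In this model the backlog grows linearly with the difference between the two rates, which is why a sustained burst of events matters more than a short spike.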
>
> Thanks. If possible, can we have:
>
> - The exact INSERT statements issued by the MySQL consumer
> - The UUID values generated for those records
>
I'll try to get them, sure.
> > I followed the master-slave replication lag for some hours, and perceived a
> > pattern in the lag: It gets prog
ive" 2 times, so that's
definitely not a conclusive statement. But, there's this hypothesis that
the two problems are related.
Sean, I hope that helps answer your questions.
Let us know if you have any ideas on this.
Thank you!
Marcel
On Tue, Apr 14, 2015 at 9:15 PM, Marcel Ruiz For
Sean, thanks for the quick response:
> We have a binary log on the EL master that holds the last week of
> INSERT statements. It can be dumped and grepped, e.g. looking at
> 10-minute blocks around 2015-04-13 16:30:
>
Good to know!
Zero(!) during 10min after 16:30 doesn't look good. This means th
Hi Sean,
Here's Marcel from Analytics.
I'd like to discuss with you some strange behavior that we've observed on the
EventLogging database (m4-master.eqiad.wmnet).
1) There are some time spans where there is no data in any table. Examples
follow:
- 2015-04-09 17:20 -> 18:35
- 2015-04-11 03:
Dario,
All kinds of event logs were affected.
I updated the documentation.
On Thu, Apr 9, 2015 at 12:42 AM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:
> to clarify: does this affect all logs or client-side logs only?
>
> On Apr 8, 2015, at 11:13 AM, Aaron Halfaker
> wrote:
>
> Thank
Awesome Oliver, thanks!
Having a look.
On Sat, Mar 21, 2015 at 4:32 PM, Oliver Keyes wrote:
> Alrighty; existing implementation updated with
> https://gerrit.wikimedia.org/r/#/c/198489/ which also exposes the
> isAppPageview method and associates a UDF with it (Marcel, this means
> you'll be abl
Welcome! \o/
On Thu, Feb 19, 2015 at 6:40 PM, Jonathan Morgan
wrote:
> Welcome, Joseph! Glad to have you onboard.
>
> Cheers,
> Jonathan
>
> On Thu, Feb 19, 2015 at 8:57 AM, Aaron Halfaker
> wrote:
>
>> Glad to have you, sir.
>>
>> On Thu, Feb 19, 2015 at 10:08 AM, Dan Andreescu wrote:
>>
Hi,
On Jan 13th 2015 between 22:20 and 23:18 UTC (~1 hour) stat1002 ceased
receiving TSV data from udp2log for the following data streams:
- Mobile requests stream
- Pagecounts-raw
- Requests stream
- Zero requests stream
The cause was routing problems in the firewall introduced by a