As Gergo pointed out, these early results may reflect that our first beta
testers have faster connections than average users. But could
there also be some bots or other traffic which could be distorting the
results?
I know that we are working next on histograms that will give us a better
The timestamp at which the current flow through the funnel began
(will need to be stored in a cookie and reset at loads of step 1)
I would strongly advise against using cookies for this purpose. Cookies
will easily get bloated if we set a precedent of using them to
'support' event logging.
Thanks Aaron, I will try something along these lines.
This avoids the latency concerns mentioned by Nuria, and
it is very flexible - we'll see how painful it is to aggregate the data on
the backend.
So we agree you do not need to use cookies, right? Being a single-page
app you should not need
[gerco] - whenever we display geometric means, we weight by sampling rate
(exp(sum(sampling_rate * ln(value)) / sum(sampling_rate)) instead of
exp(avg(ln(value))))
[gilles] I don't follow the logic here. Like percentiles, averages should be
unaffected by sampling, geometric or not.
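The weighting gerco describes can be sketched in Python (the sample
value/sampling-rate pairs below are made up for illustration):

```python
import math

# Each event carries a measured value and the sampling rate it was
# collected at (e.g. rate 10 means 1-in-10 sampling). Hypothetical data:
events = [(120.0, 10), (340.0, 10), (95.0, 100)]

# Plain geometric mean: exp(avg(ln(value))) -- ignores sampling.
plain = math.exp(sum(math.log(v) for v, _ in events) / len(events))

# Weighted geometric mean, as gerco describes:
# exp(sum(rate * ln(value)) / sum(rate))
weighted = math.exp(
    sum(r * math.log(v) for v, r in events) / sum(r for _, r in events)
)
```

Here the heavily sampled 95.0 value dominates the weighted mean, which is
exactly the point of the correction.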
Not to hijack the thread, but: to do this in the schema itself confuses the
structure of the data
with the mechanics of its use. I think having a couple of helpers in
JavaScript and PHP
for simple random sampling is sufficient.
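Such a helper could be as small as this (a Python sketch; the JS/PHP
helpers ori mentions would mirror it, and the name in_sample is made up):

```python
import random

def in_sample(sampling_rate: int) -> bool:
    """Return True for roughly 1 out of every `sampling_rate` calls.

    Events are logged only when this returns True; storing the
    sampling_rate alongside the event lets the backend scale counts
    back up later.
    """
    return random.randint(1, sampling_rate) == 1

# Usage: log the event for ~1% of page loads.
# if in_sample(100):
#     log_event(...)
```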
Much agreed with ori here. We would be bloating the schema with
[gerco] From action events, we were getting about 15M a day,
and we only use them to show total counts (daily number of clicks etc).
How do we tell when the sampling ratio is right for that?
[gilles] I think you're overthinking it; you seem to be looking for the
perfect figure. Let's start with an
It would help if limn set up an empty robots.txt instead of returning garbage
to search engines. :)
That might help a very small bit, as much of limn is client-side
generated. The core problem is that limn is just a visualization tool;
there is no browsing component, so either you know the endpoint
If someone could document the reasons why the userName is needed on this
schema, that would be great. They can be documented on the schema talk page:
http://meta.wikimedia.org/wiki/Schema_talk:ServerSideAccountCreation
When I looked at this issue early on it was not at all obvious to me why -
if you
Hello,
Just a brief note to let everyone know that the analytics team is hiring.
If you have an interest in analytics, Wikipedia and its sister
projects, we would love to hear from you.
Check our positions and apply:
https://www.mediawiki.org/wiki/Analytics/Research_and_Data#Open_positions
mmm... I am not sure whether 'per schema' reports worked well before. Need
to look at the code and see whether the schema counts are being sent.
Overall counts seem to be working well:
(to public list and cc-ing Nemo)
Hello,
Since the last time we had an increase in throughput in Event Logging, Nemo
had to notify us via e-mail. This is just a brief note to the list to say
that we now have throughput monitoring for event logging and it is working.
We had a throughput spike today that
Team:
I have added some info to wikitech on how to troubleshoot issues with EL
and graphite:
https://wikitech.wikimedia.org/wiki/EventLogging#Fix_graphite_counts_not_working.3F
https://wikitech.wikimedia.org/wiki/EventLogging#Graphite
Thanks,
Nuria
Hello,
We have restored per schema monitoring for Event Logging in graphite.
Users of the Event Logging system can use the schema monitoring to see how
big (or small) their usage of EventLogging is compared to the total
throughput of events.
See for example the overall rate of incoming
Gerco, I was trying to access: http://multimedia-metrics.wmflabs.org/ but
no luck.
There is a third choice as far as I can see (my team needs to double-check
me on this).
You could have a metric in wikimetrics that harvests the data you are
interested in from the enwiki, eswiki and arwiki databases
[Steven] Considering the pain and suffering Limn causes us, this seems
like an interesting
[Steven] avenue to explore for internal dashboard needs.
So true. It sure causes me pain and suffering seeing every js library known
to mankind being used there. :)
We will definitely take a look at the
, 2014 at 8:03 PM, Steven Walling swall...@wikimedia.org
wrote:
On Thu, Jul 10, 2014 at 10:40 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Please take a look at the prototype of the editor vital signs dashboard
as that makes the point of what we are doing in the near term:
http
Hello everyone,
Just an FYI that we gave a talk yesterday about the hadoop infrastructure
we have recently set up in production to receive and store pageview data.
Talk is about 25 minutes long and recording is available here:
https://plus.google.com/u/0/events/c53ho5esd0luccd09a1c30rlrmg
a) aim to track total users who enable/disable Media Viewer, rather than
just events
b) switch to a 3-state preference setting: enabled / disabled / default
c) try to measure the total number of users in each group (instead of
daily events)
I assume we are talking about logging stuff for logged
(sending to public analytics list plus people with whom we have talked
about dashboard technologies in the past)
Team:
As you know, we are building a dashboard to showcase editor engagement
metrics and to explore replacement of our current dashboarding technology.
We have spent time researching
Hackathon.
My Hackathon wish is to duplicate and reapply what Nuria Ruiz and
Andrew Otto have done for the NARA analytics pilot.
https://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset_Project/NARA_analytics_pilot
So to your knowledge, is it feasible to do so, in terms of (a) setting
up
Team,
I have updated the reportcard instructions on how to generate the
reportcard from the files Erik Z sends.
https://www.mediawiki.org/wiki/Analytics/ReportCard
Thanks,
___
Analytics mailing list
Analytics@lists.wikimedia.org
Should we be the ones taking care of it? I'm not sure that the DB
credentials I currently have can delete content.
Neither can the ones we have. In the absence of a regular cleanup process
(which is on our team to do) I think we just have to ask Sean Pringle
to delete the data.
If anyone knows
Please correct/amend as needed:
http://www.mediawiki.org/wiki/Extension:EventLogging/sendBeacon#Meeting_notes_for_10.2F3_meeting
Thanks to everyone attending
We can automate purging using the MariaDB Event Scheduler[1] if
you guys want a once-off, set-and-forget solution. Eg
This sounds great for all the tables discussed on the thread. Is it easy to
add tables to that procedure?
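Sean's Event Scheduler suggestion amounts to a scheduled DELETE. A minimal
sketch of the mechanics, using sqlite3 as a stand-in for MariaDB (table
name, dates and the 90-day cutoff are all made up; the CREATE EVENT syntax
in the comment is the MariaDB form):

```python
import sqlite3

# The periodic purge a MariaDB event would run looks roughly like:
#   CREATE EVENT purge_old_events
#   ON SCHEDULE EVERY 1 DAY
#   DO DELETE FROM SomeSchema_1234
#      WHERE timestamp < NOW() - INTERVAL 90 DAY;
# (hypothetical table name; sqlite3 stands in for MariaDB below)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2014-01-01"), (2, "2014-09-30"), (3, "2014-10-05")],
)

# The purge itself: drop rows older than the 90-day cutoff.
cutoff = "2014-07-08"  # NOW() - 90 days, hardcoded for the sketch
conn.execute("DELETE FROM events WHERE ts < ?", (cutoff,))
remaining = [row[0] for row in conn.execute("SELECT id FROM events ORDER BY id")]
```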
On Mon, Oct 6, 2014 at 8:22 AM, Sean Pringle
At some point I believe we hope to just, you know. Have a regularly
updated browser matrix somewhere.
I REALLY think this should make it into our goals; if it cannot be done
this quarter it should for sure be done next quarter.
Do we not have more recent data than May?
On Fri, Oct 10, 2014 at
oke...@wikimedia.org wrote:
On 10 October 2014 16:02, Nuria Ruiz nu...@wikimedia.org wrote:
At some point I believe we hope to just, you know. Have a regularly
updated browser matrix somewhere.
I REALLY think this should make it into our goals, if it cannot be done
this quarter it should
to the newly updated version.
On Fri, Oct 10, 2014 at 9:59 PM, Oliver Keyes oke...@wikimedia.org wrote:
Woah! Nice :D How are definitions updates handled?
On 10 October 2014 18:58, Nuria Ruiz nu...@wikimedia.org wrote:
1. A UDF for ua-parser or whatever we decide to use (this will possibly
(with preliminary data) is
that neither 2.1 nor 2.2 amount to 1% of traffic to the mobile site
On Fri, Oct 17, 2014 at 2:37 PM, Christian Aistleitner
christ...@quelltextlich.at wrote:
Hi,
[ leaving other things in this thread aside ]
On Thu, Oct 16, 2014 at 07:15:03PM -0700, Nuria Ruiz wrote:
iOS
The pngs do not render for me, but have you seen so-called treemap plots
for representing screen size in the user base? They are very self-descriptive.
Here is a famous one for android devices and screen sizes, scroll down a
bit for device fragmentation:
Hello,
To comply with our privacy policy we are going to purge logs in 1002 that
are older than 90 days. Please let us know whether this is an issue. We
hope to have these changes done by the end of next week.
A concrete example:
Logs in, for example, the eventlogging archiving directory:
there affect what logs we have stored in the DB? Is
this an intermediate log storage place, a canonical one, etc.?
What will we no longer be able to do after it is pruned?
-Aaron
On Thu, Oct 30, 2014 at 2:35 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Also, I'm not clear on the significance of the EL
),
and whether the event validates against the schema. For the sample output
you pasted earlier, or another sample output, can you let us know if the
validation section shows Valid?
Leila
On Mon, Nov 10, 2014 at 3:24 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Joel,
For questions like these going forward
come in as of late, which could point to an issue on the
setup. I will look into it some more.
Thanks,
Nuria
On Wed, Nov 12, 2014 at 10:40 AM, Nuria Ruiz nu...@wikimedia.org wrote:
To keep archives happy: the Beta setup posts events to
http://bits.beta.wmflabs.org/event.gif
http
Foundation
jsahl...@wikimedia.org
On Nov 13, 2014, at 9:42 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Hello,
Taking my last statement back: I asked Yuvi and beta does have a varnish
instance, so the flow of EL events should be the same as production.
Now I looked on deployment-eventlogging02
the issue, and the fix is awaiting approval from ops. Let's touch base
tomorrow to see if we see
events.
Leila
On Thu, Nov 13, 2014 at 1:30 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Joel:
I see; I was hoping to set aside the beta issues, but if you are not
deploying to prod any time soon I guess
be
appreciated (maybe get the data in a way we could use some quick
d3-based tool http://code.shutterstock.com/rickshaw/?).
Thanks
Pau
On Mon, Nov 17, 2014 at 8:38 AM, Joel Sahleen jsahl...@wikimedia.org
wrote:
On Nov 17, 2014, at 9:13 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Since event
it was not
possible to get things ready in advance. I find this approach could be
problematic, but I'm happy to follow the Analytics advice on this.
In any case, as said before, this is worth checking with product.
Pau
On Mon, Nov 17, 2014 at 12:17 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Joel,
Please
Team:
Besides the ability to test in beta labs and the monitoring that ori
highlighted, the incoming raw stream of events is available on 1003/1002 on
port 8600.
From 1002 or 1003 you can run: zsub vanadium.eqiad.wmnet:8600 and see the
incoming stream.
I am not sure that something beyond that
But I see that meanwhile a Phabricator task got added, and I guess I
am alone with my judgement :-)
Actually, I fully agree with you that no more infrastructure in this regard
is needed, and I think we were a little fast filing tasks here. I really
think that every time we find ourselves testing in
Also keeping two systems active could lead to requests going into two
places
Yes, this will certainly happen.
On Mon, Dec 15, 2014 at 10:53 AM, Grace Gellerman ggeller...@wikimedia.org
wrote:
Should we talk more about this in our Research staff meeting on Tuesday?
I agree that we need to
QA in beta labs is good but not enough. We still need to do QA when a
feature goes to production and currently
This is true, but at the same time I do not see anything in the description
of your FF events that could not be tested on beta-labs. If we are talking
about ad-block, that can be tested even
(sending to public list)
I have started a doc in wikitech that describes an oozie 101 example and
goes a little into how to troubleshoot oozie jobs. Still WIP.
Will update as work progresses:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Oozie
Please edit/correct as needed.
Adding mobile tech so they are aware; I am guessing we need to query for
that data in a more efficient fashion.
On Mon, Dec 22, 2014 at 4:10 AM, Sean Pringle sprin...@wikimedia.org
wrote:
Had to kill queries, lest analytics-store grind to a halt and take even
longer to recover.
These ones:
As Kevin is on vacation I have lowered priority to Normal for the tasks we
are not working on in the immediate future, but left the other two at
highest. Note that while those tickets do not have updates, related tickets
do. The updates are visible going through the 'blocked by' section.
Thanks,
Hello,
The more important question is where your data will come from: event
logging? graphite? elsewhere? Visualization is secondary to this.
EventLogging is a good solution for structured, somewhat complex
application data; graphite is a good solution for plain counters, which is
well
Team,
Christian just let me know about operator precedence in Hive. Everyone
writing queries should read about this, as precedence is not what you
might expect and your query might end up taking forever, making other
users unhappy.
.
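The precedence gotcha Nuria mentions: in Hive, as in most languages, AND
binds tighter than OR, so a OR b AND c parses as a OR (b AND c), not
(a OR b) AND c. Python's or/and have the same precedence, which makes for a
quick illustration (the variable names are made up):

```python
# An unparenthesized filter can match far more rows than intended:
# in Hive, `a OR b AND c` means `a OR (b AND c)`.
is_pageview, year_ok, month_ok = True, False, False

unparenthesized = is_pageview or year_ok and month_ok    # parses as a or (b and c)
intended        = (is_pageview or year_ok) and month_ok  # what was probably meant
```

With these values the unparenthesized form is True while the intended form
is False, so the unparenthesized query would scan and return rows it should
have filtered out.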
These will be collected and dumped separately, as per
https://www.mediawiki.org/wiki/Requests_for_comment/Media_file_request_counts
.
Erik
From: analytics-boun...@lists.wikimedia.org
[mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Nuria
Ruiz
Sent: Wednesday
Team:
EL dropped events for about 8 hours last night. The analytics team
shall work on backfilling that data. Here is the backlog item associated
with that task:
https://phabricator.wikimedia.org/T88692
Thanks,
Nuria
/Media_file_request_counts
.
Erik
From: analytics-boun...@lists.wikimedia.org
[mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Nuria
Ruiz
Sent: Wednesday, February 04, 2015 22:28
To: A mailing list for the Analytics Team at WMF and everybody who
For example, not collecting usage data about certain sections of our
population (e.g. IE10 users where DNT is set by default) means that we
don't know if our software works for them. This isn't free, and in the
long-term, it can have substantial negative effects. If DNT was always
disabled by
that there is a big detachment between user expectations of DNT and
what the protocol actually does, and so we should probably avoid
treating that protocol as a flag.
On 14 January 2015 at 13:45, Nuria Ruiz nu...@wikimedia.org wrote:
For example, not collecting usage data about certain sections
What I find concerning is the idea that a biased subset of our users would
be categorically ignored for this type of evaluation. If you agree with
me that such evaluation is valuable to our users, I think you ought to also
find such categorical exclusions concerning.
Dan has mentioned a possible
in
detail behavior of users that use, say, opera mini (made up example)
On Fri, Jan 16, 2015 at 9:10 AM, Nuria Ruiz nu...@wikimedia.org wrote:
What I find concerning is the idea that a biased subset of our users
would be categorically ignored for this type of evaluation. If you agree
with me
For switchover of writes, we'll need to coordinate an EL consumer restart
to use a new CNAME of m4-master.eqiad.wmnet
This is a configuration change in the EL config plus a small downtime and a
restart (easy). I am not sure how users/passwords are set up in the config
so cc-ing otto to keep him in
Hello,
My 2 cents:
Tracking down scrolling issues (jank) is not easily done, and in that case
the API seems like it might actually help you quantify the performance
gains/losses from making the scrolling experience smoother across your user
base (just an example). Still, it seems a pretty low
UA
detection precision in general. Do you think it's worth getting the UA
distribution for CSS requests and correlating it with the distribution for
page / JS loading?
Gabriel
On Wed, Feb 18, 2015 at 7:17 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Sorry I forgot to address this earlier:
Do you
16, 2015 at 6:38 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Gabriel:
I have run through the data and have a rough estimate of how many of our
pageviews are requested from browsers w/o strong javascript support. It is
a preliminary rough estimate but I think it is pretty useful.
TL;DR
According
.
Kaldari
On Wed, Jan 7, 2015 at 1:27 PM, Ryan Kaldari rkald...@wikimedia.org
wrote:
Ah, sorry, I was looking on the wrong server (deployment-bastion). Thanks!
On Wed, Jan 7, 2015 at 1:21 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Ahem they are there:
nuria@deployment-eventlogging02:/var/log
to then visualize
the information?
Message: 1
Date: Tue, 30 Dec 2014 07:37:35 -0800
From: Nuria Ruiz nu...@wikimedia.org
To: A mailing list for the Analytics Team at WMF and everybody who
has an interest in Wikipedia and analytics.
analytics@lists.wikimedia.org
Subject: Re
(cc-ing mobile-tech)
Since we do not know the details of how wikigrok is used and its throughput
of requests, we cannot estimate sampling ourselves. I imagine wikigrok has
been deployed to a number of users, and it is with that usage that the
mobile team could estimate the total throughput expected, with
WikiGrok
to 10 out of every 62 users or ~16% (the userToken is a base 62 number).
That should give us an estimated 27 hits per second. Does that work for
everyone?
Kaldari
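Kaldari's figures can be sanity-checked with a few lines (the implied
unsampled total is derived here, not stated in his message):

```python
# userToken starts with one of 62 base-62 characters; sampling users whose
# token falls in 10 of those 62 buckets gives a 10/62 fraction.
fraction = 10 / 62          # ~0.161, i.e. ~16%
sampled_rate = 27           # estimated hits/sec after sampling
implied_total = sampled_rate / fraction   # unsampled events/sec this implies
```

The implied unsampled rate comes out to roughly 167 events/sec, which is
the scale the rest of the thread is debating.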
On Thu, Jan 8, 2015 at 2:06 PM, Nuria Ruiz nu...@wikimedia.org wrote:
We cannot guarantee that with 60 events a sec things
I am not sure if this is quite what you are asking but just in case:
For streaming it is probably easier for you to use the newly created
webrequest tables:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive#Webrequest_Table.28s.29
Those include an isPageview field so requests are
Incident documentation updated:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150107-EventLogging
On Wed, Jan 7, 2015 at 10:58 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Team:
Issues on event logging have been solved; the outage of client side events
(did not affect server side
Ahem they are there:
nuria@deployment-eventlogging02:/var/log/upstart$ ls eventlogging_*log
eventlogging_processor-client-side-events.log
eventlogging_processor-server-side-events.log
On Wed, Jan 7, 2015 at 12:57 PM, Ryan Kaldari rkald...@wikimedia.org
wrote:
It seems the EventLogging
Kaldari:
Expanding a bit on what Dan said:
We took over EL from ori basically 6 months ago. The operational support
analytics provides is documented here:
https://www.mediawiki.org/wiki/EventLogging/OperationalSupport
EL has several parts and while we have not done much development on the mw
0.45% (1.25/sec)
MobileWikiAppSearch 0.41% (1.13/sec)
CentralAuth 0.40% (1.12/sec)
On Wed, Jan 7, 2015 at 5:12 PM, Nuria Ruiz nu...@wikimedia.org wrote:
We're talking about a total of ~170 events per
, 2015 at 10:32 AM, Nuria Ruiz nu...@wikimedia.org wrote:
I believe there is already an EL-Kafka pipeline and this would make it
easy to integrate page views with our regular processing.
Note that the pipeline was disabled 6 months ago and thus my comment in
the near term
https://github.com
Team:
Issues on event logging have been solved; the outage of client side events
(did not affect server side events) lasted about 12 hours.
Please see:
http://picpaste.com/Screen_Shot_2015-01-07_at_10.50.28_AM-NsMSPgHp.png
Thanks,
Nuria
On Wed, Jan 7, 2015 at 3:57 AM, Christian Aistleitner
Roxana: You are correct, the devserver is broken in vagrant at this time.
However that doesn't mean you cannot instrument your code and see events on
console. We shall try to have a patch for the devserver soon but, as I
said, that should not block your development.
Thanks,
Nuria
On Tue, Jan 6,
at 2:09 AM, Andre Klapper aklap...@wikimedia.org
wrote:
On Thu, 2015-02-26 at 11:25 +1000, Sean Pringle wrote:
On Sun, Feb 22, 2015 at 1:20 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Coordination on Monday sounds good.
Did you guys come to any conclusion about vanadium?
Could someone
Ticket for box upgrade is here: https://phabricator.wikimedia.org/T90363
On Mon, Mar 16, 2015 at 10:04 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Did you guys come to any conclusion about vanadium?
Sorry about missing this. Ori has requested two EL hosts, those were
granted two weeks ago
Team:
All work we have been doing thus far with EventLogging is documented in
wikitech:
*- General management of system (restarting, graphite, database)*
https://wikitech.wikimedia.org/wiki/EventLogging
*- Backfilling:*
https://wikitech.wikimedia.org/wiki/EventLogging/Backfilling
*- Beta labs
What I would recommend is using the new data in wmf.webrequests, which
gives you, as you say, about 2.5 months, and filtering the user agent;
there are a couple of UDFs for user agent detection, including
isSpider, which also looks for wikimedia-specific bots that ua-parser
ignores.
So you know
Indeed, we could use this one for a bunch of things.
On Mon, Mar 16, 2015 at 7:25 AM, Andrew Otto ao...@wikimedia.org wrote:
Whoa, kinda cool:
https://github.com/pyr/sqlstream
Maybe useful as a non-intrusive way of getting a change event stream out
of Mediawiki without making application
Thanks much Christian for the writeup.
Should we have icinga alarms around these types of issues? Seems like that
would be the way to go.
Thanks,
Nuria
On Sat, Mar 7, 2015 at 4:00 PM, Andrew Otto ao...@wikimedia.org wrote:
Thanks Christian!
On Mar 7, 2015, at 09:14, Christian Aistleitner
Issues were resolved promptly and analytics team shall backfill client
side events that were dropped on the 20th as a result of the outage.
This work is now completed.
On Mon, Mar 23, 2015 at 10:16 AM, Nuria Ruiz nu...@wikimedia.org wrote:
Hello,
Eventlogging had some issues on March 20th
Erik has asked me to write an exploratory app for user-agent data. The
idea is to enable Product Managers and engineers to easily explore
what users use so they know what to support. I've thrown up an example
screenshot at http://ironholds.org/agents_example_screen.png
I cannot speak as to the
there's a fresh start with caching.
/braindump
— Timo
On 18 Feb 2015, at 18:07, Nuria Ruiz nu...@wikimedia.org wrote:
Do you think it's worth getting the UA distribution for CSS requests
and correlating it with the distribution for page / JS loading?
Yes, we can do that. I would need to gather
Note that a couple of days' worth of traffic might be more than 1 billion
requests for javascript on bits.
Sorry, correction: a couple of days' worth of javascript bits requests
comes to 100 million requests, not 1000 million.
On Sun, Mar 1, 2015 at 4:35 PM, Nuria Ruiz nu...@wikimedia.org wrote
If I remember correctly, Chris had the maxmind db on github with a script
that updates it and commits changes, thus making it possible to play back
time and get the state of the db as it was when that data was calculated.
I think Dan has that script running in cron in his homedir; if we could
I favor a URL solution because I think it is easier to parse and maintain.
https://en.wikipedia.org/wiki/ref=app/Barack_Obama;
I also think the supported set of reftags should be very short, for
example: https://aws.amazon.com/marketplace/help/201349870
Note that Varnish supports url rewriting:
Aha, so if we never hit the read-mode Varnishes we can ignore anything
about this? Great.
The answer .. ahem .. would be no. Not really. But you probably knew that.
I think James has a point in saying that it is not so easy to see what
might affect requests; I certainly agree given the e-mails I
CC-ing Ori. He mentioned he was given a box today but no further details.
Thanks,
Nuria
On Wed, Feb 25, 2015 at 5:25 PM, Sean Pringle sprin...@wikimedia.org
wrote:
On Sun, Feb 22, 2015 at 1:20 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Coordination on Monday sounds good.
Did you guys come
If there’s no other objection, we can safely fold this under the
discussion of long-term options and go ahead with the proposed
implementation, per Dan.
I think there are some technical issues to be ironed out, right?
1. How are we doing so a request like:
whether the hits to the beacon URI are picked up by
varnishkafka or not at the moment, since he set up the endpoint.
On Wed, Mar 18, 2015 at 3:42 PM, Nuria Ruiz nu...@wikimedia.org wrote:
Gilles:
And we know this data is coming via varnishkafka into the cluster, right?
Did we check
look like everything is making it to the DB, I'll keep
investigating tomorrow.
On Wed, Jan 28, 2015 at 5:43 PM, Nuria Ruiz nu...@wikimedia.org
wrote:
Gilles:
This event has a pretty constant rate of input:
http://graphite.wikimedia.org/render/?width=588height=311_salt=1422494956.516from=00
Hello,
Eventlogging had some issues on March 20th due to an inflow of client side
events higher than the system can support. Inflow was due to the new
instrumentation deployed for Wikitext to be able to compare Wikitext usage
with Visual editor usage.
Issues were resolved promptly and analytics
Sorry, this should be:
Mobile web beta does not have any special url. It is triggered by a
cookie.
If the COOKIE that identifies 'mobile-web-beta' is stripped off in varnish
(something you can ask your devs about)...
On Thu, Apr 2, 2015 at 5:05 PM, Nuria Ruiz nu...@wikimedia.org wrote:
(cc-ing
Please cc analytics@ so the whole team sees this requests.
On Wed, Apr 22, 2015 at 3:09 PM, Dan Garry dga...@wikimedia.org wrote:
Hey Kevin,
Task for your attention: T96926 https://phabricator.wikimedia.org/T96926
The following patches are ready to be merged in the iOS and Android apps
Wednesday to write
this
up.
-Adam
On Mon, Apr 20, 2015 at 8:14 PM, Nuria Ruiz nu...@wikimedia.org
wrote:
Ping ...
On Fri, Apr 17, 2015 at 7:45 AM, Adam Baso ab...@wikimedia.org
wrote:
Sure thing. Dan and Bernd I'll sync up with you on this.
On Fri, Apr
This sounds like the fixes we did last quarter to the batch insertion
basically hid the problem instead of making it go away.
I think we are mixing things here, when we had issues with batching code we
never saw a pattern of no-events-whatsoever-in-any-table for an hour. We
saw events dropped in
Some things to have in mind:
1) Bots
AND user_agent_map['device_family'] &lt;&gt; 'Spider'
Doesn't remove all bots, only very prominent ones, so stats still include
traffic from say, wmf robots, for example.
2) Sampling:
Strangely, the event-logs for specific actions showed much higher traffic
for
Given that the batching code has been deployed since earlier (March 16th)
than the 1st event listed by Marcel (April 9th), and since then we have
swapped the EL box (April 3rd/4th), we probably want to look at system
issues.
In my opinion it is probably easier to see with tcpdump whether inserts are
Anyone know what powers, or more correctly what *should* power, the
MediaWiki: MobileFrontend dashboard [0]? I'm hoping that it's data from
the NavigationTiming extension but I've been known to be wrong.
I do not think so. You can see it is data reported to graphite if you look
at the network
>Are there any related Phabricator task IDs to share?
Any analytics task marked as {slug}. For example:
https://phabricator.wikimedia.org/search/query/lVxHj15dmctY/
On Fri, Oct 23, 2015 at 10:28 AM, Andre Klapper
wrote:
> On Fri, 2015-10-23 at 09:01 -0400, Dan Andreescu
> On Wed, Oct 21, 2015 at 1:49 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
>
>> >What was the motivation for this change? Just looking for possible
>> automata?
>> Right. The motivation was to see if the absence of cookies works as a
>> cheap
Team:
As of today, incoming request data includes an extra bit of information in
the X-Analytics header.
If an incoming request to any wikipedia project has no cookies whatsoever,
it will be tagged with nocookie=1. A request without any cookies could
correspond to a fresh browser session, a user
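Downstream consumers could pick the flag out of the header along these
lines (a sketch; it assumes X-Analytics is a semicolon-delimited list of
key=value pairs, and the helper name is made up):

```python
def parse_x_analytics(header: str) -> dict:
    """Split an X-Analytics header like 'nocookie=1;https=1' into a dict."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {k.strip(): v.strip() for k, v in pairs}

tags = parse_x_analytics("nocookie=1;https=1")
is_cookieless = tags.get("nocookie") == "1"
```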
Hello!
The analytics team is planning to give a presentation about the Pageview
API we are working on at the developer summit (we are hoping to announce
the API pretty soon).
Please feel free to add to the ticket use cases you would like to talk
about regarding pageView API or any discussion
Hello!
The analytics team wishes to announce that we have finally transitioned
several of the pageview reports in stats.wikimedia.org to the new pageview
definition [1]. This means that we should no longer have two conflicting
sources of pageview numbers.
While we are not fully done