I have been asking this question informally for too long, so here goes the
formal request:
Metrics about the external use of the Wikimedia APIs
https://phabricator.wikimedia.org/T102079
We need them and, in fact, an outsider would be very surprised that
we don't have them today and we
Cross-posting to analytics. Props to Vibha for asking for the data.
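The metrics the task asks for boil down to counting API requests per external client. A minimal sketch of that aggregation, assuming a simplified log format (the field names here are illustrative, not the actual Wikimedia request-log schema):

```python
from collections import Counter

# Illustrative log records; real request logs carry a much richer schema.
requests = [
    {"endpoint": "/w/api.php", "user_agent": "MyBot/1.0"},
    {"endpoint": "/w/api.php", "user_agent": "MyBot/1.0"},
    {"endpoint": "/api/rest_v1/page", "user_agent": "ResearchTool/2.3"},
    {"endpoint": "/w/api.php", "user_agent": "ResearchTool/2.3"},
]

def api_usage_by_client(records):
    """Count API requests per (endpoint, user_agent) pair."""
    return Counter((r["endpoint"], r["user_agent"]) for r in records)

usage = api_usage_by_client(requests)
# Distinct clients per endpoint: a first approximation of "external use".
clients_per_endpoint = Counter(endpoint for endpoint, _ in usage)
```

User-agent counting is only a rough proxy (agents can be spoofed or shared), but it is the kind of slice an initial dashboard could start from.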
-- Forwarded message --
From: *Adam Baso*
Date: Wednesday, June 10, 2015
Subject: Some data on apps and web
To: mobile-l
Hi all, thought I'd share some data from a few queries around apps uniques
and apps + web
On Mon, Jun 8, 2015 at 3:57 PM, Erik Bernhardson wrote:
> Searching around I saw some discussion about this almost a year ago, in
> May 2014, before sendBeacon support was added (in Nov 2014),
> titled "[Analytics] Using EventLogging for funnel analysis". There it was
> proposed to push the event
Probably, on the Discovery team mailing list.
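For context on the "push the event" idea: `navigator.sendBeacon` lets a browser flush queued funnel events reliably on page unload. A toy sketch of the payload side, using an EventLogging-style envelope (the field names are the general EventLogging shape, not the exact production schema):

```python
import json

def make_event(schema, revision, event, wiki="enwiki"):
    """Wrap one funnel step in an EventLogging-style envelope.
    Treat these field names as illustrative assumptions."""
    return {"schema": schema, "revision": revision, "wiki": wiki, "event": event}

# A browser would queue steps and send them with navigator.sendBeacon()
# on unload; here we only build and serialize the payload it would send.
steps = [
    make_event("SearchFunnel", 1, {"action": "start", "session": "abc"}),
    make_event("SearchFunnel", 1, {"action": "click", "session": "abc"}),
]
payload = json.dumps(steps)
```

The advantage over a plain XHR is that the beacon survives navigation away from the page, which is exactly the failure mode that breaks funnel analysis.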
On 10 June 2015 at 14:56, Pine W wrote:
> Question about "the budget this year has ensured, at least for Discovery,
> that ops and hardware support are slashed to the bone." I'm trying to figure
> out the paradox of hiring more people for Discovery at
Question about "the budget this year has ensured, at least for Discovery,
that ops and hardware support are slashed to the bone." I'm trying to
figure out the paradox of hiring more people for Discovery at the same time
that ops and hardware support are reduced. Can someone explain?
Thanks,
Pine
At the moment I don't have specific questions because we're trying to
just get the thing set up. But, wider context and a prediction:
The budget this year has ensured, at least for Discovery, that ops and
hardware support are slashed to the bone. Because of this we're
deploying bigger and bigger t
I think this thread is a bit too vague. If piwik is woefully inadequate,
then what kind of analysis is needed for the use cases you're talking
about? It doesn't seem obvious that we need endlessly scalable systems
like Hadoop to analyze data gathered by small and fairly limited virtual
machines.
On 10 June 2015 at 12:00, Andrew Otto wrote:
> HmMmm.
>
> There’s no reason we couldn’t maintain beta level Kafka + Hadoop clusters in
> labs. We probably should! I don’t really want to maintain them myself, but
> they should be pretty easy to set up using hiera now. I could maintain them
> if n
On 10 June 2015 at 11:35, Dan Andreescu wrote:
>
>
> On Wed, Jun 10, 2015 at 11:02 AM, Oliver Keyes wrote:
>>
>> On 10 June 2015 at 10:53, Dan Andreescu wrote:
>> > I see three ways for data to get into the cluster:
>> >
>> > 1. request stream, handled already, we're working on ways to pump the
HmMmm.
There’s no reason we couldn’t maintain beta level Kafka + Hadoop clusters in
labs. We probably should! I don’t really want to maintain them myself, but
they should be pretty easy to set up using hiera now. I could maintain them if
no one else wants to.
Thought two:
> "so
> when does n
On Mon, 2015-04-27 at 11:28 -0700, Dan Andreescu wrote:
> Sounds to me like the nuance we were trying to go for is causing
> confusion. This is unintended and my opinion is that we should
> remove maybe-analytics and just tell everyone to use blocked-on
> -analytics as liberally as they wish.
I
On Wed, Jun 10, 2015 at 11:02 AM, Oliver Keyes wrote:
> On 10 June 2015 at 10:53, Dan Andreescu wrote:
> > I see three ways for data to get into the cluster:
> >
> > 1. request stream, handled already, we're working on ways to pump the
> data
> > back out through APIs
>
> Awesome, and it'd end u
On 10 June 2015 at 10:53, Dan Andreescu wrote:
> I see three ways for data to get into the cluster:
>
> 1. request stream, handled already, we're working on ways to pump the data
> back out through APIs
Awesome, and it'd end up in the Hadoop cluster in a table? How...do we
kick that off most easi
I see three ways for data to get into the cluster:
1. request stream, handled already, we're working on ways to pump the data
back out through APIs
2. Event Logging. We're making this scale arbitrarily by moving it to
Kafka. Once that's done, we should be able to instrument pretty much
anything
Hey all,
We're building a lot of tools out on Labs. From a RESTful API to a
Wikidata Query Service, we're making neat things and Labs is proving
the perfect place to prototype them - in all but one respect.
A crucial part of these tools being not just useful but measurably
useful is the logs bei
If we are going to completely denormalize the data sets for anonymization,
and we expect only slice-and-dice queries against the database,
I think we wouldn't take much advantage of a relational DB,
because it wouldn't need to aggregate values or slice and dice:
all slices and dices would be precomputed, r