Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-25 Thread Dan Andreescu
Theoretically we should be able to request the Event Logging endpoint URI from anywhere. But I don't know how CORS is set up on that endpoint after this recent change. On Thu, Jun 25, 2015 at 1:01 PM, Oliver Keyes wrote: > Gotcha. And we can put EL on labs? > > On 25 June 2015 at 09:56, Dan And

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-25 Thread Oliver Keyes
Gotcha. And we can put EL on labs? On 25 June 2015 at 09:56, Dan Andreescu wrote: > Update on this: > > * Piwik is not finding a lot of love. The readership team is working on > puppetizing it and we theoretically have hardware to run it, but we haven't > decided it's a good idea for Analytics t

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-25 Thread Dan Andreescu
Update on this: * Piwik is not finding a lot of love. The readership team is working on puppetizing it and we theoretically have hardware to run it, but we haven't decided it's a good idea for Analytics to support this yet. * We're a (bit?) more optimistic about parallel Event Logging processors.

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
Probably, on the Discovery team mailing list. On 10 June 2015 at 14:56, Pine W wrote: > Question about "the budget this year has ensured, at least for Discovery, > that ops and hardware support are slashed to the bone." I'm trying to figure > out the paradox of hiring more people for Discovery at

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Pine W
Question about "the budget this year has ensured, at least for Discovery, that ops and hardware support are slashed to the bone." I'm trying to figure out the paradox of hiring more people for Discovery at the same time that ops and hardware support are reduced. Can someone explain? Thanks, Pine

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
At the moment I don't have specific questions because we're trying to just get the thing set up. But, wider context and a prediction: The budget this year has ensured, at least for Discovery, that ops and hardware support are slashed to the bone. Because of this we're deploying bigger and bigger t

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Dan Andreescu
I think this thread is a bit too vague. If piwik is woefully inadequate, then what kind of analysis is needed for the use cases you're talking about? It doesn't seem obvious that we need endlessly scalable systems like Hadoop to analyze data gathered by small and fairly limited virtual machines.

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
On 10 June 2015 at 12:00, Andrew Otto wrote: > HmMmm. > > There’s no reason we couldn’t maintain beta level Kafka + Hadoop clusters in > labs. We probably should! I don’t really want to maintain them myself, but > they should be pretty easy to set up using hiera now. I could maintain them > if n

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
On 10 June 2015 at 11:35, Dan Andreescu wrote: > > > On Wed, Jun 10, 2015 at 11:02 AM, Oliver Keyes wrote: >> >> On 10 June 2015 at 10:53, Dan Andreescu wrote: >> > I see three ways for data to get into the cluster: >> > >> > 1. request stream, handled already, we're working on ways to pump the

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Andrew Otto
HmMmm. There’s no reason we couldn’t maintain beta level Kafka + Hadoop clusters in labs. We probably should! I don’t really want to maintain them myself, but they should be pretty easy to set up using hiera now. I could maintain them if no one else wants to. Thought two: > "so > when does n

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Dan Andreescu
On Wed, Jun 10, 2015 at 11:02 AM, Oliver Keyes wrote: > On 10 June 2015 at 10:53, Dan Andreescu wrote: > > I see three ways for data to get into the cluster: > > > > 1. request stream, handled already, we're working on ways to pump the > data > > back out through APIs > > Awesome, and it'd end u

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
On 10 June 2015 at 10:53, Dan Andreescu wrote: > I see three ways for data to get into the cluster: > > 1. request stream, handled already, we're working on ways to pump the data > back out through APIs Awesome, and it'd end up in the Hadoop cluster in a table? How...do we kick that off most easi

Re: [Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Dan Andreescu
I see three ways for data to get into the cluster: 1. request stream, handled already, we're working on ways to pump the data back out through APIs 2. Event Logging. We're making this scale arbitrarily by moving it to Kafka. Once that's done, we should be able to instrument pretty much anything

[Analytics] "If it didn't happen in HDFS, it didn't happen"

2015-06-10 Thread Oliver Keyes
Hey all, We're building a lot of tools out on Labs. From a RESTful API to a Wikidata Query Service, we're making neat things and Labs is proving the perfect place to prototype them - in all-but-one-respects. A crucial part of these tools being not just useful but measurably useful is the logs bei