Re: [Analytics] db1047 & one box to rule them all

2014-05-02 Thread Dario Taraborelli
Hi Gilles, you shouldn’t use “research_prod” if you simply need to perform read-only queries against the slaves (the “research” user is the one you should use instead, at least until we revisit the policy of SQL credentials with ops). I’ll drop you a line off-list with instructions on the crede

Re: [Analytics] db1047 & one box to rule them all

2014-05-02 Thread Gilles Dubuc
Where might I find the credentials of the research user? I only have research_prod's password, which I was using to connect to db1047. That one doesn't seem to work on analytics-store.eqiad.wmnet On Wed, Apr 30, 2014 at 11:48 PM, Ori Livneh wrote: > On Wed, Apr 30, 2014 at 12:58 PM, Ori Livneh

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Ori Livneh
On Wed, Apr 30, 2014 at 12:58 PM, Ori Livneh wrote: > > As something of a consolation prize, "analytics-store.eqiad.wmnet" is >> now open for SELECT queries from the 'research' user. This box: [...] >> - Can replicate eventlogging too (but doesn't yet). >> > > Could we please? :) > (Just realiz

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Ori Livneh
Sean, thanks so much! On Tue, Apr 29, 2014 at 8:20 AM, Sean Pringle wrote: > 1. db1048 has had the eventlogging uuid fields made formally UNIQUE KEY. I > gather Ori will now run some validation against logs to check for remaining > gaps. > This should finish in 16 hours or so. As something of

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Oliver Keyes
Not quite there yet - just pointing to it as a potential blocker to the "let's move everything to Hadoop!" idea (which I fully support). If the goal is to enable research using unified data, but the unified data is more difficult to access than the non-unified data, we probably haven't moved the

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Toby Negrin
I think we'll put everything on Hadoop at some point but we're focusing on the page views now. Regarding the bug - if you're ready to use it I can see if Andrew can install the java package. -Toby > On Apr 30, 2014, at 9:34 AM, Oliver Keyes wrote: > > > > >> On 30 April 2014 06:59, Dan

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Oliver Keyes
On 30 April 2014 06:59, Dan Andreescu wrote: > This is awesome, thank you Sean > >> *This is probably my bad, but I understood the goal to be having a >>> single db containing unified, core tables. So, we'd have one db, with one >>> revision table, that'd have an extra column of "wiki" that den

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Leila Zia
Hi Sean, I am very excited about this. Thank you. :-) Re unified views: On Wed, Apr 30, 2014 at 6:59 AM, Dan Andreescu wrote: > This is awesome, thank you Sean > >> *This is probably my bad, but I understood the goal to be having a >>> single db containing unified, core tables. So, we'd ha

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Dan Andreescu
This is awesome, thank you Sean > *This is probably my bad, but I understood the goal to be having a single >> db containing unified, core tables. So, we'd have one db, with one >> revision table, that'd have an extra column of "wiki" that denoted the >> project the entry referred to. This would

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Dario Taraborelli
~ 30 hours of replag as I write but this is very exciting, thanks Sean! In case you’re wondering, the EventLogging DB is called “log” as the previous one. On Apr 30, 2014, at 11:49 AM, Oliver Keyes wrote: > Whee! > > > On 30 April 2014 02:48, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 1

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Dario Taraborelli
Oliver asked me to confirm that he’s not hallucinating and I too am seeing tables that were not previously visible to the research user on the slaves. Not a big deal and probably a legacy permission issue. On Apr 30, 2014, at 12:05 PM, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 7:17 PM, Oli

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Dario Taraborelli
On Apr 30, 2014, at 8:40 AM, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 12:44 PM, Oliver Keyes wrote: > Okay, so, have tested (to a limited degree. The work I'm doing that involves > the dbs involves eventlogging, so this is mostly me making up excuses to run > queries). Thoughts: > > *We

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Sean Pringle
On Wed, Apr 30, 2014 at 7:17 PM, Oliver Keyes wrote: > There's also "prefstats", the last entry in which is dated 20120321195103. > Either I'm mad or these are legacy tables that were previously excluded > from replication to the analytics slaves (it's probable that I'm mad). > At least log_sear

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Oliver Keyes
Whee! On 30 April 2014 02:48, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 12:44 PM, Oliver Keyes wrote: > >> Okay, so, have tested (to a limited degree. The work I'm doing that >> involves the dbs involves eventlogging, so this is mostly me making up >> excuses to run queries). >> > > eventlo

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Sean Pringle
On Wed, Apr 30, 2014 at 12:44 PM, Oliver Keyes wrote: > Okay, so, have tested (to a limited degree. The work I'm doing that > involves the dbs involves eventlogging, so this is mostly me making up > excuses to run queries). > eventlogging should now be accessible from the One Box. It will still

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Oliver Keyes
There's also "prefstats", the last entry in which is dated 20120321195103. Either I'm mad or these are legacy tables that were previously excluded from replication to the analytics slaves (it's probable that I'm mad). On 30 April 2014 02:13, Oliver Keyes wrote: > Hurrh; that's...weird. So, look

Re: [Analytics] db1047 & one box to rule them all

2014-04-30 Thread Oliver Keyes
Hurrh; that's...weird. So, looking at s1, I now see the same tables as on the One Box, but I swear I haven't seen some of them there before. Examples are exorphans, log_search and filejournal. Aaron, Dario, Leila: am I crazy, or were these not previously there? On 29 April 2014 23:42, Sean Pring

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Sean Pringle
On Wed, Apr 30, 2014 at 3:39 PM, Oliver Keyes wrote: > Also, where are you replicating from? Only there are kind of a lot of > tables here I don't recognise. > From each shard using the same replication streams as the sX-analytics-slaves use. Which tables? -- DBA @ WMF

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Sean Pringle
On Wed, Apr 30, 2014 at 12:44 PM, Oliver Keyes wrote: > Okay, so, have tested (to a limited degree. The work I'm doing that > involves the dbs involves eventlogging, so this is mostly me making up > excuses to run queries). Thoughts: > > *We should probably put in some kind of restrictions around

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Oliver Keyes
Also, where are you replicating from? Only there are kind of a lot of tables here I don't recognise. On 29 April 2014 19:44, Oliver Keyes wrote: > Okay, so, have tested (to a limited degree. The work I'm doing that > involves the dbs involves eventlogging, so this is mostly me making up > excus

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Oliver Keyes
Okay, so, have tested (to a limited degree. The work I'm doing that involves the dbs involves eventlogging, so this is mostly me making up excuses to run queries). Thoughts: *We should probably put in some kind of restrictions around what we care about. For example, I see the tables relating to th

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Oliver Keyes
One word: YAY! Thank you so much for this, Sean :D On 29 April 2014 17:13, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 6:01 AM, Dario Taraborelli < > dtarabore...@wikimedia.org> wrote: > >> Sean, consolation prizes are understated, this is terrific. >> >> I just noticed that centralauth is n

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Sean Pringle
On Wed, Apr 30, 2014 at 6:01 AM, Dario Taraborelli < dtarabore...@wikimedia.org> wrote: > Sean, consolation prizes are understated, this is terrific. > > I just noticed that centralauth is not included, after EventLogging data > this is the most useful database to have replicated on the big one bo

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Dario Taraborelli
Sean, consolation prizes are understated, this is terrific. I just noticed that centralauth is not included, after EventLogging data this is the most useful database to have replicated on the big one box. Dario On Apr 29, 2014, at 6:31 PM, Sean Pringle wrote: > On Wed, Apr 30, 2014 at 1:20 AM

Re: [Analytics] db1047 & one box to rule them all

2014-04-29 Thread Sean Pringle
On Wed, Apr 30, 2014 at 1:20 AM, Sean Pringle wrote: > > As something of a consolation prize, "analytics-store.eqiad.wmnet" is now > open for SELECT queries from the 'research' user. This box: > > - Is a CNAME for dbstore1002.eqiad.wmnet. > Just to be contrary I've already messed with the CNAME t

[Analytics] db1047 & one box to rule them all

2014-04-29 Thread Sean Pringle
Hi! The speed bumps from the eventlogging migration are almost ironed out: 1. db1048 has had the eventlogging uuid fields made formally UNIQUE KEY. I gather Ori will now run some validation against logs to check for remaining gaps. 2. db1046 which died mid-migration has been restored and is catc