Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-09 Thread Christian Aistleitner
Hi Pine,

On Sat, Mar 07, 2015 at 08:15:18PM -0800, Pine W wrote:
> Chris, may I quote your email on BASH?

They take emails too?

Regardless ... feel free to quote or forward any of my emails wherever
you see fit.

Have fun,
Christian



-- 
 quelltextlich e.U.  \\  Christian Aistleitner 
   Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email:  christ...@quelltextlich.at
4293 Gutau, Austria  Phone:  +43 7946 / 20 5 81
 Fax:+43 7946 / 20 5 81
 Homepage: http://quelltextlich.at/
___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-09 Thread Joseph Allemandou
Thanks a lot Christian :)
I did not mean by any means to overload the cluster last Friday ... but I
did it nonetheless.
Your page on how to 'keep an eye on it' will really be useful!
Cheers
Joseph






-- 
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal


Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-09 Thread Andrew Otto
> Should we have Icinga alarms around these types of issues?  Seems like that
> would be the way to go.

Aside from this, I get daily emails about webrequest partition statuses, so I
would at least notice the morning after that something is wrong.





Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-09 Thread Christian Aistleitner
Hi Andrew,

On Mon, Mar 09, 2015 at 11:54:56AM -0400, Andrew Otto wrote:
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load
> Christian, may I move this page into the Cluster/Hadoop/Administration page?

I think a separate page is worth it as the target audience is
different from the Cluster/Hadoop/Administration page.

But sure. Be Bold. Move it wherever you see fit. :-)

Have fun,
Christian





Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-08 Thread Leila Zia
This is really useful, Christian. Thanks for explaining and documenting it.

Leila



Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-07 Thread Federico Leva (Nemo)

Christian Aistleitner, 07/03/2015 15:14:

> P.S.: The above URL has diagrams! Click the URL!


And with colours! So it's like checking heartbeats, cute. :)

Nemo



[Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-07 Thread Christian Aistleitner
Hi,

regarding running jobs on the Analytics cluster, I've sometimes seen
people say on IRC: “Let's run this heavy job. I'll keep an eye on it”.

But more often than not, this seems to have meant:
“Let's just run this heavy job and wait. If QChris joins IRC, let's
hope he doesn't ping us about having overloaded the cluster.”

That's not nice^Wscalable ;-)

So just in case someone is unsure how to “keep an eye on it”, I did
a short write-up at:

  https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load

which explains how to check, at a very high level, how the cluster is
doing.
In particular, it lets you detect whether the cluster has stalled, and
if it has, it tells you what to do.

Have fun,
Christian

P.S.: The above URL has diagrams! Click the URL!



Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-07 Thread Andrew Otto
Thanks Christian!




Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-07 Thread Nuria Ruiz
Thanks very much, Christian, for the write-up.

Should we have Icinga alarms around these types of issues?  Seems like that
would be the way to go.

Thanks,

Nuria



Re: [Analytics] [Cluster] Monitoring the impact Hive jobs have on the Analytics cluster

2015-03-07 Thread Pine W
Chris, may I quote your email on BASH?

Pine