> Should we have Icinga alarms around these types of issues? Seems like that
> would be the way to go.
Aside from this, I get daily emails about webrequest partition statuses, and I 
would at least notice the morning after that something is wrong. 



> On Mar 7, 2015, at 21:20, Nuria Ruiz <nu...@wikimedia.org> wrote:
> 
> Thanks much Christian for the writeup.
> 
> Should we have Icinga alarms around these types of issues? Seems like that
> would be the way to go.
> 
> Thanks, 
> 
> Nuria
> 
> On Sat, Mar 7, 2015 at 4:00 PM, Andrew Otto <ao...@wikimedia.org> wrote:
> Thanks Christian!
> 
> 
> > On Mar 7, 2015, at 09:14, Christian Aistleitner <christ...@quelltextlich.at> wrote:
> >
> > Hi,
> >
> > regarding running jobs on the Analytics cluster, I've sometimes seen
> > people say in IRC: “Let's run this heavy job. I'll keep an eye on it”.
> >
> > But more often than not, this seems to have meant:
> > “Let's just run this heavy job and wait. If QChris joins IRC, let's
> > hope he doesn't ping us about having overloaded the cluster.”
> >
> > That's not nice^Wscalable ;-)
> >
> > So just in case someone is vague on how to “keep an eye on it”, I did
> > a short write-up at:
> >
> >  https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load
> >
> > which details how to detect how the cluster is doing at a very high
> > level.
> > In particular, it allows you to detect whether the cluster got stalled,
> > and if it did, it tells you what to do.
> >
> > Have fun,
> > Christian
> >
> > P.S.: The above URL has diagrams! Click the URL!
> >
> > --
> > ---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
> >                           Companies' registry: 360296y in Linz
> > Christian Aistleitner
> > Kefermarkterstrasze 6a/3     Email:  christ...@quelltextlich.at
> > 4293 Gutau, Austria          Phone:          +43 7946 / 20 5 81
> >                             Fax:            +43 7946 / 20 5 81
> >                             Homepage: http://quelltextlich.at/
> > ---------------------------------------------------------------
> > _______________________________________________
> > Analytics mailing list
> > Analytics@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/analytics
> 

