OK, I have now installed Munin (http://munin.projects.linpro.no/), a great
Norwegian product! :)

From my initial look, it does not do the "per process" accounting, but
everything else is there, even hardware interrupts.

    *  gonzo.barmen.nu
          o Disk
                + Filesystem usage (in %)
                + Inode usage (in %)
                + IOstat
          o Mysql
                + MySQL throughput
                + MySQL queries
                + MySQL slow queries
                + MySQL threads
          o Network
                + eth0 errors
                + eth0 traffic
                + Netstat
          o Processes
                + Fork rate
                + Number of Processes
                + VMstat
          o System
                + CPU usage
                + Available entropy
                + Interrupts & context switches
                + Individual interrupts
                + Load average
                + Memory usage
                + File table usage
                + Inode table usage
                + Swap in/out

I can see there are some plugins too, so I will investigate further what
information I can gather, but this is a great start at least.
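
For the missing per-process numbers, a Munin plugin is just an executable
that prints graph metadata when called with "config" and "field.value" lines
otherwise. Below is a minimal sketch in Python, not a tested production
plugin. The WATCHED process names and all function names are assumptions made
for illustration, and it relies on `ps -eo pcpu,comm` being available.

```python
#!/usr/bin/env python3
"""Sketch of a Munin plugin graphing CPU usage per process name."""
import re
import subprocess
import sys

# Hypothetical process names to watch; adjust to what `ps` shows on your box.
WATCHED = ["beam", "python"]


def field_name(name):
    """Munin field names may only contain letters, digits and underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    return "_" + cleaned if cleaned[:1].isdigit() else cleaned


def parse_ps(output, watched):
    """Sum the %CPU column of `ps -eo pcpu,comm` output per watched name."""
    totals = {name: 0.0 for name in watched}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        pcpu, comm = parts[0], parts[1].strip()
        try:
            value = float(pcpu)  # the "%CPU" header line fails here and is skipped
        except ValueError:
            continue
        if comm in totals:
            totals[comm] += value
    return totals


def config_lines(watched):
    """Lines printed when Munin calls the plugin with the `config` argument."""
    lines = [
        "graph_title CPU usage per process",
        "graph_vlabel %",
        "graph_category processes",
    ]
    for name in watched:
        lines.append("%s.label %s" % (field_name(name), name))
    return lines


def value_lines(totals):
    """Lines printed on a normal fetch; None is reported as unknown (U)."""
    out = []
    for name, value in sorted(totals.items()):
        shown = "U" if value is None else "%.1f" % value
        out.append("%s.value %s" % (field_name(name), shown))
    return out


def read_ps():
    """Run ps; return its output, or None if it cannot be run."""
    try:
        return subprocess.check_output(
            ["ps", "-eo", "pcpu,comm"], universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return None


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        print("\n".join(config_lines(WATCHED)))
        return
    output = read_ps()
    if output is None:
        totals = {name: None for name in WATCHED}
    else:
        totals = parse_ps(output, WATCHED)
    print("\n".join(value_lines(totals)))


if __name__ == "__main__":
    main(sys.argv)
```

Note that on Linux the %CPU figure from ps is CPU time divided by the
lifetime of the process, so it shows long-term hogs rather than short spikes.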

-stian


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Norman Rasmussen
Sent: 23 October 2006 12:44
To: PyTransports Discussion
Subject: Re: [py-transports] Unstable servers

On 10/23/06, Stian B. Barmen <[EMAIL PROTECTED]> wrote:
> So now I need help to find the problem. I must admit I suspect one of the
> transports eating cpu away to the point where I cannot use my server no
> more. How can this happen, and how can I control it?

More importantly, which one is acting up!

I seriously suggest setting up something like MRTG [1] or Cacti [2] to
monitor as many aspects of your server as you can.  It will allow you
to track your system resources over time, and possibly see which ones
are getting out of hand.

A suggested sample of stuff to collect:
 - Load average for the server (should be < ~1.0 most of the time)
 - CPU Usage (not as useful as load average, but might as well)
 - Memory Usage/Free
 - Disk Usage
 - Network Interface Bandwidth (this is what MRTG was designed to do,
but a lot of the other stuff can be squashed into MRTG)
 - Per process (ejabberd, transports, etc):
   - CPU Usage
   - Memory Usage/Free
   - Disk Usage
   - Number of connections

This way you should be able to figure out when it died (you'll
probably have to reboot anyway, but you can check the graphs
retroactively).
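
Until proper graphing is in place, even a one-line log written from cron
gives you something to check retroactively. Here is a rough sketch in Python,
assuming a Linux /proc filesystem. The function names and the log format are
made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: log the load average and the busiest processes, one line per run."""
import os
import time


def parse_stat(text):
    """Return (name, cpu_ticks, rss_pages) from the text of /proc/<pid>/stat."""
    # The process name sits in parentheses and may itself contain spaces.
    head, _, tail = text.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()  # fields[0] is the state, i.e. field 3 in proc(5)
    cpu_ticks = int(fields[11]) + int(fields[12])  # utime + stime, cumulative
    return name, cpu_ticks, int(fields[21])        # rss is counted in pages


def snapshot():
    """Collect (pid, name, cpu_ticks, rss_pages) for every readable process."""
    rows = []
    try:
        pids = [p for p in os.listdir("/proc") if p.isdigit()]
    except OSError:
        return rows
    for pid in pids:
        try:
            with open("/proc/%s/stat" % pid) as handle:
                rows.append((int(pid),) + parse_stat(handle.read()))
        except (OSError, IndexError, ValueError):
            continue  # process exited meanwhile, or unexpected format
    return rows


def log_line(now, load1, rows, top=5):
    """Format one log line (UTC timestamp) listing the top CPU consumers."""
    busiest = sorted(rows, key=lambda row: row[2], reverse=True)[:top]
    procs = " ".join("%s[%d]=%d" % (name, pid, ticks)
                     for pid, name, ticks, _rss in busiest)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return "%s load=%.2f %s" % (stamp, load1, procs)


if __name__ == "__main__":
    print(log_line(time.time(), os.getloadavg()[0], snapshot()))
```

The CPU ticks are cumulative since each process started, so the process whose
count grows fastest between two log lines is the one eating the CPU.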

[1] http://oss.oetiker.ch/mrtg/
[2] http://cacti.net/

-- 
- Norman Rasmussen
 - Email: [EMAIL PROTECTED]
 - Home page: http://norman.rasmussen.co.za/
_______________________________________________
py-transports mailing list
[email protected]
http://www.modevia.com/cgi-bin/mailman/listinfo/py-transports
