We had a similar issue.

OPS-Master was a VMware Server VM with 1.5 GB RAM monitoring a total of 380 hosts
(2 slaves monitoring more than half of that total) and the I/O was TERRIBLE.
We switched to OpenVZ containers for all 3 and added a few hundred more hosts,
with no complaints at all.

www.openvz.org - Give it a shot.

-Nick

Date: Thu, 21 Jan 2010 14:19:33 -0500
From: [email protected]
To: [email protected]
Subject: Re: [opsview-users] Believe Opsview master server is getting bogged down but not sure what resources to give it.

Hi James,
We also used to run our master on VMware (ESXi in our case). Our setup is
probably not as big as yours: we monitor 658 hosts/6496 services with 5 slaves.
Our problem was IO. After trying to optimize MySQL (which helped but didn't
solve the issue), we decided to move it to a physical box.

More recently we had to move the DB to a separate box, as IO was becoming a 
problem again. We do, however, graph most of the service checks we have (rrd 
updates = IO) and run Cacti on the same box.
I would definitely recommend you install sysstat and use iostat for a while to 
see if that is your issue.
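
Until sysstat is installed, a rough iowait figure can also be read from the kernel counters in /proc/stat. The following is only a minimal Python sketch, assuming a Linux host; the function names (cpu_times, iowait_percent, sample) are made up for illustration and are not part of sysstat or Opsview.

```python
import time


def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent waiting on IO between two samples."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Field order: user nice system idle iowait irq softirq steal.
    # Only the first eight fields are summed, since guest time is
    # already counted inside user time.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample(interval=5.0, path="/proc/stat"):
    """Take two readings 'interval' seconds apart (Linux only)."""
    def read():
        with open(path) as handle:
            return handle.readline()  # first line is the aggregate 'cpu' line

    before = read()
    time.sleep(interval)
    return iowait_percent(before, read())
```

Calling sample() a few times during a slow period gives a quick indication; iostat remains the better tool because it breaks the numbers down per device.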


Hope this helps,
Rafael


On Thu, Jan 21, 2010 at 12:35 PM, James Whittington <[email protected]> wrote:

Hello fellow Opsview users.

My company has had Opsview in use for a little over a year
now and I think we are starting to see some growing pains as more stuff gets
added.

We are currently using version 3.5.0 on Ubuntu 8.04, with a
reverse tunnel master/slave setup.

MySQL is running on the master server.

We only have 466 hosts and 1557 services but those checks
are distributed across 19 slave servers.

 

The master server is running on a VMware ESX 3.5 host.

    ESX Server CPU hovers between 20-35%
    ESX Server RAM is around 80%

The master Opsview server currently has 2 GB RAM, although
it appears mysql is using about half of it.

Here is a partial listing of top

 

top - 12:17:25 up 1 day,  2:46,  2 users,  load average: 0.25, 0.88, 1.08
Tasks: 186 total,   3 running, 183 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.6%us,  1.7%sy,  0.0%ni, 77.5%id, 19.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   2112044k total,  1928028k used,   184016k free,   103416k buffers
Swap:  1341388k total,   776816k used,   564572k free,   466224k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4557 mysql     20   0 1596m 1.1g 4336 S    1 53.7  44:47.91 mysqld
 4849 root      20   0  2564  920  772 S    1  0.0   5:54.77 vmware-guestd
25640 nagios    20   0  5764 2948 1596 S    1  0.1   0:00.03 update_snmptrap
 2480 root      15  -5     0    0    0 S    1  0.0   4:46.58 kjournald
 5066 nagios    20   0 23336  12m 1328 S    1  0.6  18:21.65 nagios

 

- One core issue we see, other than general slowness, is that some of the
  CGIs just time out. We have some people who still like the old
  statusmap.cgi, but the full map never renders and the CPU spikes to 100%.

- Also, on the new Events View we always get script timeout errors in the
  browser, and the load time of events is pretty slow.

- Also, I see a fair number of timeout errors in the Opsview log:

[2010/01/20 13:56:46] [exec_and_log] [WARN] ssh_exchange_identification: Timeout waiting for version information.
[2010/01/20 15:32:22] [import_ndologsd] [WARN] Import of 1264019531.129820, size=2685022, took 11.43 seconds > 5 seconds
[2010/01/20 15:33:00] [import_ndologsd] [WARN] Import of 1264019546.119956, size=13666, took 34.00 seconds > 5 seconds
[2010/01/20 15:36:15] [import_ndologsd] [WARN] Import of 1264019759.068638, size=2692963, took 15.83 seconds > 5 seconds
[2010/01/20 15:36:45] [import_ndologsd] [WARN] Import of 1264019774.998388, size=18311, took 29.89 seconds > 5 seconds
[2010/01/21 09:28:08] [import_ndologsd] [WARN] Import of 1264084077.245904, size=2696437, took 11.33 seconds > 5 seconds
[2010/01/21 09:28:46] [import_ndologsd] [WARN] Import of 1264084092.143017, size=8507, took 34.52 seconds > 5 seconds
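
To see how often these import warnings occur and how long they take, the log can be summarised with a few lines of Python. This is only a sketch: it assumes each warning sits on a single line in the format quoted above, and the names LINE_RE and slow_imports are invented for illustration rather than taken from Opsview.

```python
import re

# Matches the import_ndologsd warnings quoted above; other lines are ignored.
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] \[import_ndologsd\] \[WARN\] Import of "
    r"(?P<name>\S+), size=(?P<size>\d+), took (?P<secs>[\d.]+) seconds"
)


def slow_imports(lines):
    """Return (timestamp, size_in_bytes, seconds) for each import warning."""
    rows = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            rows.append(
                (match.group("ts"), int(match.group("size")), float(match.group("secs")))
            )
    return rows


def slowest(rows):
    """Return the row with the longest import time, or None if there are none."""
    return max(rows, key=lambda row: row[2]) if rows else None
```

In the excerpt above the longest imports are the small files (for example 13666 bytes in 34.00 seconds), so duration does not seem to track file size. That would be consistent with IO contention, although it is not proof of it.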

 

 

I’m looking for ideas on what the core bottleneck might be. I could add more
memory to the server, I could move MySQL to a different server, or maybe move
some files to a RAM disk?

I’m willing to make upgrades where needed; I just want to make sure I use
resources wisely.

Thanks for any advice you all may have.

 

James Whittington
VC3, Inc.

_______________________________________________

Opsview-users mailing list

[email protected]

http://lists.opsview.org/lists/listinfo/opsview-users



                                          