Re: [Nagios-users] status.cgi very high cpu usage => Problem gone away

2008-02-22 Thread Steve Kieu
Not slow because of many requests; slow because CPU execution is about
30 times slower, effectively running like a 100 MHz i486 CPU (benchmarked on
the console only, with all network NICs disconnected). So no, not because of
Google :-)



Obviously VMware is doing something nasty. When the box is slow and under
high load, the load on the VMware host is still rather low.
Cheers,


On Fri, Feb 22, 2008 at 7:26 PM, Hugo van der Kooij <
[EMAIL PROTECTED]> wrote:

> Steve Kieu wrote:
>
> | The scariest thing is that, as suddenly as it came, the problem went
> | away this morning. No changes were made to the Nagios system or to the
> | VMware guest and hosts.
> |
> | I ran a similar benchmark of status.cgi on a real host, using the same
> | status.dat file as the host with the problem. It takes 0.1 seconds to
> | process there, while on the VMware host it took 3.9 seconds. I had a
> | quick look at how status.cgi parses the text file and saw lots of memory
> | compares (strcmp calls into libc) and moves. So my wild guess is that
> | memory operations on the VM were terribly slow for some reason. And then,
> | as if from nowhere, it just became as fast as normal.
> |
> | Does anyone have a bright idea of what is going on?
>
> You wouldn't happen to own a Google indexing device, now would you? That
> could hammer a web-based server like Nagios.
>
> Hugo.
>
> -
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
> ___
> Nagios-users mailing list
> Nagios-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>



-- 
Steve Kieu
Mob: (+64) 021 250 6437

Re: [Nagios-users] status.cgi very high cpu usage => Problem gone away

2008-02-21 Thread Hugo van der Kooij
Steve Kieu wrote:

| The scariest thing is that, as suddenly as it came, the problem went
| away this morning. No changes were made to the Nagios system or to the
| VMware guest and hosts.
|
| I ran a similar benchmark of status.cgi on a real host, using the same
| status.dat file as the host with the problem. It takes 0.1 seconds to
| process there, while on the VMware host it took 3.9 seconds. I had a
| quick look at how status.cgi parses the text file and saw lots of memory
| compares (strcmp calls into libc) and moves. So my wild guess is that
| memory operations on the VM were terribly slow for some reason. And then,
| as if from nowhere, it just became as fast as normal.
|
| Does anyone have a bright idea of what is going on?

You wouldn't happen to own a Google indexing device, now would you? That
could hammer a web-based server like Nagios.

Hugo.

-- 
[EMAIL PROTECTED]   http://hugo.vanderkooij.org/
PGP/GPG? Use: http://hugo.vanderkooij.org/0x58F19981.asc

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?

Bored? Click on http://spamornot.org/ and rate those images.



Re: [Nagios-users] status.cgi very high cpu usage => Problem gone away

2008-02-20 Thread Steve Kieu
Hi,

The scariest thing is that, as suddenly as it came, the problem went away
this morning. No changes were made to the Nagios system or to the VMware
guest and hosts.

I ran a similar benchmark of status.cgi on a real host, using the same
status.dat file as the host with the problem. It takes 0.1 seconds to
process there, while on the VMware host it took 3.9 seconds. I had a quick
look at how status.cgi parses the text file and saw lots of memory compares
(strcmp calls into libc) and moves. So my wild guess is that memory
operations on the VM were terribly slow for some reason. And then, as if
from nowhere, it just became as fast as normal.

Does anyone have a bright idea of what is going on?

Thanks


-- 
Steve Kieu
Mob: (+64) 021 250 6437

Re: [Nagios-users] status.cgi very high cpu usage

2008-02-19 Thread Steve Kieu
Hello,

> [EMAIL PROTECTED] nagios]$ time ./sbin/status.cgi >foo
>
> real    0m1.390s
> user    0m1.300s
> sys     0m0.090s
> [EMAIL PROTECTED] nagios]$ du -sh foo
> 4.4M    foo
>


A similar benchmark in my case:
nagtst01:/usr/local/nagios/sbin # time ./status.cgi > testdata

real    0m3.553s
user    0m3.316s
sys     0m0.044s
nagtst01:/usr/local/nagios/sbin # ls -l ../var/status.dat
-rw-rw-r-- 1 nagios nagios 849578 2008-02-20 11:02 ../var/status.dat


Compare this with another box we have: the box above is about 30 times
slower, and every part (sys, user) is around 20 times slower. We suspect
there is something wrong with the VMware config in this case. The other
(good) box has around 460 services.

On the other (good) box:

illuminati:/usr/local/nagios/sbin # time ./status.cgi > testdata

real    0m0.364s
user    0m0.104s
sys     0m0.002s
illuminati:/usr/local/nagios/sbin #  ls -l ../var/status.dat
-rw-rw-r--  1 nagios nagcmd 622377 Feb 20 11:03 ../var/status.dat
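
For reference, the ratios between the two runs can be worked out directly
from the posted `time` figures. This is only a minimal sketch in POSIX shell
and awk; the helper names to_seconds and slowdown are made up for this
illustration and are not part of Nagios.

```shell
# Convert a bash `time` value such as 0m3.553s into plain seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower the first timing is than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# nagtst01 (slow VMware box) versus illuminati (good box)
slowdown 0m3.553s 0m0.364s   # real
slowdown 0m3.316s 0m0.104s   # user
slowdown 0m0.044s 0m0.002s   # sys
```

With the numbers posted here this comes to roughly 9.8x for real, 31.9x for
user and 22x for sys, so the "30 times" figure corresponds to user CPU time
rather than wall-clock time. The two status.dat files are not the same size
(849578 versus 622377 bytes), so the comparison is approximate at best.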



> What status view specifically is the issue for you? What version of
> nagios? Unless you're still using nagios-1.x, my gut feeling is that


Nagios 2.9, with some custom code I wrote, but it only changes some of the
HTML tag output; it is trivial and definitely does not affect performance.
The view is requested by NagVis. I did not do the NagVis config, so I am not
sure how many requests NagVis generates, but even for the main Nagios page
status.cgi still takes too much CPU to run.

Thanks for your help

Cheers




>
> there is something outside of nagios causing the issue (e.g. disk IO,
> memory pressure, etc). Nagios-2 should be able to easily handle the
> status data for 650 services. That was an area of significant focus and
> improvement with the current version.
>
> --
> Marc
>



-- 
Steve Kieu
Mob: (+64) 021 250 6437

Re: [Nagios-users] status.cgi very high cpu usage

2008-02-19 Thread Marc Powell


> -Original Message-
> From: Steve Kieu [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 19, 2008 2:32 PM
> To: Marc Powell
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] status.cgi very high cpu usage
> 
>   > Many of the "Monitoring" reports don't work well at volume, I've been
>   > asking users to only use "Unhandled" reports.  You may get better
>   > response in Mozilla, but 'status.cgi' can kill Internet Explorer
>   > because of how it's loading everything in one large list.
>
>   This is a browser rendering issue, nothing to do with the nagios' speed
>   at reading and parsing its status file. To get the html to display
>   status for all 3800 of my services takes under 1.5 seconds --
>
> In my case it is not a browser issue. The problem is that status.cgi is
> slow to return data, not that the data it generates is too big.
>
> 

4.4M of data in 1.39s --

[EMAIL PROTECTED] nagios]$ export REQUEST_METHOD=GET; export QUERY_STRING="host=all"; export REMOTE_USER="mpowell";
[EMAIL PROTECTED] nagios]$ time ./sbin/status.cgi >foo

real    0m1.390s
user    0m1.300s
sys     0m0.090s
[EMAIL PROTECTED] nagios]$ du -sh foo
4.4M    foo
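
The exports matter because a CGI program takes its request from the
environment rather than from command-line arguments: REQUEST_METHOD,
QUERY_STRING and REMOTE_USER are standard CGI variables, and the Nagios CGIs
use REMOTE_USER to decide what the caller is allowed to see. Below is a small
sketch of the same test wrapped in a shell function, so the variables do not
leak into the login shell. run_cgi is a hypothetical helper name, and
nagiosadmin as the default user is only an assumption.

```shell
# Run a CGI binary from the command line the way a web server would,
# passing the request through the standard CGI environment variables.
# Usage: run_cgi /path/to/program.cgi 'query=string' [remote_user]
run_cgi() {
    cgi=$1
    query=$2
    user=${3:-nagiosadmin}   # assumed default; use a contact from your own config
    # The assignments apply only to this one command, not to the calling shell.
    REQUEST_METHOD=GET QUERY_STRING="$query" REMOTE_USER="$user" "$cgi"
}
```

Under bash this could be timed with something like
`time run_cgi ./sbin/status.cgi 'host=all' mpowell > foo`, which takes the
browser and the network out of the measurement entirely.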

> And status.cgi hogs CPU time on the monitoring host (not on the desktop
> viewing it with a browser)
>
> I guess there is something wrong with status.cgi, and even extinfo.cgi
> takes a considerable amount of CPU as well

What status view specifically is the issue for you? What version of
nagios? Unless you're still using nagios-1.x, my gut feeling is that
there is something outside of nagios causing the issue (e.g. disk IO,
memory pressure, etc). Nagios-2 should be able to easily handle the
status data for 650 services. That was an area of significant focus and
improvement with the current version.

--
Marc



Re: [Nagios-users] status.cgi very high cpu usage

2008-02-19 Thread Steve Kieu
>
> >
> > Many of the "Monitoring" reports don't work well at volume, I've been
> > asking users to only use "Unhandled" reports.  You may get better
> > response in Mozilla, but 'status.cgi' can kill Internet Explorer
> > because of how it's loading everything in one large list.
>
> This is a browser rendering issue, nothing to do with the nagios' speed
> at reading and parsing its status file. To get the html to display
> status for all 3800 of my services takes under 1.5 seconds --
>


In my case it is not a browser issue. The problem is that status.cgi is slow
to return data, not that the data it generates is too big.

And status.cgi hogs CPU time on the monitoring host (not on the desktop
viewing it with a browser).

I guess there is something wrong with status.cgi, and even extinfo.cgi takes
a considerable amount of CPU as well.

Thanks

-- 
Steve Kieu
Mob: (+64) 021 250 6437

Re: [Nagios-users] status.cgi very high cpu usage

2008-02-19 Thread Marc Powell


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Justin Hitt
> Sent: Tuesday, February 19, 2008 9:15 AM
> To: Steve Kieu
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] status.cgi very high cpu usage
> 
> Steve,
> 
> On Feb 18, 2008 10:51 PM, Steve Kieu <[EMAIL PROTECTED]> wrote:
> > I have a problem with status.cgi taking up too much cpu so the page is
> > very slow to render. Is there any way to find out where the problem is?
> > We have about 650 services monitored. The output of nagios -s command is
> 
> Many of the "Monitoring" reports don't work well at volume, I've been
> asking users to only use "Unhandled" reports.  You may get better
> response in Mozilla, but 'status.cgi' can kill Internet Explorer
> because of how it's loading everything in one large list.

This is a browser rendering issue, nothing to do with the nagios' speed
at reading and parsing its status file. To get the html to display
status for all 3800 of my services takes under 1.5 seconds --

$ time wget http:///cgi-bin/status.cgi?host=all
--12:59:50--  http:///cgi-bin/status.cgi?host=all
   => `status.cgi?host=all'
Resolving 
Connecting to |:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=>                                  ] 4,540,227     23.92M/s

12:59:51 (23.84 MB/s) - `status.cgi?host=all' saved [4540227]


real    0m1.364s
user    0m0.010s
sys     0m0.000s
$ wc -l status.cgi\?host\=all 
 128601 status.cgi?host=all

> 
> Nagios is at the point where it needs an SQL back end with a more
> modular look at how it stores site data.  Perhaps, rolling status up

Perhaps, but not for speed/performance reasons (outside of long-duration
archive reports), IMHO.

> into summary reports that are queried to create reports then go into
> host tables only when someone drills down into host information.

Isn't this already available? Status Summary, various links from
Tactical Overview, Service Problems, etc. 

> In production you'll want to be on a multi-core multi-threaded
> machine; 2 cores won't do it if you'll have more than one user in the
> system.  Until then, keep users in the "Unhandled" menus around
> "{Service,Host} Problems"

This is best from a workflow perspective, but saying that you need to
have dual cores if you have more than one nagios user is a bit of a dubious
statement. My own experience is that the above test used less than 3% of
a 3.1 GHz Xeon CPU for its duration, even when viewed with a
browser. Granted, it's not a VMware box, but if VMware introduces that
significant a performance impact, I'd abandon it without prejudice.

--
Marc



Re: [Nagios-users] status.cgi very high cpu usage

2008-02-19 Thread Justin Hitt
Steve,

On Feb 18, 2008 10:51 PM, Steve Kieu <[EMAIL PROTECTED]> wrote:
> I have a problem with status.cgi taking up too much cpu so the page is very
> slow to  render. Is there any way to find out where the problem is?
> We have about 650 services monitored. The output of nagios -s command is

Many of the "Monitoring" reports don't work well at volume, so I've been
asking users to only use the "Unhandled" reports.  You may get better
response in Mozilla, but 'status.cgi' can kill Internet Explorer
because of how it loads everything in one large list.

Nagios is at the point where it needs an SQL back end with a more
modular look at how it stores site data.  Perhaps, rolling status up
into summary reports that are queried to create reports then go into
host tables only when someone drills down into host information.

In production you'll want to be on a multi-core multi-threaded
machine; 2 cores won't do it if you'll have more than one user in the
system.  Until then, keep users in the "Unhandled" menus around
"{Service,Host} Problems"

Best,

Justin
-- 
Attention Sales And Marketing Professionals Who Serve B2B Executives
   http://hittpublishingdirect.com/
