No problem Mike. Disk IO is actually easy to see: try "iostat 2". iostat gives I/O statistics (durh :-) ) for the disks in a system. The 2 tells it to wait 2 seconds between reports, and it keeps running until you cancel it. You can also do "iostat 2 5", which gives 5 reports 2 seconds apart. Note that the first report shows averages since boot; the later ones each cover the 2-second interval. The normal output isn't easy to parse, but it is easy on human eyes.
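If you'd rather log it than watch it, here is a rough sketch that stamps each line with the time, so you can line the numbers up with a slowdown afterwards. The function name stamp_lines and the log path are made up for the example, so adjust to taste.

```shell
#!/bin/sh
# Prefix every line read from stdin with the current date and time.
# stamp_lines is a hypothetical helper name, not part of iostat.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example use (the log path is only a placeholder):
#   iostat 2 | stamp_lines >> /tmp/iostat.log
```

Leave it running in the background and, after the next slow period, look at the log lines around that time. The same helper should work for other commands that print line by line.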
--Donald

On Wed, Feb 15, 2012 at 1:20 PM, Dean, Mike <[email protected]> wrote:
> Donald,
>
> Thanks for the tips. Disk IO was my first thought, but I'm not sure of the
> best way to keep an eye on that. Is there a util/command that I can run/log
> to see the disk IO?
>
> Mike
>
>
> On Wed, Feb 15, 2012 at 1:16 PM, The Donald Cowart <[email protected]>
> wrote:
>>
>> Hmmm... based on this I have two ideas about what it might be:
>>
>> It could be something doing a bunch of DNS queries and so processes
>> are waiting for dig results or TCP timeouts, not sure how to prove
>> that though.
>>
>> It might be something compressing/rotating log files all at once, just
>> enough to slow the system down while it's happening. Probably
>> triggered within one of the applications for monitoring. You may want
>> to record some iostat values too just to see if disk activity is
>> spiking during an event.
>>
>> I hope this helps!
>>
>> --Donald
>>
>> On Wed, Feb 15, 2012 at 12:57 PM, Dean, Mike <[email protected]> wrote:
>> > Unfortunately, I have not been able to get a snapshot from top when
>> > it is running slow. The slowdowns typically only last a few seconds
>> > and do not occur with any sort of regularity (that I've been able to
>> > determine, anyway).
>> >
>> > I started a "top -b" piped to a file to see if I can catch a snapshot
>> > during a slow period.
>> >
>> > As for the box itself (and apologies for not including this info
>> > originally), it is used as a network monitor/management station. Two
>> > of the applications that run on it are Nagios (up/down and other
>> > monitoring), which has various checks that are shell based, Perl and
>> > compiled, and Smokeping, which sends out TCP and ICMP probes every 5
>> > minutes to some hosts (less than 50) to record round trip times.
>> >
>> > It is also one of our syslog machines with a script that runs every 5
>> > minutes parsing the log files (none of the files have grown to a
>> > large size or increased in syslog input/output).
>> >
>> > And, none of these systems has been changed in the last week.
>> >
>> > So, other than top, are there other things to check or monitors to
>> > set?
>> >
>> > Thanks again, in advance!
>> >
>> > On Wed, Feb 15, 2012 at 11:02 AM, The Donald Cowart <[email protected]>
>> > wrote:
>> >>
>> >> Can you get the output from top during a slowdown or just after?
>> >> Also, is the box's function a webserver, fileserver, mathematical
>> >> processing, etc? Was the box rebooted after the patching?
>> >>
>> >> Something that may help: run "top -b" in a while loop (with a sleep
>> >> in between runs) and dump it to a file or series of files, so you've
>> >> got snapshots over time of the system performance to help
>> >> troubleshoot this.
>> >>
>> >> --Donald
>> >>
>> >> On Wed, Feb 15, 2012 at 10:49 AM, Dean, Mike <[email protected]>
>> >> wrote:
>> >> > Hello all, hoping you can help. We have a RedHat box that two days
>> >> > ago started having periods of slow performance. The slowdown is
>> >> > bad enough that you can see it when trying to type at a terminal,
>> >> > and some processes, such as SNMP, don't respond. Users have also
>> >> > been disconnected.
>> >> >
>> >> > The last change that was made was applying the normal monthly
>> >> > patches on 2/7 (the problem only started showing up yesterday).
>> >> > According to the information from 'top', the system seems to be
>> >> > fine. A typical snapshot looks like this:
>> >> >
>> >> > Tasks: 282 total,   8 running, 274 sleeping,   0 stopped,   0 zombie
>> >> > Cpu(s):  1.1%us,  0.3%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> >> > Mem:   3909268k total,  1669580k used,  2239688k free,   231832k buffers
>> >> > Swap:  6094840k total,        0k used,  6094840k free,   972756k cached
>> >> >
>> >> > Any ideas on where to look?
>> >> >
>> >> > Thanks!
>> >> >
>> >> > Mike
>> >>
>> >>
>> >> --
>> >> Donald Cowart
>> >> http://www.rdex.net/
>> >
>> >
>>
>>
>> --
>> Donald Cowart
>> http://www.rdex.net/
>
>

--
Donald Cowart
http://www.rdex.net/

---------------------------------------------------------------------
Archive      http://marc.info/?l=jaxlug-list&r=1&w=2
RSS Feed     http://www.mail-archive.com/[email protected]/maillist.xml
Unsubscribe  [email protected]

