There are two different servers....

I was online because I had noticed that our nagios had complained that our 
loghost was full.  After fixing that, I looked to see why loghost had filled up 
and found that it was getting log messages like:

Jun 22 13:35:32 <system> tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

It was growing that log file at ~1164 lines per second (I took a 10-second 
chunk and there were 11643 lines, for 1,315,659 bytes).

At that rate it would take only about another 8 hours to fill loghost again.
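
For anyone who wants to redo that arithmetic, here is a rough sketch in Python. The sample figures are the ones quoted above; the free-space number is not something I measured here, it is just backed out from the 8-hour estimate (it works out to roughly 3.8 GB), and the function names are made up for illustration.

```python
# Figures from the 10-second sample of the log file quoted above.
SAMPLE_SECONDS = 10
SAMPLE_LINES = 11643
SAMPLE_BYTES = 1_315_659


def growth_rates(lines, nbytes, seconds):
    """Return (lines per second, bytes per second) for a sample."""
    return lines / seconds, nbytes / seconds


def hours_to_fill(free_bytes, bytes_per_second):
    """Hours until free_bytes is consumed at a constant growth rate."""
    return free_bytes / bytes_per_second / 3600


lines_per_s, bytes_per_s = growth_rates(SAMPLE_LINES, SAMPLE_BYTES, SAMPLE_SECONDS)

# Assumption: the free space on loghost is not stated in the message,
# so this value is derived from the 8-hour estimate, not measured.
implied_free_bytes = bytes_per_s * 8 * 3600

print(f"{lines_per_s:.0f} lines/s, {bytes_per_s / 1024:.0f} KiB/s")
print(f"implied free space: {implied_free_bytes / 1e9:.2f} GB")
print(f"hours to fill: {hours_to_fill(implied_free_bytes, bytes_per_s):.1f}")
```

Plugging in the real free space from df in place of implied_free_bytes gives the actual time left.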

There has been recent talk of upgrading loghost...but I'm not sure how many 
years it'll be before that actually happens.  The project to move cacti from 
the test instance on our old RHEL3 build box started in 2007, and it still 
hasn't moved.  (It's been assigned to several new hires along the way...and 
I've tried a few times to just make it move on my own...at one time I had my 
own cacti, to do the things that our ancient version couldn't.)

For me, my first project was to move nagios, which I did (from a former SA's 
workstation, where it was so bogged down that it was constantly reporting 
things down that weren't).  It's in need of a refresh, though, and there have 
been a number of failed attempts.  Now I'm working on moving it again.  Check 
latencies approaching 20 minutes have caused some issues.


----- Original Message -----
> Thanks for the detailed writeup-- I'd like to see more posts like
> this.
> 
> One question I had was that you were talking about full disks, but
> then started talking about being unable to call fork() due to RAM
> constraints. That threw me for a bit of a loop-- how did a full disk
> completely exhaust your RAM?
> 
> --Corey
> 
> On Jun 23, 2013, at 11:11 AM, "Lawrence K. Chen, P.Eng."
> <[email protected]> wrote:
> 
> > Yesterday there turned out to be two unresponsive servers at work.
> >  I wasn't on call, so I didn't immediately know about the first
> > one.  But, nagios had complained about the second one.
> > 

-- 
Who: Lawrence K. Chen, P.Eng. - W0LKC - Senior Unix Systems Administrator
For: Enterprise Server Technologies (EST) -- & SafeZone Ally
Snail: Computing and Telecommunications Services (CTS)
Kansas State University, 109 East Stadium, Manhattan, KS 66506-3102
Phone: (785) 532-4916 - Fax: (785) 532-3515 - Email: [email protected]
Web: http://www-personal.ksu.edu/~lkchen - Where: 11 Hale Library
