Hi all,

The CGI issues seem to be better after bumping the ulimits the other day, but some problems still persist. Specifically, once in a while today I noticed "end of CGI header" errors upon loading certain wiki pages.

I looked into the issue more, and tried to correlate the increasing number of blocked processes in vmstat with info from top. What I found after watching for a while is that the number of blocked processes (I think waiting for disk IO) seems to increase when spamd and sa-learn are running. It looks like spamd and sa-learn may be able to "wedge" other processes, thus bumping up the number of processes blocked while waiting for IO.

1) Can we try, as an experiment, either lowering the priority of spamd and its associated processes or otherwise making them less aggressive? I have a hunch that this might help us out.

2) Can we check whether IO errors are occurring in dmesg? I can't see it because of my permissions, but I would like to make sure that nothing strange is going on that would throw off the rest of what I was seeing.

3) The second biggest consumer of resources after spamd seems to be users hitting IMAP mailboxes. Perhaps we could set the priority of imapd just below Apache's to reduce the amount of time that Apache has to wait for disk IO?

4) Finally, can we get "sar" (from the sysstat package) working on that system? It can be very useful for troubleshooting with historical data, and it seems like it's already installed anyway... we might just have to turn it on?

Just some tuning ideas that might help us out... even though we're going to a different setup soon, we should still try to keep fyodor as reliable as possible.

Also, this exercise (I think) has indicated that fyodor is more disk-bound than we thought, which makes me feel more strongly that we are making a good choice in setting up the new fileserver with RAID 10 instead of RAID 5. It also showed me that our system is much busier processing mail (both with spamd and IMAP) than serving web requests, which may be important to keep in mind when setting up our new servers.

Would love to hear other ideas if anyone feels inclined to look at these issues.
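
For anyone who wants to repeat the vmstat/top correlation described above, here is a minimal Python sketch. It is not what was actually run on fyodor; it assumes a procps-style `ps` that accepts `-eo stat=,comm=`, and the function names `blocked_by_command` and `snapshot` are made up for illustration. It counts processes in uninterruptible sleep (state "D", which is what vmstat's "b" column reports) grouped by command name.

```python
import subprocess
from collections import Counter


def blocked_by_command(ps_output: str) -> Counter:
    """Count processes in uninterruptible sleep (state D) per command name.

    Expects lines of the form "<stat> <command>", e.g. "D+   sa-learn".
    """
    counts = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stat, comm = parts
        # The first character of the stat field is the process state.
        if stat.startswith("D"):
            counts[comm.strip()] += 1
    return counts


def snapshot() -> Counter:
    """Take one sample from the live system (assumes a procps-style ps)."""
    out = subprocess.run(
        ["ps", "-eo", "stat=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return blocked_by_command(out)
```

Calling `snapshot()` every few seconds while spamd or sa-learn is active, and comparing the result against quiet periods, should show which commands are piling up in state D. A single sample proves little, so it is worth collecting many before drawing conclusions.
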
Justin

_______________________________________________
HCoop-SysAdmin mailing list
[email protected]
http://hcoop.net/cgi-bin/mailman/listinfo/hcoop-sysadmin
