[AMaViS-user] Climbing queues issue

Matt Juszczak Fri, 09 Dec 2005 12:30:13 -0800

Hi all,

I've been having a problem recently. We have three relay servers (relay1, relay2, and relay3) that are round-robin MX for the most part. We have a Cisco LocalDirector hooked up to them and some domains use it in DNS.

Anyway, the servers run fine for the most part, with 20-30 messages queued on each. But on random days on random servers (sometimes it's relay3, sometimes it's relay1), the queues get gigantic ... 50,000+.

I keep thinking these are spam attacks of some sort, and since we have IDE hard drives, once the system starts writing to the queues, it can't read back fast enough and gets bogged down.

Relay3 had a nice queue load today. It only got up to 5,000 messages before we realized the problem. A reboot will ALWAYS fix this problem. In other words, if a server has 10,000 messages in the queue and I reboot it, the queue starts flushing the second the machine comes back up... usually about 1,000 messages every 2 minutes (so a queue of 5,000 clears out in about 10 minutes).

It's just odd that we have to reboot the box in order for this problem to be solved. I have a graph of what is going on and I can hand out the URL if that will help anyone trying to guess the problem. Maybe my IDE drive idea isn't the best idea in the world.

For the record, I just did a top and got this on relay3. If you notice, the CPU is 0% idle (even though it's a 3.06 GHz machine). There are three vscan processes which seem to be using a LOT of CPU time... maybe this is what is occurring, and it gets bad and eventually causes the queue to rise? Anyway, any ideas would be appreciated!


-Matt

----snip----
last pid: 31870;  load averages:  4.35,  4.26,  4.47
93 processes:  5 running, 88 sleeping

CPU states: 93.8% user, 0.0% nice, 5.4% system, 0.8% interrupt, 0.0% idle

Mem: 266M Active, 381M Inact, 173M Wired, 976K Cache, 110M Buf, 168M Free
Swap: 2007M Total, 2007M Free

 PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
17979 vscan    129    0 45336K 41692K RUN     49:13 20.17% 20.17% perl5.8.6
4903 vscan    129    0 45724K 42080K RUN     84:35 19.97% 19.97% perl5.8.6
26248 vscan    129    0 44500K 40884K RUN     12:44 19.68% 19.68% perl5.8.6
31822 vscan     20    0 45352K 42088K lockf    0:01  5.55%  3.61% perl5.8.6
31656 vscan      4    0 46000K 42740K select   0:03  2.50%  2.49% perl5.8.6
31604 vscan     20    0 47076K 43808K lockf    0:03  2.30%  2.29% perl5.8.6
31606 vscan      4    0 47516K 43808K accept   0:04  1.86%  1.86% perl5.8.6
31690 vscan     20    0 46140K 42840K lockf    0:02  1.52%  1.51% perl5.8.6
31670 vscan     20    0 47624K 44088K lockf    0:03  1.47%  1.46% perl5.8.6
31616 vscan     20    0 46516K 43256K lockf    0:03  1.47%  1.46% perl5.8.6
31773 vscan     20    0 45184K 41920K lockf    0:01  1.51%  1.42% perl5.8.6
31601 vscan    105    0 46572K 43292K RUN      0:03  1.07%  1.07% perl5.8.6
31703 vscan     20    0 46008K 42756K lockf    0:02  0.49%  0.49% perl5.8.6
 432 clamav    20    0 13348K 12692K kserel   3:35  0.00%  0.00% clamd
----snip----
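
As a rough way to quantify the snapshot above, the sketch below sums the CPU column per process state for the vscan user. The names `TOP_ROWS` and `cpu_by_state` are made up for illustration, and the column positions are an assumption based on the header line shown above; the rows themselves are pasted from the top output.

```python
# Process rows copied from the top output above.
TOP_ROWS = """\
17979 vscan    129    0 45336K 41692K RUN     49:13 20.17% 20.17% perl5.8.6
4903 vscan    129    0 45724K 42080K RUN     84:35 19.97% 19.97% perl5.8.6
26248 vscan    129    0 44500K 40884K RUN     12:44 19.68% 19.68% perl5.8.6
31822 vscan     20    0 45352K 42088K lockf    0:01  5.55%  3.61% perl5.8.6
31656 vscan      4    0 46000K 42740K select   0:03  2.50%  2.49% perl5.8.6
31604 vscan     20    0 47076K 43808K lockf    0:03  2.30%  2.29% perl5.8.6
31606 vscan      4    0 47516K 43808K accept   0:04  1.86%  1.86% perl5.8.6
31690 vscan     20    0 46140K 42840K lockf    0:02  1.52%  1.51% perl5.8.6
31670 vscan     20    0 47624K 44088K lockf    0:03  1.47%  1.46% perl5.8.6
31616 vscan     20    0 46516K 43256K lockf    0:03  1.47%  1.46% perl5.8.6
31773 vscan     20    0 45184K 41920K lockf    0:01  1.51%  1.42% perl5.8.6
31601 vscan    105    0 46572K 43292K RUN      0:03  1.07%  1.07% perl5.8.6
31703 vscan     20    0 46008K 42756K lockf    0:02  0.49%  0.49% perl5.8.6
432 clamav    20    0 13348K 12692K kserel   3:35  0.00%  0.00% clamd
"""


def cpu_by_state(rows, user="vscan"):
    """Sum the CPU column per STATE for one user's processes."""
    totals = {}
    for line in rows.splitlines():
        fields = line.split()
        # Expected columns: PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
        if len(fields) != 11 or fields[1] != user:
            continue
        state = fields[6]
        cpu = float(fields[9].rstrip("%"))
        totals[state] = totals.get(state, 0.0) + cpu
    return totals


if __name__ == "__main__":
    for state, cpu in sorted(cpu_by_state(TOP_ROWS).items()):
        print(f"{state:8s} {cpu:6.2f}%")
```

Grouping by state separates the processes that are actually burning CPU (RUN) from the ones waiting on a lock or a socket (lockf, select, accept), which is the distinction the question above turns on.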




