Re: SA Timed out..
Hi,

> The reported stackdump just shows a routine that was executing at
> the time when timer expired. It often shows the problem section
> (like a database access), but may just show an innocent bystander
> code which happened to be normally executing at the time.

Great, thanks. Could it also come from a poorly written rule that's consuming CPU cycles or is deadlocked in some way?

Thanks,
Alex
Re: SA Timed out..
Alex,

> I have a server that's frequently pretty busy during the day, and just
> started to notice these messages periodically:
>
> Jun 8 13:35:39 mail01 amavis[28784]: (28784-272) SA TIMED OUT,
> backtrace: at /usr/lib/perl5/5.6.0/i386-linux/IO/Select.pm line
> 104\n\tIO::Select::can_read('IO::Select=ARRAY(0xe3ea068)', 10) called
> at /usr/lib/perl5/site_perl/5.6.0/i386-linux/Net/DNS/Resolver/Base.pm
> line 668\n\tNet::DNS::Resolver::Base::send_udp('Net::DNS::Resolver=HASH(0xe392ba4)',
> 'Net::DNS::Packet=HASH(0xe39f5d0)',
> '(m-v...@^@^...@^@^...@^@^...@^@^fpro164^itenaccave^cc...@^@^...@^a') [...]
>
> Is this a result of the server being too busy to process the request,
> or one of the services that it uses, such as a DNSBL, not responding
> in time? How can I troubleshoot this?

The reported stack dump just shows the routine that was executing at the time the timer expired. It often shows the problem section (like a database access), but it may just show innocent bystander code that happened to be executing normally at the time.

If it happens often, check which code sections are reported often. If it usually shows the same or a closely related routine, then the problem may indeed be there or close by. If the reported routines appear random, then your host is just generally overloaded.

Slow or dead DNSBL and URIBL servers/blacklists are normally not a problem, as the time limiting on DNS replies works very well in SA 3.3.* and 3.2.5. Regardless, a locally running caching DNS server is almost a must with any substantial amount of mail traffic.

Mark
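Mark's advice to check which code sections are reported most often can be automated. The following is a minimal Python sketch, not part of amavisd-new or SpamAssassin: the helper name `tally_timeout_frames`, the regular expression, and the abbreviated sample entries are all assumptions made for illustration. It only looks at the innermost frame, the file named right after "backtrace: at".

```python
import re
from collections import Counter

# Matches the innermost frame, i.e. the "at <file> line <n>" that directly
# follows "backtrace:" in an amavis "SA TIMED OUT" log entry.
FRAME_RE = re.compile(r"SA TIMED OUT, backtrace:\s+at\s+(\S+)\s+line\s+(\d+)")


def tally_timeout_frames(log_lines):
    """Count timeouts per source file reported as the innermost frame."""
    counts = Counter()
    for line in log_lines:
        match = FRAME_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Abbreviated, hypothetical sample; in practice pass an open mail log file.
sample_log = [
    r"Jun 8 13:35:39 mail01 amavis[28784]: (28784-272) SA TIMED OUT, backtrace: at /usr/lib/perl5/5.6.0/i386-linux/IO/Select.pm line 104\n\tIO::Select::can_read(...)",
    r"Dec 5 15:32:58 server amavis[23505]: (23505-01-24) SA TIMED OUT, backtrace: at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm line 71\n\teval {...}",
    r"Dec 5 15:40:02 server amavis[23510]: (23510-03) SA TIMED OUT, backtrace: at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm line 71\n\teval {...}",
    "Dec 5 15:41:10 server postfix/smtpd[1234]: connect from unknown",
]

for source_file, hits in tally_timeout_frames(sample_log).most_common():
    print(hits, source_file)
```

If one file dominates the tally, that routine is worth a closer look; an even spread across unrelated files fits the "generally overloaded host" explanation.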
Re: SA timed out
On Thu, 2007-11-01 at 16:28 -0400, Daryl C. W. O'Shea wrote:
> Robert Fitzpatrick wrote:
> > I have the following error message in the logs, didn't even notice until
> > tracking down an email for a user today, but been happening in all my
> > logs back the last week. All three servers running mail filtering to
> > pgsql db have this error including the server which hosts the db. I find
> > no problems with filtering and BAYES scoring seems to be working and is
> > tagging messages fine. So, I assume this means the learning part is not
> > working? However, looking at bayes_var in the db, I see token, spam and
> > ham counts all updating from AWL I assume. Can someone offer feedback to
> > help determine what exactly is the issue at hand? Thanks in advance.
>
> I don't have the time to compare the backtrace to the actual code, so
> I'll guess instead. Disable bayes_auto_expire and see if the errors go
> away. It's probably bayes expiries taking longer than the amavis
> timeout limit.

Thanks for the response. I did not have the setting defined in local.cf, so I added 'bayes_auto_expire 0' and restarted amavisd, but it is still happening. I am using Postfix + Maia Mailguard, which is an amavisd-new 2.2 product.

--
Robert
Re: SA timed out
Robert Fitzpatrick wrote:
> I have the following error message in the logs, didn't even notice until
> tracking down an email for a user today, but been happening in all my
> logs back the last week. All three servers running mail filtering to
> pgsql db have this error including the server which hosts the db. I find
> no problems with filtering and BAYES scoring seems to be working and is
> tagging messages fine. So, I assume this means the learning part is not
> working? However, looking at bayes_var in the db, I see token, spam and
> ham counts all updating from AWL I assume. Can someone offer feedback to
> help determine what exactly is the issue at hand? Thanks in advance.

I don't have the time to compare the backtrace to the actual code, so I'll guess instead. Disable bayes_auto_expire and see if the errors go away. It's probably bayes expiries taking longer than the amavis timeout limit.

Daryl
Re: SA TIMED OUT
On Tue, Dec 05, 2006 at 06:42:01PM +0100, Stefan Jakobs wrote:
> Here an other hint:
> Every day I execute the following command and force an expire of the Bayes DB:
> /usr/bin/sa-learn --dbpath /var/amavis/.spamassassin
> -p /var/amavis/.spamassassin/user_prefs -u vscan --force-expire
>
> In local.cf I have the following entries:
> bayes_auto_expire 1
>
> Can this be the reason for the Time out?

If you run an expire daily via cron, I would disable the auto-expiry. Expire runs can definitely cause timeouts since they take a while to run (SQL is faster than DBM fwiw), but your messages indicated the problem was locking as opposed to expiry.
Re: SA TIMED OUT
On Tuesday, 5 December 2006 18:16, Theo Van Dinter wrote:
> On Tue, Dec 05, 2006 at 06:11:56PM +0100, Stefan Jakobs wrote:
> > > > 71\n\tMail::SpamAssassin::Locker::jittery_one_second_sleep('Mail::SpamAssassin::Locker::UnixNFSSafe=HASH(0x9747010)')
> > >
> > > Are you using NFS? If not, switch to flock.
> >
> > No, I don't use NFS. What do you mean with "switch to flock"?
>
> This doesn't necessarily solve your problem, but you should switch the SA
> lock method to flock ala:
>
> lock_method flock

Good, I will try this.

> it's better, but only works on non-network FS.

Here is another hint:
Every day I execute the following command to force an expire of the Bayes DB:

/usr/bin/sa-learn --dbpath /var/amavis/.spamassassin -p /var/amavis/.spamassassin/user_prefs -u vscan --force-expire

In local.cf I have the following entries:

bayes_auto_expire 1
bayes_expiry_max_db_size 300
bayes_journal_max_size 102400

Can this be the reason for the timeout?

Thanks,
Stefan
Re: SA TIMED OUT
On Tue, Dec 05, 2006 at 06:11:56PM +0100, Stefan Jakobs wrote:
> > > 71\n\tMail::SpamAssassin::Locker::jittery_one_second_sleep('Mail::SpamAssassin::Locker::UnixNFSSafe=HASH(0x9747010)')
> >
> > Are you using NFS? If not, switch to flock.
>
> No, I don't use NFS. What do you mean with "switch to flock"?

This doesn't necessarily solve your problem, but you should switch the SA lock method to flock ala:

lock_method flock

it's better, but only works on non-network FS.
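To see which of the settings discussed in this thread are actually in effect in a given local.cf, something like the following Python sketch may help. It assumes the plain "option value" line format that SpamAssassin configuration uses; the helper name `read_sa_options` and the sample text are made up for illustration, and the sketch deliberately ignores include directives and conditional blocks.

```python
def read_sa_options(config_text, names=("lock_method", "bayes_auto_expire")):
    """Return the last value set for each option of interest.

    Simplification: anything after '#' is treated as a comment, and
    'include' / 'ifplugin' handling is not attempted.
    """
    found = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)  # option name, then the rest of the line
        if parts[0] in names:
            found[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return found


# Hypothetical local.cf content.
sample_local_cf = """
# local.cf
lock_method flock        # only on a non-network filesystem
bayes_auto_expire 0
"""

print(read_sa_options(sample_local_cf))
```

A missing `lock_method` key in the result means the option is not set in that file, so whatever default the installed SpamAssassin version uses applies.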
Re: SA TIMED OUT
On Tuesday, 5 December 2006 16:12, Theo Van Dinter wrote:
> On Tue, Dec 05, 2006 at 04:06:17PM +0100, Stefan Jakobs wrote:
> > Dec 5 15:32:58 server amavis[23505]: (23505-01-24) SA TIMED OUT,
> > backtrace: at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm
> > line 71\n\teval {...} called at
> > /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm
>
> That's not very nice.
>
> > 71\n\tMail::SpamAssassin::Locker::jittery_one_second_sleep('Mail::SpamAssassin::Locker::UnixNFSSafe=HASH(0x9747010)')
>
> Are you using NFS? If not, switch to flock.

No, I don't use NFS. What do you mean by "switch to flock"?

> > Does anybody know where the problem is?
>
> Amavis decides that it's tired of waiting for SA, which is waiting to write
> to the Bayes DB.

Bye
Stefan
Re: SA TIMED OUT
On Tue, Dec 05, 2006 at 04:06:17PM +0100, Stefan Jakobs wrote:
> Dec 5 15:32:58 server amavis[23505]: (23505-01-24) SA TIMED OUT, backtrace:
> at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm line 71\n\teval {...}
> called at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locker.pm

That's not very nice.

> 71\n\tMail::SpamAssassin::Locker::jittery_one_second_sleep('Mail::SpamAssassin::Locker::UnixNFSSafe=HASH(0x9747010)')

Are you using NFS? If not, switch to flock.

> Does anybody know where the problem is?

Amavis decides that it's tired of waiting for SA, which is waiting to write to the Bayes DB.
Re: SA TIMED OUT message debian sarge (new error)
Simon,

> Looks like I've solved one issue, and another crops up!... I think that
> I may need to move to a mysql storage engine here? approx 17,000
> messages a day incoming on this server.
> Any pointers here? - Thanks!!
>
> Nov 4 11:39:40 mx1 amavis[32148]: (32148-07) SA TIMED OUT, backtrace:
> at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171
> ... /usr/share/perl5/Mail/SpamAssassin/AutoWhitelist.pm line 134
> ... /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 355

Move AWL to SQL, if you haven't already. Starting from scratch with an empty AWL database is not too bad; your existing AWL is probably not worth salvaging.

Mark
RE: SA TIMED OUT message debian sarge (new error)
> Hi There,
>
> Looks like I've solved one issue, and another crops up!... I think that
> I may need to move to a mysql storage engine here? approx 17,000
> messages a day incoming on this server.
> Any pointers here? - Thanks!!
>
> Nov 4 11:39:40 mx1 amavis[32148]: (32148-07) SA TIMED OUT, backtrace: at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\tMail::SpamAssassin::DBBasedAddrList::remove_entry('Mail::SpamAssassin::DBBasedAddrList=HASH(0xa881df0)', 'HASH(0xa6bc474)') called at /usr/share/perl5/Mail/SpamAssassin/AutoWhitelist.pm line 134\n\tMail::SpamAssassin::AutoWhitelist::check_address('Mail::SpamAssassin::AutoWhitelist=HASH(0xa87eba8)', '[EMAIL PROTECTED] adv.com', 82.227.79.148) called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 355\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 351\n\tMail::SpamAssassin::Plugin::AWL::check_from_in_auto_whitelist('Mail::SpamAssassin::Plugin::AWL=HASH(0xa09da08)', 'Mail::SpamAssassin::PerMsgStatus=HASH(0xa67060c)') called at (eval 280) line 7\n\tMail::SpamAssassin::PerMsgStatus::check_f...

This could simply be what spamassassin was doing at the point you ran out of time. One possible reason for timeouts is that sa-learn is running an expiry, possibly while learning a message at the same time. The Debian package of amavisd-new has a cron entry that runs --force-expire once a day (/etc/cron.daily/amavisd-new). You can disable opportunistic expiry by setting:

bayes_auto_expire 0

in local.cf, but MAKE SURE the script works or Bayes will grow forever. Simply run it. If it takes a minute to run, it's very likely working. The script may also be outdated. The important part should read something like:

su - amavis -c '/usr/bin/sa-learn --sync --force-expire >/dev/null'

Moving to MySQL helps considerably:
http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html

Gary V
Re: SA TIMED OUT message debian sarge
On 11/3/06, Mark Martinec <[EMAIL PROTECTED]> wrote:
> On Friday November 3 2006 05:23, Matt Kettler wrote:
> > I believe the option is $sa_timeout
> > Not sure what the default is, probably 30. Which should be enough to
> > prevent that problem, unless you have a LOT of sa instances contending
> > for the AWL database.
> > Try adding a $sa_timeout = 60 to your amavisd.conf and lock_method
> > flock to your spamassassin/local.cf (if you don't use NFS for DB storage.)

Thanks for all the replies on this topic. With a combination of the answers, I *seem* to have it sorted, and I also picked up a couple of good hints for increasing speed etc.

Thanks again.
Re: SA TIMED OUT message debian sarge
On Friday November 3 2006 05:23, Matt Kettler wrote:
> I believe the option is $sa_timeout
> Not sure what the default is, probably 30. Which should be enough to
> prevent that problem, unless you have a LOT of sa instances contending
> for the AWL database.
> Try adding a $sa_timeout = 60 to your amavisd.conf and lock_method
> flock to your spamassassin/local.cf (if you don't use NFS for DB storage.)

A note for the archive: $sa_timeout is relevant primarily for versions older than 2.4.0. The time allowed for SA is now controlled primarily through $child_timeout, which defaults to 8 minutes; 2/3 of that is a little over 5 minutes.

The 2.4.0 release notes say:

- added ability to kill externally running decoder process or a command-line virus scanner process if running for too long; ...; allowed time is calculated as 2/3 of the remaining time (initially at $child_timeout), but at least 10 seconds;

- use the same timeout calculation as above for calls to SA, taking $sa_timeout instead if that value is bigger than the calculated time, thus making $sa_timeout pretty much redundant;

Mark
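The arithmetic in the release notes quoted above can be written out as a small Python sketch. This is only a reading of the quoted text, not amavisd-new's actual code: the function name `sa_allowed_time` is hypothetical, and the order in which the 10-second floor and $sa_timeout are applied is an assumption.

```python
def sa_allowed_time(remaining_seconds, sa_timeout=0):
    """Time budget for an SA call as described in the 2.4.0 release notes.

    2/3 of the remaining time, but at least 10 seconds; $sa_timeout is
    used instead if it is bigger than the calculated value.
    """
    calculated = max(remaining_seconds * 2 / 3, 10)
    return max(calculated, sa_timeout)


child_timeout = 8 * 60  # the 8-minute default mentioned above, in seconds

print(sa_allowed_time(child_timeout))      # 320.0 (5 min 20 s)
print(sa_allowed_time(child_timeout, 60))  # 320.0, $sa_timeout = 60 changes nothing
print(sa_allowed_time(30, 60))             # 60, $sa_timeout wins when little time is left
print(sa_allowed_time(12))                 # 10, the floor applies
```

With the default $child_timeout, a $sa_timeout of 30 or 60 never exceeds the calculated 320 seconds at the start of processing, which is why the option is described as pretty much redundant.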
Re: SA TIMED OUT message debian sarge
Simon wrote:
> On 11/3/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
>> Simon wrote:
>> > Hi There,
>> >
>> > Using spamassassin 3.1.3-0bpo1 from backports.org on debian sarge. We
>> > did have the standard 3.0.x sarge package. Using amavis-new to call
>> > spamassassin and after upgrading spamassassin we are now getting these
>> > messages in the mail.log. Would someone please be able to assist in
>> > where to go with this one?
>>
>> Looks like for some reason a SA instance couldn't get a lock on the AWL
>> database to update it before amavis killed it.
>>
>> Provided you don't have your bayes or AWL stored on an NFS share, you
>> might consider switching to lock_method flock. That will speed up
>> lock/release operations.
>>
>> However, it's very strange that it timed out locking the AWL.. Normally
>> SA processes aren't in the AWL very long.
>>
>> Is your amavis set with an abnormally short timeout for SA?
>
> Hmm.. Where do find this setting in my amavis conf file? These are the
> current settings:

I believe the option is $sa_timeout.

I'm not sure what the default is, probably 30. That should be enough to prevent the problem, unless you have a LOT of SA instances contending for the AWL database.

Try adding $sa_timeout = 60; to your amavisd.conf and lock_method flock to your spamassassin/local.cf (if you don't use NFS for DB storage).
Re: SA TIMED OUT message debian sarge
>> Is your amavis set with an abnormally short timeout for SA?
>
> Hmm.. Where do I find this setting in my amavis conf file?

The default is 30 seconds (at least in older versions of amavisd-new). You can add:

$sa_timeout = 50;

As Matt says, 'lock_method flock' will also help. Are you using Pyzor? If so, changing to the mirror will also help:

echo "82.94.255.100:24441" > /var/lib/amavis/.pyzor/servers

Gary V
Re: SA TIMED OUT message debian sarge
On 11/3/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
> Simon wrote:
> > Hi There,
> >
> > Using spamassassin 3.1.3-0bpo1 from backports.org on debian sarge. We
> > did have the standard 3.0.x sarge package. Using amavis-new to call
> > spamassassin and after upgrading spamassassin we are now getting these
> > messages in the mail.log. Would someone please be able to assist in
> > where to go with this one?
>
> Looks like for some reason a SA instance couldn't get a lock on the AWL
> database to update it before amavis killed it.
>
> Provided you don't have your bayes or AWL stored on an NFS share, you
> might consider switching to lock_method flock. That will speed up
> lock/release operations.
>
> However, it's very strange that it timed out locking the AWL.. Normally
> SA processes aren't in the AWL very long.
>
> Is your amavis set with an abnormally short timeout for SA?

Hmm.. Where do I find this setting in my amavis conf file? These are the current settings:

$sa_tag_level_deflt = 0.0; # add spam info headers if at, or above that level
$sa_tag2_level_deflt = 4.0; # add 'spam detected' headers at that level
$sa_kill_level_deflt = 5.0; # triggers spam evasive actions
$sa_dsn_cutoff_level = 999; # spam level beyond which a DSN is not sent
$sa_mail_body_size_limit = 200*1024; # don't waste time on SA if mail is larger
$sa_local_tests_only = 0; # only tests which do not require internet access?
$sa_auto_whitelist = 1; # turn on AWL in SA 2.63 or older (irrelevant
                        # for SA 3.0, cf option is 'use_auto_whitelist')

Thanks!
Re: SA TIMED OUT message debian sarge
Simon wrote:
> Hi There,
>
> Using spamassassin 3.1.3-0bpo1 from backports.org on debian sarge. We
> did have the standard 3.0.x sarge package. Using amavis-new to call
> spamassassin and after upgrading spamassassin we are now getting these
> messages in the mail.log. Would someone please be able to assist in
> where to go with this one?

Looks like for some reason a SA instance couldn't get a lock on the AWL database to update it before amavis killed it.

Provided you don't have your bayes or AWL stored on an NFS share, you might consider switching to lock_method flock. That will speed up lock/release operations.

However, it's very strange that it timed out locking the AWL. Normally SA processes aren't in the AWL very long.

Is your amavis set with an abnormally short timeout for SA?
Re: SA TIMED OUT
Gary V wrote:
> > spamassassin -D --lint is giving me an error:
> > [2533] warn: config: failed to parse line, skipping: dcc_timeout 18
>
> BTW, as Matt says, your DNS may be slow. If DCC doesn't respond within 10
> seconds, I would imagine it's unlikely it will respond - so I wouldn't
> waste time waiting around another 8 seconds. Many people find a local
> caching DNS server really helps on net tests.
>
> Gary V

Yes, I had been using a caching NS before rebuilding the machine yesterday. I simply forgot to turn it on this time. Duh.

Thanks,
Mike
Re: SA TIMED OUT
Gary V wrote:
> > I upgraded to SA 3.1.4 last night and now I have two issues that I'm
> > trying to resolve:
> >
> > (1)
> > spamassassin -D --lint is giving me an error:
> > [2533] warn: config: failed to parse line, skipping: dcc_timeout 18
>
> You need to enable (uncomment) the DCC plugin in v310.pre

Done and the error is gone now.

> > (2)
> > In the logs I'm seeing a good number of the following type of entry:
> > Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
> > backtrace: at <...>
> >
> > I've checked the archives and maybe I missed something, but I wasn't
> > able to find anything that seemed relevant.
> >
> > Thanks for any pointers.
> > Mike
>
> The newer version takes longer to scan (quite noticeable on a low
> powered system). Newer versions of amavisd-new allow scans to take
> longer without timing out, whereas older versions have a default of
> $sa_timeout = 30;
> which should be included in amavisd.conf and raised to something like
> 60 seconds. I also suggest moving Bayes to SQL, and if not, then set
> lock_method flock
> in local.cf if appropriate.
> http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#miscellaneous_options

Thanks Gary for the explanation. I will check into all of these.

Thanks,
Mike
Re: SA TIMED OUT
Matt Kettler wrote:
> M. Lewis wrote:
> > I upgraded to SA 3.1.4 last night and now I have two issues that I'm
> > trying to resolve:
> >
> > (1)
> > spamassassin -D --lint is giving me an error:
> > [2533] warn: config: failed to parse line, skipping: dcc_timeout 18
>
> If you've not edited /etc/mail/spamassassin/v310.pre to load the dcc
> plugin, dcc is disabled by default (it's not free for everyone to use,
> so disabled pending your decision that your use falls under DCC's
> license.. most folks do, but check the license).
>
> Without any DCC support loaded, the dcc_timeout option is meaningless to SA.

This was indeed the problem. Error gone now.

> > (2)
> > In the logs I'm seeing a good number of the following type of entry:
> > Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
> > backtrace: at
> > /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line
> > 363\n\teval {...} called at
>
> Sounds like your DNS is slow, and you've got a short $sa_timeout in your
> amavis configs. But I'm no amavis expert.

Actually I rebuilt this machine last night and forgot to turn on the caching NS. That made a difference!

Thanks Matt!
RE: SA TIMED OUT
> spamassassin -D --lint is giving me an error:
> [2533] warn: config: failed to parse line, skipping: dcc_timeout 18

BTW, as Matt says, your DNS may be slow. If DCC doesn't respond within 10 seconds, I would imagine it's unlikely it will respond - so I wouldn't waste time waiting around another 8 seconds. Many people find a local caching DNS server really helps on net tests.

Gary V
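One rough way to check whether a caching name server is doing its job is to time the same lookup several times and compare the first attempt with the repeats. The Python sketch below uses the system resolver via `socket.getaddrinfo`, which is only a proxy: SpamAssassin does its own lookups through Net::DNS, and a local cache such as nscd can mask what the name server itself is doing. The function name `time_lookups` and its `resolve` parameter are assumptions made for illustration.

```python
import socket
import time


def time_lookups(hostname, repeats=3, resolve=socket.getaddrinfo):
    """Return the elapsed seconds for each of `repeats` lookups of `hostname`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            resolve(hostname, None)
        except OSError:
            pass  # a failed lookup still tells us how long we waited
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `time_lookups("example.org")` returns three timings. If the repeats are not clearly faster than the first lookup, or every lookup takes a large fraction of a second, the resolver configuration deserves a look before the SA timeout is raised.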
RE: SA TIMED OUT
> I upgraded to SA 3.1.4 last night and now I have two issues that I'm
> trying to resolve:
>
> (1)
> spamassassin -D --lint is giving me an error:
> [2533] warn: config: failed to parse line, skipping: dcc_timeout 18

You need to enable (uncomment) the DCC plugin in v310.pre

> (2)
> In the logs I'm seeing a good number of the following type of entry:
> Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
> backtrace: at <...>
>
> I've checked the archives and maybe I missed something, but I wasn't
> able to find anything that seemed relevant.
>
> Thanks for any pointers.
> Mike

The newer version takes longer to scan (quite noticeable on a low powered system). Newer versions of amavisd-new allow scans to take longer without timing out, whereas older versions have a default of

$sa_timeout = 30;

which should be included in amavisd.conf and raised to something like 60 seconds. I also suggest moving Bayes to SQL, and if not, then set

lock_method flock

in local.cf if appropriate.
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#miscellaneous_options
Re: SA TIMED OUT
M. Lewis wrote:
>
> I upgraded to SA 3.1.4 last night and now I have two issues that I'm
> trying to resolve:
>
> (1)
> spamassassin -D --lint is giving me an error:
> [2533] warn: config: failed to parse line, skipping: dcc_timeout 18

If you've not edited /etc/mail/spamassassin/v310.pre to load the dcc plugin, dcc is disabled by default (it's not free for everyone to use, so it is disabled pending your decision that your use falls under DCC's license. Most folks' use does, but check the license).

Without any DCC support loaded, the dcc_timeout option is meaningless to SA.

> (2)
> In the logs I'm seeing a good number of the following type of entry:
> Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
> backtrace: at
> /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line
> 363\n\teval {...} called at

Sounds like your DNS is slow, and you've got a short $sa_timeout in your amavis configs. But I'm no amavis expert.