Re: excessive scan time
On 22-Jan-2009, at 13:57, Brian J. Murrell wrote:
> Now users need to know how to edit SQL records, or I need to install a
> web interface for that. The ROI here for that is just not high enough.

Really? A web interface to edit user configuration options in an SQL database is trivial. I know it's trivial because *I* can do it.

-- 
"Whose motorcycle is this?" "It's a chopper, baby." "Whose chopper is this?" "It's Zed's." "Who's Zed?" "Zed's dead, baby. Zed's dead."
Re: excessive scan time
Brian J. Murrell wrote:
>> I'd also suggest using SQL for user preferences.
>
> The user interface (i.e. editing a file) for user preferences is a
> different story. Now users need to know how to edit SQL records, or I
> need to install a web interface for that.

Or you use a small script that reads the user's preferences from a file (when the file has been modified) and updates the SQL database.

Regards
/Jonas
-- 
Jonas Eckerman, FSDB & Fruktträdet
http://whatever.frukt.org/
http://www.fsdb.org/
http://www.frukt.org/
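A minimal sketch of such a sync script, in Python. Everything here is an assumption made for illustration: the `userpref` table with `username`, `preference` and `value` columns follows the layout commonly used for SpamAssassin SQL preferences but should be checked against your own schema, `sqlite3` stands in for whatever database you actually run, and `parse_prefs` and `sync_user` are hypothetical helper names.

```python
import os
import sqlite3

# Assumed schema; verify it against the one your SpamAssassin setup uses.
SCHEMA = """
CREATE TABLE IF NOT EXISTS userpref (
    username   TEXT NOT NULL,
    preference TEXT NOT NULL,
    value      TEXT NOT NULL
)
"""


def parse_prefs(text):
    """Return (preference, value) pairs from user_prefs-style text.

    Blank lines and lines starting with '#' are skipped.
    """
    prefs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        prefs.append((parts[0], parts[1] if len(parts) > 1 else ""))
    return prefs


def sync_user(conn, username, prefs_path, seen_mtimes):
    """Reload one user's preferences if the file changed since the last call.

    seen_mtimes is a dict (username -> mtime) kept by the caller.
    Returns True if the database was updated, False if nothing changed.
    """
    mtime = os.stat(prefs_path).st_mtime_ns
    if seen_mtimes.get(username) == mtime:
        return False
    with open(prefs_path, encoding="utf-8") as fh:
        prefs = parse_prefs(fh.read())
    with conn:  # delete and insert in a single transaction
        conn.execute("DELETE FROM userpref WHERE username = ?", (username,))
        conn.executemany(
            "INSERT INTO userpref (username, preference, value) VALUES (?, ?, ?)",
            [(username, name, value) for name, value in prefs],
        )
    seen_mtimes[username] = mtime
    return True
```

Run from cron or a small loop, this keeps the file as the user interface while spamd reads from SQL. The script still has to read the home directories, so it should run where an unreachable mount cannot hold up mail delivery.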
Re: excessive scan time
On Thu, 22 Jan 2009 12:37:09 +, Justin Mason wrote:
> you should definitely investigate ways to avoid doing NFS reads/writes
> of the bayes files -- that is extremely I/O intensive, and NFS deals
> with it very badly.

OK. Noted. Maybe I will push the bayes database into MySQL as previously suggested.

Thanx!

b.
Re: excessive scan time
On Thu, 22 Jan 2009 13:27:57 +0100, Jonas Eckerman wrote:
>
> If you're not already using a SQL database for bayes and AWL I'd
> suggest you do that.

Those two I might be willing to consider, however...

> I'd also suggest using SQL for user preferences.

The user interface (i.e. editing a file) for user preferences is a different story. Now users need to know how to edit SQL records, or I need to install a web interface for that. The ROI here for that is just not high enough.

> With bayes, AWL and user prefs in a SQL database that problem ought to
> be avoided. (Maybe there's more than those that should be moved from
> ~/.spamassassin though).

Yeah. I tend to doubt those are the real culprits. I think I have identified a backup process, on the same server that does the NFS and mail, as being quite expensive in both disk and memory, and it's probably what is contending with the spamd processes for resources.

b.
Re: excessive scan time
you should definitely investigate ways to avoid doing NFS reads/writes of the bayes files -- that is extremely I/O intensive, and NFS deals with it very badly.

--j.

On Thu, Jan 22, 2009 at 12:27, Jonas Eckerman wrote:
> Brian J. Murrell wrote:
>
>> One thing worth noting is that I have spamassassin using ~/.spamassassin
>> here and people's home dirs can be (i.e. NFS) mounted from remote machines
>> (i.e. their primary workstations), which do occasionally get shut down.
>
> If you're not already using a SQL database for bayes and AWL I'd suggest
> you do that.
>
> I'd also suggest using SQL for user preferences.
>
>> I wonder what happens in the MTA->SA->local delivery process chain when
>> ~/.spamassassin is unavailable, or worse, on a stale mount.
>
> With bayes, AWL and user prefs in a SQL database that problem ought to be
> avoided. (Maybe there's more than those that should be moved from
> ~/.spamassassin though).
>
> /Jonas
Re: excessive scan time
Brian J. Murrell wrote:
> One thing worth noting is that I have spamassassin using ~/.spamassassin
> here and people's home dirs can be (i.e. NFS) mounted from remote machines
> (i.e. their primary workstations), which do occasionally get shut down.

If you're not already using a SQL database for bayes and AWL I'd suggest you do that.

I'd also suggest using SQL for user preferences.

> I wonder what happens in the MTA->SA->local delivery process chain when
> ~/.spamassassin is unavailable, or worse, on a stale mount.

With bayes, AWL and user prefs in a SQL database that problem ought to be avoided. (Maybe there's more than those that should be moved from ~/.spamassassin though).

/Jonas
-- 
Jonas Eckerman, FSDB & Fruktträdet
http://whatever.frukt.org/
http://www.fsdb.org/
http://www.frukt.org/
Re: excessive scan time
On Mon, 19 Jan 2009 16:47:24 +0100, Matus UHLAR - fantomas wrote:
>
> When did you sa-update for the last time?

Ubuntu appears to install a cron.daily cron job which does this amongst other things.

> How many processes are you running
> in parallel?

I have a pretty low volume system but I did just up it from 5 to 8 yesterday.

> Aren't you running out of memory?

No.

>> a) determine why the scan time is so long, after the fact (i.e. I could
>>    try to run the same spam through a "spamassassin -D [-t]" but there
>>    is no guarantee that whatever took so long the first time through
>>    will again take so long)?
>
> try running spamassassin with -L option

How will -L (local tests only) help me determine which remote tests are taking so long?

>> b) reduce some timeouts of some particular tests so that the total test
>>    time does not exceed a reasonable threshold?
>
> razor,pyzor,dcc,spf,dkim,rbl have their timeouts (*_timeout), see their
> (or SpamAssassin) docs.

Indeed. "dpkg -L spamassassin | xargs grep _timeout" shows some very interesting results.

Now that I think about it, I wonder if I am barking up the wrong tree.

One thing worth noting is that I have spamassassin using ~/.spamassassin here and people's home dirs can be (i.e. NFS) mounted from remote machines (i.e. their primary workstations), which do occasionally get shut down.

I wonder what happens in the MTA->SA->local delivery process chain when ~/.spamassassin is unavailable, or worse, on a stale mount. Is there a reasonable timeout built in to trying to read from that dir?

Thots?

b.
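The thread does not say whether SpamAssassin itself applies a timeout to reads from that directory, so the following only shows how one could test a home directory from outside, as a rough Python sketch. The function name `dir_responds` and the 5 second default are made up for illustration. The listing runs in a child process because a read on a dead NFS mount can block in the kernel and would otherwise hang the caller too.

```python
import subprocess
import sys


def dir_responds(path, timeout=5.0):
    """Return True if listing `path` completes within `timeout` seconds.

    Returns False if the listing fails (missing directory, permission
    error) or does not finish in time (for example a hung NFS mount).
    """
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.listdir(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return probe.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a child stuck in uninterruptible I/O may not
        # die right away, so we deliberately do not wait for it here.
        probe.kill()
        return False
```

A delivery wrapper could call something like `dir_responds(os.path.expanduser("~brian/.spamassassin"))` and skip per-user state when it returns False, instead of letting the scan sit until the MTA gives up.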
Re: excessive scan time
On 19.01.09 15:33, Brian J. Murrell wrote:
> I'm running 3.2.4(-1ubuntu1) of spamassassin here and have been noticing
> some excessive scan times. i.e.:
>
> Jan 18 19:07:28 linux spamd[30216]: spamd: result: Y 14 -
> AWL,BAYES_99,DCC_CHECK,DIGEST_MULTIPLE,HTML_IMAGE_ONLY_20,HTML_IMAGE_RATIO_06,HTML_MESSAGE,HTML_SHORT_LINK_IMG_3,MIME_HTML_ONLY,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RDNS_NONE,TVD_APPROVED,URIBL_BLACK
> scantime=604.3,size=3325,user=brian,uid=1001,required_score=5.5,rhost=localhost,raddr=127.0.0.1,rport=49135,mid=<20090118234025.2fa951cc7...@66v.uwp30.udelmarva.com>,bayes=1.00,autolearn=spam
>
> The result of this (604 second) scan time is that the MTA ends up giving
> up waiting after 600 seconds and the scan result is essentially wasted.
> No doubt some kind of "remote" test is taking an excessive amount of time.

When did you sa-update for the last time? How many processes are you running in parallel? Aren't you running out of memory?

> a) determine why the scan time is so long, after the fact (i.e. I could
>    try to run the same spam through a "spamassassin -D [-t]" but there is
>    no guarantee that whatever took so long the first time through will
>    again take so long)?

try running spamassassin with -L option

> b) reduce some timeouts of some particular tests so that the total test
>    time does not exceed a reasonable threshold?

razor,pyzor,dcc,spf,dkim,rbl have their timeouts (*_timeout), see their (or SpamAssassin) docs.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Microsoft dick is soft to do no harm
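To see which *_timeout settings are actually set on a given box, something along these lines might help. It is only a sketch in Python: `find_timeouts` is a hypothetical helper, it only recognises plain "name value" config lines (not Perl module source), and the file paths you pass in depend on your installation.

```python
import re
from pathlib import Path

# A config line of the form "<something>_timeout <value>", not commented out.
TIMEOUT_RE = re.compile(r"^[ \t]*([A-Za-z0-9_]*_timeout)[ \t]+(\S+)", re.MULTILINE)


def find_timeouts(paths):
    """Return (file, option, value) for every *_timeout line in the given files.

    Paths that do not point to a regular file are skipped.
    """
    found = []
    for path in paths:
        p = Path(path)
        if not p.is_file():
            continue
        text = p.read_text(encoding="utf-8", errors="replace")
        for name, value in TIMEOUT_RE.findall(text):
            found.append((str(p), name, value))
    return found
```

For example, `find_timeouts(["/etc/spamassassin/local.cf"])` (that path is an assumption about a Debian-style layout) lists what has been overridden locally. Options that are not set fall back to the defaults given in the SpamAssassin and plugin docs.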
excessive scan time
I'm running 3.2.4(-1ubuntu1) of spamassassin here and have been noticing some excessive scan times. i.e.:

Jan 18 19:07:28 linux spamd[30216]: spamd: result: Y 14 -
AWL,BAYES_99,DCC_CHECK,DIGEST_MULTIPLE,HTML_IMAGE_ONLY_20,HTML_IMAGE_RATIO_06,HTML_MESSAGE,HTML_SHORT_LINK_IMG_3,MIME_HTML_ONLY,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RDNS_NONE,TVD_APPROVED,URIBL_BLACK
scantime=604.3,size=3325,user=brian,uid=1001,required_score=5.5,rhost=localhost,raddr=127.0.0.1,rport=49135,mid=<20090118234025.2fa951cc7...@66v.uwp30.udelmarva.com>,bayes=1.00,autolearn=spam

The result of this (604 second) scan time is that the MTA ends up giving up waiting after 600 seconds and the scan result is essentially wasted. No doubt some kind of "remote" test is taking an excessive amount of time.

How can I:

a) determine why the scan time is so long, after the fact (i.e. I could try to run the same spam through a "spamassassin -D [-t]" but there is no guarantee that whatever took so long the first time through will again take so long)?

b) reduce some timeouts of some particular tests so that the total test time does not exceed a reasonable threshold?

Thanx,

b.
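For tracking down when the slow scans happen, the spamd result lines can be filtered by their scantime field. The Python below is a sketch that assumes each result is logged on one line in the format quoted above. The function name `slow_scans` and the 60 second default threshold are arbitrary choices for illustration.

```python
import re

SCANTIME_RE = re.compile(r"\bscantime=(\d+(?:\.\d+)?)")
USER_RE = re.compile(r"\buser=([^,\s]+)")
# The comma-separated test names sit between " - " and "scantime=".
TESTS_RE = re.compile(r"spamd: result: \S+ -?\d+ - ([\w,]*)\s*scantime=")


def slow_scans(lines, threshold=60.0):
    """Yield (scantime, user, tests) for result lines slower than threshold.

    scantime is in seconds, user may be None, tests is a list of rule names.
    """
    for line in lines:
        if "spamd: result:" not in line:
            continue
        m = SCANTIME_RE.search(line)
        if m is None:
            continue
        scantime = float(m.group(1))
        if scantime <= threshold:
            continue
        user = USER_RE.search(line)
        tests = TESTS_RE.search(line)
        yield (
            scantime,
            user.group(1) if user else None,
            tests.group(1).split(",") if tests and tests.group(1) else [],
        )
```

Feeding it the mail log (for instance `slow_scans(open("/var/log/mail.log"))`, where the log path is an assumption) gives the timestamps-worth of slow messages with the rules that fired, which can then be lined up against other activity on the server at those times.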