Re: 3.0.2 and SARE
> I am thinking of downloading rules from SARE. However, I am told that some or many of the rules have already been incorporated into 3.0.2. Can someone recommend the best approach to avoid duplicates?

Sure. Read the docs on the SARE rules page relating to each ruleset. We document which ones you should and shouldn't use with various releases.

Loren
Re: Phishing attempts getting through.
> Can someone expand on ClamAV detecting phishing attempts, or direct me somewhere?

Pick up some of the SARE rulesets. I think spoof or fraud is the one that contains an assortment of phish hooks. It won't get 'em all, but it will sure cut down on the more common ones.

Loren
Re: increasing children
> From 'man spamd':
>
>   -m num, --max-children=num
>       Allow maximum num children
>
> Just set that as desired in the script that starts up the spamd daemon.

HAHAHA!! OMG, I was looking at the wrong man file. ^_^;; Thanks for the help.
Re: my girlfriend is getting ticked :)
Matthew Lenz wrote:

> X-Spam-Status: No, score=4.1 required=5.0 tests=BAYES_99,HTML_80_90,
>   HTML_FONT_BIG,HTML_MESSAGE,HTML_TITLE_EMPTY,MIME_HTML_ONLY,
>   MSGID_FROM_MTA_ID autolearn=no version=3.0.2

I see your false negative scored 99% on Bayes. The BAYES_99 rule has a much lower score in v3 than it did in v2. My users started bitching after the upgrade to 3 because all of a sudden spam was starting to get through. Tweaking up the Bayes scores a bit helped significantly.

Steven
Re: my girlfriend is getting ticked :)
Mike Jackson wrote:

> > Your bayes database looked to be reasonably trained. The false-negative was labeled 99% spam by Bayes. I don't see any RBL checks, which might have made the difference on this one, if it's already been seen and flagged. Do you have Net::DNS installed and the RBL tests enabled? What happens if you feed it through spamassassin with the -D flag?
>
> In my experience, it's more efficient to let the MTA handle the RBL checks instead of Spamassassin. I can't remember what MTA the OP was using, but it's trivial to set them up in Sendmail. On my employer's boxes, I use the spamhaus.org lists, but on my personal box (where I can be much more aggressive) I use a few of the rfc-ignorant.org lists and ws.surbl.org. The spamhaus lists are checked first, and they're highly effective.

Agreed, I set up my postfix to do the checks and it's made a world of difference. The OP never said what OS/MTA is being used.
Re: my girlfriend is getting ticked :)
On Wednesday, March 30, 2005, 2:20:17 PM, Mike Jackson wrote:

> > Your bayes database looked to be reasonably trained. The false-negative was labeled 99% spam by Bayes. I don't see any RBL checks, which might have made the difference on this one, if it's already been seen and flagged. Do you have Net::DNS installed and the RBL tests enabled? What happens if you feed it through spamassassin with the -D flag?
>
> In my experience, it's more efficient to let the MTA handle the RBL checks instead of Spamassassin. I can't remember what MTA the OP was using, but it's trivial to set them up in Sendmail. On my employer's boxes, I use the spamhaus.org lists, but on my personal box (where I can be much more aggressive)

I use sbl.spamhaus.org and list.dsbl.org on most of the MTAs I have visibility on.

> I use a few of the rfc-ignorant.org lists and ws.surbl.org. The spamhaus lists are checked first, and they're highly effective.

Hmm, ws.surbl.org shouldn't be used as a regular RBL. It has very few IP addresses, and most of those are probably web servers. So it won't match most of the IP address RBL checks a plain old MTA would do. SURBLs are meant to match message body URIs, not mail senders.

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: my girlfriend is getting ticked :)
----- Original Message -----
From: AltGrendel
To: users@spamassassin.apache.org
Sent: Wednesday, March 30, 2005 8:50 PM
Subject: Re: my girlfriend is getting ticked :)

> Mike Jackson wrote:
>
> > > Your bayes database looked to be reasonably trained. The false-negative was labeled 99% spam by Bayes. I don't see any RBL checks, which might have made the difference on this one, if it's already been seen and flagged. Do you have Net::DNS installed and the RBL tests enabled? What happens if you feed it through spamassassin with the -D flag?
> >
> > In my experience, it's more efficient to let the MTA handle the RBL checks instead of Spamassassin. I can't remember what MTA the OP was using, but it's trivial to set them up in Sendmail. On my employer's boxes, I use the spamhaus.org lists, but on my personal box (where I can be much more aggressive) I use a few of the rfc-ignorant.org lists and ws.surbl.org. The spamhaus lists are checked first, and they're highly effective.
>
> Agreed, I set up my postfix to do the checks and it's made a world of difference. The OP never said what OS/MTA is being used.

Actually, I did in my first post: I'm using 3.0.2 on a Debian woody box. It's from www.backports.org (great site).
Re: my girlfriend is getting ticked :)
On Wednesday, March 30, 2005, 2:21:01 PM, Matthew Lenz wrote:

> I just installed the backports perl-libnet-dns (.48, hope that is new enough; .49 is the newest). Is there anywhere I can check to see if 'network tests' (what the SURBL says needs to be enabled) are enabled?

Set your trust path correctly (quoting Matt Kettler):

> Please see the Wiki: http://wiki.apache.org/spamassassin/TrustPath/ and look up trusted_networks in man Mail::SpamAssassin::Conf

And enable network tests: http://www.surbl.org/faq.html#nettest

And things should work much better.

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: my girlfriend is getting ticked :)
Matthew Lenz wrote:

> ----- Original Message -----
> From: AltGrendel
> To: users@spamassassin.apache.org
> Sent: Wednesday, March 30, 2005 8:50 PM
> Subject: Re: my girlfriend is getting ticked :)
>
> > Mike Jackson wrote:
> >
> > > > Your bayes database looked to be reasonably trained. The false-negative was labeled 99% spam by Bayes. I don't see any RBL checks, which might have made the difference on this one, if it's already been seen and flagged. Do you have Net::DNS installed and the RBL tests enabled? What happens if you feed it through spamassassin with the -D flag?
> > >
> > > In my experience, it's more efficient to let the MTA handle the RBL checks instead of Spamassassin. I can't remember what MTA the OP was using, but it's trivial to set them up in Sendmail. On my employer's boxes, I use the spamhaus.org lists, but on my personal box (where I can be much more aggressive) I use a few of the rfc-ignorant.org lists and ws.surbl.org. The spamhaus lists are checked first, and they're highly effective.
> >
> > Agreed, I set up my postfix to do the checks and it's made a world of difference. The OP never said what OS/MTA is being used.
>
> Actually, I did in my first post: I'm using 3.0.2 on a Debian woody box. It's from www.backports.org (great site).

Ok, so you're using Spamassassin 3.0.2 on Debian. Are you using Sendmail, qmail, courier, or postfix? I honestly don't know what Debian uses as a default mailserver.
RE: my girlfriend is getting ticked :)
> > I'm using 3.0.2 on a debian woody box. It's from www.backports.org (great site)
>
> Ok, so you're using Spamassassin 3.0.2 on Debian. Are you using Sendmail, qmail, courier, or postfix? I honestly don't know what Debian uses as a default mailserver.

Exim.
Re: my girlfriend is getting ticked :)
Michael Bellears wrote:

> > > I'm using 3.0.2 on a debian woody box. It's from www.backports.org (great site)
> >
> > Ok, so you're using Spamassassin 3.0.2 on Debian. Are you using Sendmail, qmail, courier, or postfix? I honestly don't know what Debian uses as a default mailserver.
>
> Exim.

Ok, so you might want to check out this: http://www.exim.org/howto/rbl.html if you haven't. I've been working with a postfix/amavisd-new/spamassassin/clamav setup, so I probably wouldn't be much help. Since I started doing RBL checks at the MTA level (Exim for you) I've seen a radical reduction in spam. It can save on processing too, since the spam never gets past the received stage. Be careful though: some lists are much less forgiving than others and can block legit traffic. Good luck.
Rule Design Benchmark/Resource Question
Before I pull my hair out doing bench/resource tests, I was wondering if anyone out there knew if there was much of a speed/resource usage difference between the following ways of writing the same rule.

Method A:

  body rule_a /(?:feh|meh|bleh)/i

vs. Method B:

  body __rule_a /(?:feh)/i
  body __rule_b /(?:meh)/i
  body __rule_c /(?:bleh)/i
  meta rule_d (__rule_a || __rule_b || __rule_c)

There probably isn't much difference using just 3 rules, but I'm thinking more along the lines of large (500+) lists, and it isn't limited to just body stuff. So if anyone has some real-world benching/experience with what is preferred, or if the developers know which is faster for SA, I would love the input.

-Rocky
--
what's with today, today?
Email: [EMAIL PROTECTED]
PGP: http://rocky.mindphone.org/rocky_mindphone.org.gpg
Re: Anyway to have SA drop high spam tag other?
Hello Bill,

Wednesday, March 30, 2005, 8:15:05 AM, you wrote:

> I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration Suite. I am using proxsmtp (http://memberwebs.com/nielsen/software/proxsmtp/) to scan mail and then pass it along to oracle.
>
> My question is... I know that you can have spamassassin exit with a non-zero code if it detects spam by using the 'spamassassin -e' option. Does anyone know if it is possible to have SA tag spam and not exit as usual, but exit with a non-zero code if, say, the score is over 10?
>
> The other options I came up with are to either write a script to check the level, or have SA run twice... once to tag and once to drop.

Or, since SA is only a filter, have the SA output feed a script which a) copies input to output unchanged, and b) interprets the score from the X-Spam-Status header, and then exits with int(score) (0 if negative).

Bob Menschel
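A minimal sketch of the kind of wrapper script Bob describes, assuming Python 3 is available on the box and that the header looks like the SA 3.x `X-Spam-Status: No, score=4.1 required=5.0 ...` form quoted earlier in this archive. The names `exit_code_for` and `filter_stream` are made up for illustration; they are not part of SpamAssassin or proxsmtp.

```python
import re
import sys

# Matches the "score=4.1" field on the first line of an X-Spam-Status header.
SCORE_RE = re.compile(
    r"^X-Spam-Status:.*?\bscore=(-?\d+(?:\.\d+)?)",
    re.IGNORECASE | re.MULTILINE,
)


def exit_code_for(message):
    """Return int(score) from the header block, or 0 if negative or absent."""
    headers = message.split("\n\n", 1)[0]  # only inspect the header block
    match = SCORE_RE.search(headers)
    if match is None:
        return 0
    score = float(match.group(1))
    # POSIX exit statuses are limited to 0..255, so clamp the value.
    return max(0, min(int(score), 255))


def filter_stream(inp, out):
    """a) copy input to output unchanged, b) return the exit status to use."""
    text = inp.read()
    out.write(text)
    return exit_code_for(text)


# To use as a filter, end the script with:
#     sys.exit(filter_stream(sys.stdin, sys.stdout))
```

The caller (proxsmtp in Bill's case, which is an assumption about how it would be wired up) could then compare the exit status against 10 to decide whether to drop the message, while the tagged copy still arrives on stdout.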
can Pyzor run localy?
Hi, I have a few questions regarding the benefit/use of SA features.

1. Can Pyzord run locally, as SURBL does with rbldnsd (check the message with a local repository, not with the Pyzor web servers)?

2. I would like to activate more features in SA (I currently use only SARE rules). We are considering SURBL, DCC and Pyzor. My question is: what are the preferable features that I can add to SA that will result in better spam identification, and that will cost the least in performance time?

Thanks a lot.

Alan
Re: Anyway to have SA drop high spam tag other?
On Wednesday 30 March 2005 08:02 pm, Robert Menschel wrote:

> Hello Bill,
>
> Wednesday, March 30, 2005, 8:15:05 AM, you wrote:
>
> > I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration Suite. I am using proxsmtp (http://memberwebs.com/nielsen/software/proxsmtp/) to scan mail and then pass it along to oracle.
> >
> > My question is... I know that you can have spamassassin exit with a non-zero code if it detects spam by using the 'spamassassin -e' option. Does anyone know if it is possible to have SA tag spam and not exit as usual, but exit with a non-zero code if, say, the score is over 10?
> >
> > The other options I came up with are to either write a script to check the level, or have SA run twice... once to tag and once to drop.
>
> Or, since SA is only a filter, have the SA output feed a script which a) copies input to output unchanged, and b) interprets the score from the X-Spam-Status header, and then exits with int(score) (0 if negative).
>
> Bob Menschel

SA doesn't drop mail. It simply tags it. If you want to dev null mail, that's what procmail is for.

--
John Andersen
Re: Bigevil file is gone.
Hurray and... can't wait

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

Chris Santerre wrote:

> Too much traffic used for a file no longer updated. BigEvil has been removed. I shall replace it with our newest ruleset soon. It's a real corker ;)
>
> Chris Santerre
> System Admin and SARE/SURBL Ninja
> http://www.rulesemporium.com
> http://www.surbl.org
> 'It is not the strongest of the species that survives, not the most intelligent, but the one most responsive to change.' Charles Darwin
Getting around URIRBLs
h3hLeo BreebaartttTiggletp:Graycat/SOGP/getSeyreniapoLena WilliamsrPatrick DersjantnoSimon WaldmandvSorchad.Guitar HuwcoThe Senior Wranglerm/h3

This looks like an effective way of getting round URIRBLs (though of course it requires the end user to cut and paste). The rule below seems to catch the technique. Any suggestions for improving it, or any other rules to suggest?

  # 2005-03-31 new rule
  rawbody local_OBFU_HTTP /(?!https?:\/\/)h(?:.+)?t(?:.+)?t(?:.+)?p(?:.+)?s?(?:.+)?:(?:.+)?\/(?:.+)?\/(?:.+)?/im
  describe local_OBFU_HTTP HTTP obfuscated with tags
  score local_OBFU_HTTP 1.0

John.
--
-- Over 2500 webcams from ski resorts around the world - www.snoweye.com
-- Translate your technical documents and web pages - www.tradoc.fr
Re: Anyway to have SA drop high spam tag other?
> I'm running Spamassassin 3.0.2 on Linux in front of Oracle Collaboration Suite. I am using proxsmtp (http://memberwebs.com/nielsen/software/proxsmtp/) to scan mail and then pass it along to oracle.
>
> My question is... I know that you can have spamassassin exit with a non-zero code if it detects spam by using the 'spamassassin -e' option. Does anyone know if it is possible to have SA tag spam and not exit as usual, but exit with a non-zero code if, say, the score is over 10?
>
> The other options I came up with are to either write a script to check the level, or have SA run twice... once to tag and once to drop.
>
> If anyone has any ideas that would be great, thanks in advance!
>
> --Bill

I don't know about proxsmtp, but it should be possible with amavisd, which calls spamassassin/clamd/etc. You can decide there at what level (tag2_level) the spam gets marked in the subject and at what level (kill_level) it is handled as spam. So you could set tag2_level to 5 and kill_level to 10. Once kill_level is reached, you can configure what action is taken: pass the mail, bounce it, or discard it. A disadvantage is that the quarantine/archive is triggered by the kill_level, so you won't have a spam archive of the mails scoring between 5 and 10.

Menno van Bennekom
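The two-threshold policy Menno describes can be sketched in a few lines. This is only an illustration of the decision logic, not amavisd code: the function name `classify` and its parameters are hypothetical, and treating a score exactly equal to a level as having reached it is an assumption.

```python
def classify(score, tag_level=5.0, kill_level=10.0):
    """Map a spam score to an action using two thresholds.

    tag_level mirrors the tag2_level idea (mark the subject),
    kill_level mirrors the kill_level idea (handle as spam).
    """
    if score >= kill_level:
        return "kill"  # bounce, discard, or quarantine, as configured
    if score >= tag_level:
        return "tag"   # deliver, but mark as spam
    return "pass"      # deliver untouched


for s in (4.1, 7.0, 12.5):
    print(s, classify(s))
```

With the values from the message (5 and 10), a mail scoring 7 is tagged and delivered, which is also why it never reaches a quarantine that is only triggered at the kill level.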
Re: Getting around URIRBLs
I wrote something similar to this, but instead of using .+, I used [^]+, supposedly a tad faster, IIRC. Also, writing s?(?:.+)? as (?:s(?:[^]+)?)? should be slightly faster, because if it fails to match on the 's' it won't move on to check for the stuff.

-Rocky

On Thu, Mar 31, 2005 at 11:35:26AM +0200, John Wilcock wrote:

> h3hLeo BreebaartttTiggletp:Graycat/SOGP/getSeyreniapoLena WilliamsrPatrick DersjantnoSimon WaldmandvSorchad.Guitar HuwcoThe Senior Wranglerm/h3
>
> This looks like an effective way of getting round URIRBLs (though of course it requires the end user to cut and paste). The rule below seems to catch the technique. Any suggestions for improving it, or any other rules to suggest?
>
>   # 2005-03-31 new rule
>   rawbody local_OBFU_HTTP /(?!https?:\/\/)h(?:.+)?t(?:.+)?t(?:.+)?p(?:.+)?s?(?:.+)?:(?:.+)?\/(?:.+)?\/(?:.+)?/im
>   describe local_OBFU_HTTP HTTP obfuscated with tags
>   score local_OBFU_HTTP 1.0
>
> John.
> --
> -- Over 2500 webcams from ski resorts around the world - www.snoweye.com
> -- Translate your technical documents and web pages - www.tradoc.fr

--
what's with today, today?
Email: [EMAIL PROTECTED]
PGP: http://rocky.mindphone.org/rocky_mindphone.org.gpg
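The idea in this thread can be tried outside SpamAssassin with a short Python sketch. Two loud assumptions: the archive has stripped the angle brackets from both the sample and the rules, so the sketch guesses that the obfuscation inserts HTML tags or comments between the letters of "http://" (as the rule's description, "HTTP obfuscated with tags", suggests), and it guesses that the negated character class Rocky mentions excludes '>'. The names `TAGS`, `OBFU_HTTP` and `is_obfuscated` are made up, and Python's `re` engine is not Perl's.

```python
import re

# Zero or more tags/comments; the negated class cannot run past the closing '>'.
TAGS = r"(?:<[^>]+>)*"

OBFU_HTTP = re.compile(
    r"(?!https?://)"                     # ignore URLs that are not obfuscated
    + TAGS.join(["h", "t", "t", "p"])    # h, t, t, p with optional tags between
    + TAGS
    + r"(?:s" + TAGS + r")?"             # optional 's', followed by its own tags
    + ":" + TAGS + "/" + TAGS + "/",
    re.IGNORECASE,
)


def is_obfuscated(text):
    """True if text contains 'http://' or 'https://' split up by tags."""
    return OBFU_HTTP.search(text) is not None


print(is_obfuscated("h<!--a-->tt<!--b-->p:<!--c-->/<!--d-->/example.com"))
print(is_obfuscated("see http://example.com and https://example.org"))
```

Grouping the optional 's' together with the tags that may follow it is the restructuring Rocky suggests: when there is no 's', the engine skips the whole group instead of trying the filler on its own.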
From dollars to pounds... and Nigeria to UK
Nigerian scams are evolving... I just received that one with only 3 rules matching (SA 2.6)...

BAYES_90 2.10, RCVD_IN_SBL 1.11, RCVD_IN_SORBS 0.10

Return-Path: [EMAIL PROTECTED]
Received: from vsmtp1.tin.it (vsmtp1.tin.it [212.216.176.141]) by (8.12.10/8.12.10) with ESMTP id j2V3snX8020382 for ; Thu, 31 Mar 2005 05:54:49 +0200
Received: from ims1d.cp.tin.it (192.168.70.101) by vsmtp1.tin.it (7.0.027) id 4238611B0060DD46; Thu, 31 Mar 2005 05:51:00 +0200
Received: from [192.168.70.183] by ims1d.cp.tin.it with HTTP; Thu, 31 Mar 2005 05:50:59 +0200
Date: Thu, 31 Mar 2005 05:50:59 +0200
Message-ID: [EMAIL PROTECTED]
From: DAVID HESKEY [EMAIL PROTECTED]
Subject: YOUR CONCEPT IS NEEDED[reply.
Reply-To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: text/plain
X-Originating-IP: 80.179.243.4

The Auditor/Head of Department
Bank of Scotland, United Kingdom.
(Great Opportunity Very Urgent and Confidential)
___

Greetings,

I am Dr David Heskey, the auditor and computing staff of a bank here in SCOTLAND UNITED KINGDOM. I discovered a dormant account in my office, as an auditor and head of computing department of a bank here in Scotland, United Kingdom.It will be in my interest to transfer this fund worth 15,000,000 million pounds in an account offshore. If you can be a collaborator to this please indicate interest immediately for us to proceed. Your contact phone numbers and name and your account information will be necessary for this effect. Here is my direct phone number (+447040110197)

At the conclusion of this business, you will be given 35% of the total amount, 60% will be for me, while 5% will be for expenses both parties might have incurred during this process. More details awaits your positive reply.

Regards and respect,
Dr David Heskey
Re: HUMOR: 419 pic
On 3/30/2005 10:15 PM +0100, Chris Santerre wrote:

> For those of you who don't know, there is a group of ppl that lead 419 scammers on wild goose chases. One of the things they do is request pics for proof. They have them do some funny stuff. (Bread and fish on head.) This came across my mail today. Pretty funny! (Contains the word p enis.)
>
> http://www.plus613.com/image/12046

Got this one 1-2 weeks ago: 419 scam, wants to give me millions :)

http://asbak.coding-slaves.com/pic.jpg

Niek
Re: negative score from ALL_TRUSTED
Matt Kettler wrote:

> You have a broken trust path. ALL_TRUSTED should *never* match email from outside your network.

But it does anyway, even when the trust path is set correctly: http://bugzilla.spamassassin.org/attachment.cgi?id=2508

It happens when spamassassin fails to parse the Received header(s) containing the untrusted host(s).

Arvinn
Re: can Pyzor run localy?
Alan Shine wrote:

> Hi, I have a few questions regarding the benefit/use of SA features.
>
> 1. Can Pyzord run locally, as SURBL does with rbldnsd (check the message with a local repository, not with the Pyzor web servers)?

See: http://pyzor.sourceforge.net/

  Since the entire system is released under the GPL, people are free to host their own independent servers. Server peering is planned for a future release.

> 2. I would like to activate more features in SA (I currently use only SARE rules). We are considering SURBL, DCC and Pyzor. My question is: what are the preferable features that I can add to SA that will result in better spam identification, and that will cost the least in performance time?

Probably SURBL, but if you are going to enable network tests it is best to have as many activated as possible from the start.

http://wiki.apache.org/spamassassin/SingleUserUnixInstall
Re: negative score from ALL_TRUSTED
At 09:07 AM 3/31/2005, Arvinn Løkkebakken wrote:

> > You have a broken trust path. ALL_TRUSTED should *never* match email from outside your network.
>
> But it does anyway, even when trust path is set correctly: http://bugzilla.spamassassin.org/attachment.cgi?id=2508

Hmm. Well, that happens if and only if SA can't parse your Received headers. Usually, broken AV appliances that insert a by clause in front of the from clause cause this.

One of Roy's headers does look strange to me, and might be unparseable because the from and by are in separate headers. I've never seen a working mailserver do that before, but that doesn't mean it's not parsable by SA.

However, looking at Roy's headers, it looks like he might have a NATed mailserver too, which would definitely cause a broken trust path. However, you might want to suspect that problem after trying to set the trust path. Setting a trust path may or may not fix the problem, but at least it's quick and easy.

The patch is also not a sure-fire fix, as it will NOT help anyone suffering from the broken-trust-path problem. It will ONLY help those suffering from a broken mailserver.
sa-learn issues
I recently installed SA 3.0.2 on FreeBSD 4.10, and it's working great, except for this one feature:

I have things set up so each user has a spam folder that they will put missed spam in. This folder will later be trained from cron jobs using sa-learn.

The problem is, it seems that sa-learn is ignoring the -u / --user= flag. No matter what I set it to, it trains for root instead of that user. I am verifying this by checking the /root/.spamassassin/ directory. Each time I run sa-learn, the bayes files in the directory are updated, instead of the files in /usr/home/user/.spamassassin/

  bash-2.05b# /usr/local/bin/sa-learn -u userfoo --spam --showdots --mbox /usr/home/chip/SPAM
  ..
  Learned from 2 message(s) (2 message(s) examined).
  bash-2.05b# /usr/local/bin/sa-learn -u userbar --spam --showdots --mbox /usr/home/chip/SPAM
  ..
  Learned from 0 message(s) (2 message(s) examined)

I have tried the following user flags:

  --user=userfoo
  --user=userfoo
  --user='userfoo'
  -u userfoo
  -u userfoo
  -u 'userfoo'

Any idea what I am missing here?
Re: sa-learn issues
Chip wrote:

> The problem is, it seems that sa-learn is ignoring the -u / --user= flag.

Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc and spamd accept that flag.

sa-learn uses the userid of the user that calls it. Period.
Re: sa-learn issues
Matt Kettler wrote:

> Chip wrote:
>
> > The problem is, it seems that sa-learn is ignoring the -u / --user= flag.
>
> Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc and spamd accept that flag.
>
> sa-learn uses the userid of the user that calls it. Period.

man sa-learn says differently:

  -u username, --username=username
      Override username taken from the runtime environment

However, if this is the case, how do you use spamc to train spam on a user's mbox?
Re: sa-learn issues *RETRACTED*
My bad, apparently 3.0.2 and 3.0.1 do have such a flag in sa-learn. I was looking at the 3.0.0 version, which does not.

Matt Kettler wrote:

> Chip wrote:
>
> > The problem is, it seems that sa-learn is ignoring the -u / --user= flag.
>
> Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc and spamd accept that flag.
>
> sa-learn uses the userid of the user that calls it. Period.
Re: sa-learn issues
> > The problem is, it seems that sa-learn is ignoring the -u / --user= flag.
>
> Of course it's ignoring it. There is no -u flag in sa-learn. ONLY spamc and spamd accept that flag.
>
> sa-learn uses the userid of the user that calls it. Period.

From TFM:

  -u username, --username=username
      Override username taken from the runtime environment

Andre
redirect output from lint
Hello all

Can someone tell me how I can redirect the output from the command:

  spamassassin --lint

to a file, or maybe to grep / awk?

  spamassassin --lint | grep something

does not work! I want to automate the checks of all rules.

bruno
.packlist
Anyone know why the /etc/mail/spamassassin and /usr/local/share/spamassassin stuff isn't being included in the .packlist? I realize that there might be some concern about them being removed if the package is uninstalled (unlikely), but it's also kind of against everything for which the .packlist stands.

-Matt
Re: redirect output from lint
[EMAIL PROTECTED] wrote:

> Hello all
>
> Can someone tell me how I can redirect the output from the command:
>
>   spamassassin --lint
>
> to a file, or maybe to grep / awk?
>
>   spamassassin --lint | grep something
>
> does not work! I want to automate the checks of all rules.
>
> bruno

Ahh, system admin 201, intermediate pipes and redirection.

lint's output goes to stderr, therefore you need to do this to redirect it:

  spamassassin --lint 2> file.out

Note that the 2> means redirect filehandle 2, and filehandle #2 is stderr.

You can also do piping if you redirect stderr back to stdout (handle 1):

  spamassassin --lint 2>&1 | grep sometext > somefile.out
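For bruno's goal of automating the check, the same stderr-to-stdout merge can be done from a script. The following is a sketch in Python 3.7 or later using the standard `subprocess` module; the helper name `run_and_grep` is made up, and a stand-in child process that writes to stderr is used so the example runs on a machine without SpamAssassin installed.

```python
import subprocess
import sys


def run_and_grep(argv, needle):
    """Run argv with stderr merged into stdout; return lines containing needle."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the scripted equivalent of 2>&1
        text=True,
        check=False,               # keep the output even on a non-zero exit
    )
    return [line for line in result.stdout.splitlines() if needle in line]


# Stand-in command that writes two lines to stderr, like a lint run would.
demo = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('warning: bad rule\\nok\\n')",
]
print(run_and_grep(demo, "warning"))

# Against the real tool this would be, for example:
#     run_and_grep(["spamassassin", "--lint"], "sometext")
```

Passing the command as a list avoids the shell entirely, so there is no redirection syntax to get wrong; a cron job could mail or log whatever lines come back.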
Autolearn=failed when BAYES_00 is only rule hit
Please forgive me if this is in the archives; I'm having trouble finding it. I've just finished training my Bayes DB using sa-learn (perversely, when I was trying to collect 200 spam messages, the spammers decided to stop sending to me). Now that the DB is usable, it's interesting that while most ham messages produce at least one small rule hit and a negative Bayes score that results in Autolearn=no, when BAYES_00 is the ONLY rule that hits I get Autolearn=failed. Two quick questions: 1) What should I do about this, and 2) Should I worry, or just ignore it? TIA, -Don
Re: sa-learn issues
On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:

> I have things set up so each user has a spam folder that they will put missed spam in. This folder will later be trained from cron jobs using sa-learn.
>
> The problem is, it seems that sa-learn is ignoring the -u / --user= flag. No matter what I set it to, it trains for root instead of that user. I am verifying this by checking the /root/.spamassassin/ directory. Each time I run sa-learn, the bayes files in the directory are updated, instead of the files in /usr/home/user/.spamassassin/

This is a feature/shortcoming in the -u option for sa-learn when using non-SQL based bayes storage modules. That is why the documentation states:

  You can use this option to specify users in a virtual user configuration.

Otherwise the bayes path, if unset via dbpath or in a .cf file, is expanded to be in $ENV{HOME}, which in your case is /root/.

I added the -u specifically for BayesSQL users, since it doesn't refer to an actual directory on the filesystem. Feel free to file a bug report, but honestly it might end up being a documentation patch saying that -u is not effective for DBM storage.

BTW, you can easily accomplish the same thing as root using su -c or similar mechanisms.

Michael
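A rough sketch of the "su -c" route Michael mentions, as it might be driven from a root cron job. It only builds the command line and does not execute anything; the helper name `learn_command` is hypothetical, the sa-learn flags are the ones Chip used above, and it assumes each account has a usable login shell, which may not hold on every system.

```python
import shlex


def learn_command(user, mbox):
    """Build (not run) an argv that trains sa-learn as `user` via su -c."""
    inner = " ".join(
        shlex.quote(arg) for arg in ["sa-learn", "--spam", "--mbox", mbox]
    )
    # The login name comes before -c so the option is passed on to the
    # user's shell rather than interpreted by su itself.
    return ["su", user, "-c", inner]


print(learn_command("userfoo", "/usr/home/chip/SPAM"))
```

Because sa-learn then runs with the target user's own environment, $ENV{HOME} points at that user's home directory and the DBM files land under their .spamassassin directory instead of root's. The resulting list could be handed to `subprocess.run` by whatever loop walks the user list.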
Re: sa-learn issues
Michael Parker wrote:

> On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote:
>
> > I have things set up so each user has a spam folder that they will put missed spam in. This folder will later be trained from cron jobs using sa-learn.
> >
> > The problem is, it seems that sa-learn is ignoring the -u / --user= flag. No matter what I set it to, it trains for root instead of that user. I am verifying this by checking the /root/.spamassassin/ directory. Each time I run sa-learn, the bayes files in the directory are updated, instead of the files in /usr/home/user/.spamassassin/
>
> This is a feature/shortcoming in the -u option for sa-learn when using non-SQL based bayes storage modules. That is why the documentation states:
>
>   You can use this option to specify users in a virtual user configuration.
>
> Otherwise the bayes path, if unset via dbpath or in a .cf file, is expanded to be in $ENV{HOME}, which in your case is /root/.
>
> I added the -u specifically for BayesSQL users, since it doesn't refer to an actual directory on the filesystem. Feel free to file a bug report, but honestly it might end up being a documentation patch saying that -u is not effective for DBM storage.
>
> BTW, you can easily accomplish the same thing as root using su -c or similar mechanisms.
>
> Michael

Ahh ok. Makes sense! I will change to a SQL backend, as my users have no shell access and can't run the command as themselves.

Thanks for the clarification!
Re: sa-learn issues
On Thu, Mar 31, 2005 at 02:24:23PM -0500, Chip wrote:

> Ahh ok. Makes sense! I will change to a SQL backend, as my users have no shell access and can't run the command as themselves.
>
> Thanks for the clarification!

Not a bad idea. The Bayes SQL modules have proven to be stable and in most cases worth the effort, especially in a virtual user environment.

You can find some more information on storing your SpamAssassin user data in a SQL database here: http://people.apache.org/~parker/presentations/

Michael

PS Anyone interested in testing a MySQL-specific Bayes storage module? It requires MySQL 4.1, SA 3.1-dev, and InnoDB tables if you want rollback on error. It also provides a 30-40% speed-up in some cases. If so, shoot me an email and I'll send you a copy of the module.
SA rescore config file
Hi,

I am trying to re-assign scores. In the masses/config file:

  SCORESET=3
  HAM_PREFERENCE=2.0
  THRESHOLD=5.0
  EPOCHS=100
  NOTE=

What does SCORESET mean here? Do I need to change the HAM_PREFERENCE, THRESHOLD and EPOCHS values?

Thanks.
Re: SA rescore config file
Lisheng Sun wrote:

> Hi,
>
> I am trying to re-assign scores. In the masses/config file:
>
>   SCORESET=3
>   HAM_PREFERENCE=2.0
>   THRESHOLD=5.0
>   EPOCHS=100
>   NOTE=
>
> What does SCORESET mean here? Do I need to change the HAM_PREFERENCE, THRESHOLD and EPOCHS values?
>
> Thanks.

Umm, what is the masses/config file? You should only be changing scores in local.cf, something like:

  score RULE_NAME_HERE VALUE_HERE

e.g.:

  score BAYES_99 5.0

-Jim
Mysql 5.0 with SA 3.0
Hello,

Has anyone tested MySQL 5.0 with SA 3.0?
Re: Mysql 5.0 with SA 3.0
On Thu, Mar 31, 2005 at 10:12:38PM +0200, [EMAIL PROTECTED] wrote:

> Has someone tested Mysql 5.0 with SA3.0?

Yes. Are you just asking, or did you find some sort of problem? I haven't found any problems so far, but all of my testing has been focused on BayesSQL.

Michael
Re: SA rescore config file
Jim Maul wrote:

> umm what is masses/config file?

That's the configuration file for the mass-check tools. They impact the perceptron when evolving scoresets (advanced stuff).

> You should only be changing scores in local.cf something like:
>
>   score RULE_NAME_HERE VALUE_HERE
>
> ie:
>
>   score BAYES_99 5.0

That's true for the average user. However, it looks like Lisheng is trying to re-evolve an entire scoreset from the ground up, which is a very advanced topic.

Lisheng, I think you probably want to leave the masses directory alone until you've got a better understanding of spamassassin in its default configuration. Certainly you should already have an understanding of what the different scoresets are LONG before you consider trying to evolve one.

There are probably fewer than 100 people in the entire world who ever play with the masses tools. Thousands of users of SA aren't even aware that they exist. These tools are mostly for the developers and the very advanced user. Even most of the SARE ninjas only run a small number of the tools in here. They largely use mass-check and hit-frequencies, without using the perceptron.

I'd suggest using the default scores that come with SA to start with. The SA developers have gone to the effort of running these tools already to generate a good scoreset that works for most people. Once you've got a good feel for the basics, you can start looking at the highly advanced stuff.

To answer your specific questions (bearing in mind that this is really a highly advanced user thing to be playing with):

> What the SCORESET here mean?

SCORESET here picks which scoreset you are evolving scores for. See the score description in man Mail::SpamAssassin::Conf for a description of each scoreset.

> Do i need to change the HAM_PREFERENCE, THRESHOLD and EPOCHS value?

No. These adjust the mathematics of how the perceptron runs while generating scoresets. You can run perceptron -h to see a short description of what these do.

Unless you've read the perceptron code and have a really good feel for what they do, you probably only want to adjust these for experimental reasons.
Re: Rule Design Benchmark/Resource Question
Rocky Olsen wrote:

> Before i pull my hair out doing bench/resource test, i was wondering if anyone out there knew if there was much of a speed/resource usage difference between the following way of writing the same rule.
>
> Method A:
>
>   body rule_a /(?:feh|meh|bleh)/i
>
> vs. Method B:
>
>   body __rule_a /(?:feh)/i
>   body __rule_b /(?:meh)/i
>   body __rule_c /(?:bleh)/i
>   meta rule_d (__rule_a || __rule_b || __rule_c)
>
> There probably isn't much difference using just 3 rules, but i'm thinking more along the lines of large(500+) lists and it isn't limited to just body stuff. So if anyone has some realworld benching/experience with what is preferred or if the developers know which is faster for SA, i would love the input.

To start with, use perl's regex debugger as your friend:

  $ perl -Mre=debug -e '/(?:feh|meh|bleh)/i'
  size 11 Got 92 bytes for offset annotations.

  $ perl -Mre=debug -e '/(?:feh)/i'
  Freeing REx: `,'
  Compiling REx `(?:feh)'
  size 3 Got 28 bytes for offset annotations.
  (repeat 2 times)

However, this only deals with part of the story: the cost of the regex itself. It does not deal with the per-rule overhead in SA.

In general I'd favor the combined approach, unless for some reason your combined rule is considerably larger than the sum of its parts. Bigevil ran much better once Chris S did some combining and common subexpression elimination.

Also, I'd suggest eliminating the (?:) for the single-text matches. It does nothing of use, and doesn't change the evaluation of the regex any for a simple single text match. All it does is waste 4 bytes of disk space per rule.

  body __RULE_A /feh/i

instead of:

  body __RULE_A /(?:feh)/i

I leave comparing the two using re=debug as an exercise for the student. Also compare to /(feh)/i and /(feh)\1/i to see how backtracking works.
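For anyone who wants a quick feel for the combined-versus-separate question before benchmarking inside SpamAssassin, here is a rough timing sketch. It is only an analogy: it uses Python's `re` and `timeit` modules rather than Perl's regex engine, it says nothing about SA's per-rule overhead, and the word list, sample text and function names (`method_a`, `method_b`) are made up for the test.

```python
import re
import timeit

WORDS = ["feh", "meh", "bleh"]
# Worst case for both methods: the only hit is at the very end of the text.
TEXT = "lorem ipsum dolor sit amet " * 200 + "bleh"

# Method A: one alternation. Method B: one pattern per word, OR-ed together.
combined = re.compile("|".join(map(re.escape, WORDS)), re.IGNORECASE)
separate = [re.compile(re.escape(w), re.IGNORECASE) for w in WORDS]


def method_a(text):
    return combined.search(text) is not None


def method_b(text):
    return any(p.search(text) for p in separate)


for name, fn in (("combined", method_a), ("separate", method_b)):
    seconds = timeit.timeit(lambda: fn(TEXT), number=2000)
    print(f"{name}: {seconds:.4f}s")
```

Scaling `WORDS` up to a few hundred entries gives a rough picture of how the two shapes diverge, but the numbers that matter are the ones measured with real rules against a real corpus.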
Re: sa-learn issues
Chip wrote: Michael Parker wrote: On Thu, Mar 31, 2005 at 01:23:24PM -0500, Chip wrote: I have things set up so each user has a spam folder that they will put missed spam in. This folder will later be trained from cron jobs using sa-learn. The problem is, it seems that sa-learn is ignoring the -u / --user= flag. No matter what I set it to, it trains for root instead of that user. I am verifying this by checking the /root/.spamassassin/ directory. Each time I run sa-learn, the bayes files in that directory are updated, instead of the files in /usr/home/user/.spamassassin/ This is a feature/shortcoming in the -u option for sa-learn when using non-SQL based bayes storage modules. That is why the documentation states: You can use this option to specify users in a virtual user configuration. Otherwise the bayes path, if unset via dbpath or in a .cf file, is expanded to be in $ENV{HOME}, which in your case is /root/. I added the -u specifically for BayesSQL users, since it doesn't refer to an actual directory on the filesystem. Feel free to file a bug report, but honestly it might end up being a documentation patch saying that -u is not effective for DBM storage. BTW, you can easily accomplish the same thing as root using su -c or similar mechanisms. Michael Ahh ok. Makes sense! I will change to an SQL backend, as my users have no shell access and can't run the command as themselves. Thanks for the clarification! Changing the backend storage driver worked perfectly, well, almost. When using DBM storage, the user_prefs file was automatically created when a new user got their first mail. Now, using MySQL, the userpref table is empty. Is this the default behavior? The reason I ask is that, with no examples of what to put in the table, I am unsure of the syntax ;)
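For anyone staying on DBM storage, the su -c route Michael mentions could be scripted from root's cron job roughly as below. This is a sketch under assumptions: the helper name, the user names and the spam folder path are hypothetical, it only builds the command line and does not run it, and accounts without a usable login shell may need something other than plain su.

```python
import shlex
from typing import List


def sa_learn_command(user: str, spam_folder: str) -> List[str]:
    """Build an argv that runs sa-learn as `user`, so HOME is that user's.

    The inner command is quoted as one shell string because su hands
    whatever follows -c to the user's shell.
    """
    inner = shlex.join(["sa-learn", "--spam", spam_folder])
    return ["su", user, "-c", inner]


if __name__ == "__main__":
    # Hypothetical users and folder layout, for illustration only.
    for name in ["alice", "bob"]:
        argv = sa_learn_command(name, "/usr/home/%s/spam" % name)
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Running the printed command as root trains the bayes files under that user's own home directory, since $ENV{HOME} then points there and not at /root/.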
Re: Rule Design Benchmark/Resource Question
Thanks. On Thu, Mar 31, 2005 at 05:16:25PM -0500, Matt Kettler wrote: [snip]