Pyzor error
Hi,

I have been a SpamAssassin user for a long time, but I am new to this list. This is the problem:

I've just installed Pyzor (by the way, on RH7.3 you need to install it differently from what the installation notes say). I was using SpamAssassin inside a Perl script (with mod_perl) to check outgoing mail from our free webmail system for spam before sending it out. I have just noticed a bizarre problem with Pyzor: two different Apache processes end up executing the same code, so it runs twice within a single HTTP request (I know, it sounds bizarre).

This is the error message SpamAssassin sends:

Pyzor - check failed: setuid: oops: fileno(STDIN) [1] != 0 at /usr/lib/perl5/site_perl/5.6.1/Mail/SpamAssassin/Util.pm line 1055.

This occurs when using the following code:

    do {
        eval {
            # $entity is a MIME::Entity object. The error occurs here.
            my $status = $spamtest->check_message_text($entity->as_string);
            $spam_level  = $status->get_hits;
            $spam_report = $status->get_report;
            # clean up.
            $status->finish();
            undef($status);
        };
        if ($@) {
            $spam_report = undef;
            # The warning text is Spanish for "ERROR checking SPAM -- attempt N"
            warn $$ . "ERROR Chequeando SPAM -- Intento $spam_check_try" if ($DEBUG);
        }
        $spam_check_try++;
    } while (!$spam_report && $spam_check_try <= $MAX_TRY_SPAM);

It looks like Pyzor makes mod_perl go crazy: the error is not caught, and from this point a new process is born, which makes the script run twice and lose its connection with Apache. I think it may be a problem with stderr when SpamAssassin runs the Pyzor test on string messages.

Is anyone else having this problem? Google shows only two more users with it, and of course no answers.

Thanks,

David A. Velásquez R.
Founding Manager
Conexiones Colombianas (CONEXCOL)
[EMAIL PROTECTED]
http://www.conexcol.com/ - http://www.sipo.cl
Tel/Fax. (57)(4) 3122600
Cel. (57)(300) 6533517
Cra. 34 No. 7 - 157
A.A. 12137 Medellín, Ant. CO.
highly available sitewide bayes, local db vs. sql
What sort of experiences have people had managing a sitewide bayes db that is used by spamassassin (spamd|amavisd) instances on multiple machines? I've got an environment with spamassassin/amavisd-new running in parallel on a pool of two (but possibly more in the future) equally weighted machines. How have you avoided the dreaded Single Point of Failure? I've been experimenting (on a small scale) with an SQL backed bayes db. I can readily have multiple machines talk to single mysql instance, but then I'm stuck trying to make that mysql instance highly available (and I *could* do that on an existing clustered server). I could also have an instance of mysql running on all of the machines, with one master mysql instance replicating to one or more mysql slave instances. I've never set up mysql replication (but it can't be much harder than OpenLDAP replication!). In such an example I'd only enable autolearning on the machine with the master mysql db. I could also ditch the idea of using a mysql backed bayes and simply rsync the bayes db file from the master to the slaves on a regular basis (stopping and starting spamd|amavisd in the process). In such an environment I'd do training only on one master machine and enable autolearning only on that machine. How are other people addressing this issue? Ben
RE: Porno
Okay, SA is working, but what about the routing? Did the message ever pass through SA? Were there SA headers in the original message? Did SA time out because the machine is overloaded and just return the original email to the MTA chain? Granted, SA thinks it's spam, but did SA ever get a chance to see it?

Gary

From: Robert Fitzpatrick [mailto:[EMAIL PROTECTED]
Sent: Wed 2/23/2005 7:04 AM
To: Gary W. Smith
Cc: SpamAssassin
Subject: Re: Porno

> Yes, I got hold of one and it scored 24 points using 'spamassassin -t'.
> It was considered spam, as only 5 points were required, but it ended up
> in my inbox. I am running amavisd-new with Postfix; how do I tell if
> spamassassin is even working?
>
> Gary W. Smith wrote:
> > You might want to collect some additional information from the client,
> > such as the headers. This is where I would start.
> >
> > Gary
> >
> > -----Original Message-----
> > From: Robert Fitzpatrick [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, February 23, 2005 5:54 AM
> > To: SpamAssassin
> > Subject: Porno
> >
> > I have received complaints from two companies since yesterday about
> > messages with porno content getting through. Is this a new variant? And
> > is anyone else getting hit with these, or does anyone know of any
> > updates to filter the messages? I am running SpamAssassin 3.0.1 with
> > Rules Du Jour, but neither seems to pick them up.
RE: Low scoring spam
Well, all I did was run spamd -D /path/to/message

Here is my local.cf. Am I missing something in here?

    user_scores_dsn DBI:mysql:spamassassin:localhost:3306
    user_scores_sql_password *
    user_scores_sql_username *
    user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '$GLOBAL' OR username = CONCAT('%',_DOMAIN_) ORDER BY username ASC
    score ALL_TRUSTED 0
    report_safe 1
    use_bayes 1
    bayes_auto_learn 0
    use_dcc 1
    ok_languages en
    ok_locales en
    use_auto_whitelist 0

If I am missing something that would make it check the headers, please let me know. The command line that runs in the init.d file is:

    -q -x -d -m10 -H -v -u spamuser

Thanks
Robert

-----Original Message-----
From: Matt Kettler [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 23, 2005 8:44 AM
To: [EMAIL PROTECTED]; users@spamassassin.apache.org
Subject: RE: Low scoring spam

> At 10:31 AM 2/23/2005, Robert Bartlett wrote:
> > Do you suggest disabling this until it is resolved? If so, what exactly
> > do I need to disable?
>
> Upon closer inspection, are you sure you fed SA the actual message with
> complete headers? Are you sure that's not the output of
> spamassassin --lint?
>
> It looks like the test message is missing a LOT of headers: no Subject,
> no Date, no From:, no Received headers.
>
> Note that it doesn't look like it failed to parse the Received: headers;
> it looks like they are absent entirely. There's no complaint about an
> unparsable Received header in the debug output, and no mention of even
> trying to parse one.
>
> This part looks very much like --lint:
>
>     debug: all '*From' addrs: [EMAIL PROTECTED]
>
> Also suspicious: MISSING_DATE,MISSING_SUBJECT
How to purge bayes?
How do I purge my bayes_* files? Especially, my bayes_journal is over 250 MB! I like it to re-init with a fresh start. But when I echo -n the files, and restart SA, I get dbase errors. So, how can I easily go about this? Thanks, - Mark
Re: How to purge bayes?
Mark grabbed a keyboard and wrote: How do I purge my bayes_* files? Especially, my bayes_journal is over 250 MB! I like it to re-init with a fresh start. But when I echo -n the files, and restart SA, I get dbase errors. So, how can I easily go about this? When I had to do it some time ago, I just did a rm bayes_* and poof they were gone. Next time something came in, spamd just recreated them. --Dave
RE: How to purge bayes?
-----Original Message-----
From: David Guntner [mailto:[EMAIL PROTECTED]
Sent: Thursday, 24 February 2005 3:02
To: users@spamassassin.apache.org
Subject: Re: How to purge bayes?

> Mark grabbed a keyboard and wrote:
> > How do I purge my bayes_* files? Especially, my bayes_journal is over
> > 250 MB! I like it to re-init with a fresh start. But when I echo -n the
> > files, and restart SA, I get dbase errors. So, how can I easily go
> > about this?
>
> When I had to do it some time ago, I just did a rm bayes_* and poof they
> were gone. Next time something came in, spamd just recreated them.

As I just wrote to someone (who suggested the same): when I do that, however, I get this in my log:

    bayes: no dbs present, cannot scan: /var/db/spamassassin/bayes_toks

Is that OK?

Thanks,

- Mark
RE: How to purge bayes?
Mark wrote:
> -----Original Message-----
> From: David Guntner [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 24 February 2005 3:02
> To: users@spamassassin.apache.org
> Subject: Re: How to purge bayes?
>
> > Mark grabbed a keyboard and wrote:
> > > How do I purge my bayes_* files? Especially, my bayes_journal is over
> > > 250 MB! I like it to re-init with a fresh start. But when I echo -n
> > > the files, and restart SA, I get dbase errors. So, how can I easily
> > > go about this?
> >
> > When I had to do it some time ago, I just did a rm bayes_* and poof
> > they were gone. Next time something came in, spamd just recreated them.
>
> As I just wrote to someone (who suggested the same): when I do that,
> however, I get this in my log:
>
>     bayes: no dbs present, cannot scan: /var/db/spamassassin/bayes_toks
>
> Is that OK?
>
> Thanks,
>
> - Mark

This smells like a sitewide bayes setup with a permissions issue. Check which user ID spamd is running as, and the permissions on /var/db/spamassassin.

LER

--
Larry Rosenman
http://www.lerctr.org/~ler
Phone: +1 972-414-9812
E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Re: How to purge bayes?
Hello Mark,

Wednesday, February 23, 2005, 5:33:43 PM, you wrote:

M> How do I purge my bayes_* files? Especially, my bayes_journal is over 250
M> MB! I like it to re-init with a fresh start. But when I echo -n the
M> files, and restart SA, I get dbase errors. So, how can I easily go about
M> this?

Don't purge/reinit the files, delete them. SA will recreate them when it next tries to check Bayes, and will update them with new auto-learned emails and/or your next sa-learn run.

However, IMO the journal file should be tiny, and most of your Bayes data should be in _seen and _toks. The journal only holds data temporarily until _seen and _toks can be updated, I believe. You may have other problems in your Bayes system that are preventing the journal file from being emptied naturally. Deleting all of the files completely might clear that up.

Bob Menschel
Re: SA 3.01 eventually stops noticing DNSBLs
Jay Levitt wrote:

[SNIP]

> I tried to create a test harness to see if I can replicate this outside
> of SA, but for some reason, even though I double-checked the code I
> copied from Dns.pm, I'm getting weird results - it's always giving me
> the root nameservers, instead of the name servers for each of the
> domains. This is true with recurse => 0, recurse => 1, or recurse left
> out entirely as it is in Dns.pm. I'm no Perl whiz; can anyone see my
> mistake? Code follows:
>
> #!/usr/bin/perl
> no strict; no warnings;
> require Net::DNS;
> require Net::DNS::Resolver;
> use strict; use warnings;
>
> my @EXISTING_DOMAINS = qw{
>     adelphia.net akamai.com apache.org cingular.com colorado.edu
>     comcast.net doubleclick.com ebay.com gmx.net google.com intel.com
>     kernel.org linux.org mit.edu motorola.com msn.com sourceforge.net
>     sun.com w3.org yahoo.com
> };
>
> my $res = Net::DNS::Resolver->new (
>     recurse => 0, retry => 1, retrans => 0, dnsrch => 0, defnames => 0,
>     tcp_timeout => 3, udp_timeout => 3,
>     persistent_tcp => 1, persistent_udp => 1
> );
> die unless defined $res;
>
> for(;;) {
>     my @domains = @EXISTING_DOMAINS;
>     my $domain = splice(@domains, rand(@domains), 1);
>     print "trying '$domain'...\n";
>     lookup_ns($domain);
> }
>
> sub lookup_ns {
>     my ($self, $dom) = @_;

Since you're not using this as a Perl module (OOP), my guess is that $self contains the value you expect to be in $dom, and $dom is undefined. lookup_ns($domain) passes a single argument, so it lands in the first variable of the list. Try removing $self from your argument list so it looks like:

    my ($dom) = @_;

and see if that works for you. Debug statements are your friend. :)

Hope this helps,

alan
Re: How to purge bayes?
Larry Rosenman wrote:
> Mark wrote:
> > -----Original Message-----
> > From: David Guntner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, 24 February 2005 3:02
> > To: users@spamassassin.apache.org
> > Subject: Re: How to purge bayes?
> >
> > > Mark grabbed a keyboard and wrote:
> > > > How do I purge my bayes_* files? Especially, my bayes_journal is
> > > > over 250 MB! I like it to re-init with a fresh start. But when I
> > > > echo -n the files, and restart SA, I get dbase errors. So, how can
> > > > I easily go about this?
> > >
> > > When I had to do it some time ago, I just did a rm bayes_* and poof
> > > they were gone. Next time something came in, spamd just recreated
> > > them.
> >
> > As I just wrote to someone (who suggested the same): when I do that,
> > however, I get this in my log:
> >
> >     bayes: no dbs present, cannot scan: /var/db/spamassassin/bayes_toks
> >
> > Is that OK?
> >
> > Thanks,
> >
> > - Mark
>
> This smells like a sitewide bayes setup with a permissions issue. Check
> which user ID spamd is running as, and the permissions on
> /var/db/spamassassin.

Hi,

It also smells like someone removed the bayes_* files and nothing has been auto-learned yet, hence the warning message.

Regards,

Rick
Re: Low scoring spam
Hello Robert, Not directly related to your problem, I don't think, but from your debug listing I see you're using the following rules files: debug: config: read file ... debug: config: read file /etc/mail/spamassassin/70_sare_random.cf ... debug: config: read file /etc/mail/spamassassin/random.current.cf If I remember correctly, random.current.cf is an ancient name for 70_sare_random.cf -- you may be overlaying current rules with ancient ones. Worth looking into. Bob Menschel
Re[2]: ENC: Wet 30 to 40 girls hrony and wants you
Hello Richard,

Monday, February 21, 2005, 6:09:49 AM, you wrote:

GR> Try these on for size:
GR> header __PORN_WORD01 Subject =~/n(?:ex|xe)t door/i
GR> header __PORN_WORD02 Subject =~/puss(?:y|ies)/i
GR> ...

header __PORN_WORD01 Subject =~ /n(?:ex|xe)t door/i
header __PORN_WORD02 Subject =~ /puss(?:y|ies)/i
header __PORN_WORD04 Subject =~ /(?:needs|for) m(?:one|oen|neo|noe|eno|eon)y/i
header __PORN_WORD05 Subject =~ /h(?:orn|onr|nro|nor|ron|rno)y/i
header __PORN_WORD06 Subject =~ /f(?:ucke|ucek|ukce|ukec|ueck|uekc|cuek|cuke|ckue|ckeu|ceku|ceuk|kuce|kuec|kcue|kceu|kecu|keuc|euck|eukc|ecuk|ecku|ekcu|ekuc)d/i
header PORN_WORD08 Subject =~ /\bMILF\b/i
header PORN_WORD09 Subject =~ /w(?:hor|hro|roh|rho|ohr|orh)e/i
header PORN_WORD20 Subject =~ /w(?:hore|hoer|hroe|hreo|heor|hero|ohre|oher|orhe|oreh|oerh|oehr|rhoe|rhep|roeh|rohe|reho|reoh|ehro|ehor|eorh|eohr|erho|eroh)s/i
header PORN_WORD10 Subject =~ /(?:hstoett|o(?:the|teh|het|hte|eht|eth)r|stpuid|stupid|disgusting|shy|married|brand new|dirty|average|amateur|amatuer|amtauer|real|beautiful|hot|sexy|sxey|n(?:ast|ats|tas|tsa|sta|sat)y|wet|cute).{1,3}(?:(?:step|grand)?[\-_]?(?:mo|om)ms?|house[\-_]?wi[fvr]es?|(?:cow)?girls?|moms?|w(?:om[ae]|o[ae]m|[ae]om|[ae]mo|m[ae]o|mo[ae])n|neigbhour|neighbour|neighbuor|(?:teen|tnee)(?:ager|agre|arge)?s?|s(?:lu|ul)ts?|bitehcs|bitches)/i
header __PORN_WORD11 Subject =~ /\bcum(?:shot)?\b/i
#error: header __PORN_WORD12 Subject =~ /(?:d(?:ic|ci)k|c(?:|oc|co)k/i
header __PORN_WORD12 Subject =~ /(?:d(?:ic|ci)k|c(?:|oc|co)k)/i
header __PORN_WORD13 Subject =~ /fucking/i
header __PORN_WORD14 Subject =~ /up[\-_]c(?:los|lso|sol|slo|ols|osl)e/i
header __PORN_WORD15 Subject =~ /snatch/i
header __PORN_WORD16 Subject =~ /(?:pervert|peervrt|prevert|perevrt)/i

No ham hits for these:

#counts __PORN_WORD01 7s/0h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD05 57s/0h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts PORN_WORD08 19s/0h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD11 914s/0h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD16 2s/0h of 197615 corpus (96830s/100785h RM) 02/22/05

Adequate S/O for these on my system:

#counts __PORN_WORD02 82s/1h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD06 53s/1h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts PORN_WORD10 139s/5h of 197615 corpus (96830s/100785h RM) 02/22/05

These don't work here:

#counts __PORN_WORD04 0s/1h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts PORN_WORD09 26s/23h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts PORN_WORD20 13s/4h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD12 4543s/4626h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD13 18s/2h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD14 2s/1h of 197615 corpus (96830s/100785h RM) 02/22/05
#counts __PORN_WORD15 4s/1h of 197615 corpus (96830s/100785h RM) 02/22/05

There's a fair amount of overlap with current SARE rules, which I haven't tested yet, but some of these should be worth adding to the SARE rule set if we can have your permission to do so.

Bob Menschel
RE: How to purge bayes?
-----Original Message-----
From: Robert Menschel [mailto:[EMAIL PROTECTED]
Sent: Thursday, 24 February 2005 3:28
To: Mark
Cc: users@spamassassin.apache.org
Subject: Re: How to purge bayes?

> Don't purge/reinit the files, delete them. SA will recreate them when it
> tries to check Bayes next, and will update them with new auto-learned
> emails and/or your next sa-learn.

Thanks for all your suggestions. I think it was the auto-learn thing.

> However, IMO the journal file should be tiny, and most of your Bayes
> data should be in _seen and _toks.

SA has now re-created everything; only, now I have no bayes_journal at all any more:

    drwx------  2 spamd  spamd    512 Feb 24 08:03 .
    drwxr-xr-x  8 root   wheel    512 Feb 24 08:11 ..
    -rw-------  1 spamd  spamd     17 Feb 24 08:03 bayes_msgcount
    -rw-------  1 spamd  spamd  49152 Feb 24 08:03 bayes_seen
    -rw-------  1 spamd  spamd  49152 Feb 24 08:03 bayes_toks
    -rw-------  1 spamd  spamd   1218 Feb 24 02:29 user_prefs

Maybe it will just write it out later?

- Mark
Re: highly available sitewide bayes, local db vs. sql
Ben Poliakoff wrote:

Hi Ben,

> What sort of experiences have people had managing a sitewide bayes db
> that is used by spamassassin (spamd|amavisd) instances on multiple
> machines? I've got an environment with spamassassin/amavisd-new running
> in parallel on a pool of two (but possibly more in the future) equally
> weighted machines. How have you avoided the dreaded Single Point of
> Failure?

We run two load-balanced servers with SA here. Each machine has its own local Bayes and AWL DBs (no SPoF). Given the amount of incoming traffic (100k msgs/server/workday), we are statistically sure that both servers see the same (spam) messages. We have not noticed any difference in effectiveness between the two instances in over 12 months.

Having two DBs also has one advantage: if Bayes on one machine gets corrupted (wrong training, ...), you can restore it from the twin server with a simple FTP transfer. We have done this at least once. What does need to be done periodically is purging or resetting the AWL DB, since it keeps growing and growing.

We were considering a MySQL DB on a third machine (with failover to the other two), but the loss of Bayes history is not such a big issue, IMHO. A nightly backup is probably enough, as long as you have another machine on which to restore the DB within a few hours of a failure. In any case, a good ham/spam collection will re-train your Bayesian filter in a matter of minutes. Our third machine will probably run a local mirror of SURBL instead!

HTH,
Paolo
Re: SA 3.01 eventually stops noticing DNSBLs
Jay Levitt wrote:
> A quick test shows that indeed, an awful lot of domains are repeatedly
> failing in lookup_ns, but that different domains fail at different
> times - the domains that repeatedly fail right now were fine last night
> in the SA logs. So it looks like this is something (intermittent) to do
> with the resolver on my system, or perhaps the caching nameserver, and
> nothing to do with SA. I'll keep digging and report back what I find.
> If anyone has any tips, of course, feel free to let me know.

I spoke too soon. Turns out I'd accidentally left recurse => 0 in the test harness. No wonder it was failing so often.

I discovered Net::DNS::Resolver::errorstring and put some more logging into SA, and the problem is really simple: my caching-only nameserver times out when looking up NS records for a site that's not in the cache. That's not entirely surprising with a 3-second timeout in SA. And my site is very small (just me), so it's going to be fairly common that one of the well-known sites is not in the cache.

SA realizes this and tries to loop in Dns.pm's is_dns_available, but the loop is coded wrong, because either a success or a failure breaks out of the loop! A timeout in lookup_ns results in $result being defined but containing no records, and that triggers the "failed horribly" clause, setting $IS_DNS_AVAILABLE to zero until mimedefang eventually cycles the child process.

I *think* the bug fix is just to remove that whole else clause from is_dns_available, but as a Perl novice I'd certainly like someone to double-check that.

And, you know, now that I look at it, it seems like is_dns_available uses lookup_ns to test general DNS availability, but lookup_ns has its own caching, which would seem to defeat the point of the test if a site is ever hit twice!

Jay
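The loop behaviour Jay describes can be illustrated outside of Perl. What follows is a minimal Python sketch, not SpamAssassin's actual Dns.pm code: `dns_is_available` and `fake_lookup` are hypothetical names, and the lookup function is assumed to return a list of NS records that is empty on a timeout. The point it shows is that only a successful lookup should end the loop, while an empty result should move on to the next test domain.

```python
from typing import Callable, Iterable, List


def dns_is_available(
    domains: Iterable[str],
    lookup_ns: Callable[[str], List[str]],
) -> bool:
    """Return True as soon as any test domain yields NS records.

    An empty result (for example a timeout on an uncached domain) is
    treated as inconclusive, so the loop keeps trying other domains
    instead of declaring DNS unavailable on the first failure.
    """
    for domain in domains:
        records = lookup_ns(domain)
        if records:
            return True   # success: stop testing
        # failure: fall through and try the next domain
    return False          # every domain failed


# Hypothetical resolver for illustration: only kernel.org "answers".
def fake_lookup(domain: str) -> List[str]:
    return ["ns1.example.net"] if domain == "kernel.org" else []


available = dns_is_available(
    ["adelphia.net", "akamai.com", "kernel.org"], fake_lookup
)
```

With this shape, two timeouts followed by one answer still count as "DNS is available", which is the behaviour the thread argues for.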
Re: How to purge bayes?
On Thursday, 24 February 2005 08:23, Mark wrote:
> -----Original Message-----
> From: Robert Menschel [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 24 February 2005 3:28
> To: Mark
> Cc: users@spamassassin.apache.org
> Subject: Re: How to purge bayes?
>
> > Don't purge/reinit the files, delete them. SA will recreate them when
> > it tries to check Bayes next, and will update them with new
> > auto-learned emails and/or your next sa-learn.
>
> Thanks for all your suggestions. I think it was the auto-learn thing.
>
> > However, IMO the journal file should be tiny, and most of your Bayes
> > data should be in _seen and _toks.
>
> SA has now re-created everything; only, now I have no bayes_journal at
> all any more:
>
>     drwx------  2 spamd  spamd    512 Feb 24 08:03 .
>     drwxr-xr-x  8 root   wheel    512 Feb 24 08:11 ..
>     -rw-------  1 spamd  spamd     17 Feb 24 08:03 bayes_msgcount
>     -rw-------  1 spamd  spamd  49152 Feb 24 08:03 bayes_seen
>     -rw-------  1 spamd  spamd  49152 Feb 24 08:03 bayes_toks
>     -rw-------  1 spamd  spamd   1218 Feb 24 02:29 user_prefs
>
> Maybe it will just write it out later?
>
> - Mark

Why didn't you try

    sa-learn --force-expire

That should have reduced your journal and bayes_toks. Now you must train your bayes again with ham and spam.

Thomas

--
icq:133073900
http://www.t-arend.de
spamd setup
I have two mailhubs running exim (w/ exiscan) + SA + others.

I want to let them use each other's spamd if the load gets too great on any one machine.

mailhub1
    spamd:   /usr/local/bin/spamd -d -r /logs/spamd.pid -m 7 \
             -i -A 127.0.0.1,mailhub2IP
    exiscan: spamd_address = 127.0.0.1 783 : mailhub2IP 783

mailhub2
    spamd:   /usr/local/bin/spamd -d -r /logs/spamd.pid -m 7 \
             -i -A 127.0.0.1,mailhub1IP
    exiscan: spamd_address = 127.0.0.1 783 : mailhub1IP 783

Is this the right config? I ask because currently I don't see any activity from either hub in the other's log.
Character Sets in Subject and To/From
Hello,

I get lots of messages with subjects of the form:

    Subject: =?utf-8?q?Wholesale Rolex Watc?= =?utf-8?q?hes?=

Mail addresses use this type of obfuscation as well.

My question: how are these character set changes handled by SpamAssassin rules and Bayesian filtering?

Best regards,
Thomas Arend

--
icq:133073900
http://www.t-arend.de
Header Check
Can anyone tell me what this header check means:

    X-BLTSYMAVREINSERT

I have someone sending a PowerPoint presentation via email to our CIO, but it's getting blocked with this in the header. What does it represent?

Thanks in advance,
Jason
Re: Header Check
At 10:40 AM 2/24/2005, Jason Bennett wrote:
> Can anyone tell me what this header check means: X-BLTSYMAVREINSERT
> I have someone sending a power point presentation via email to our CIO
> but its getting blocked with this in the header. What does it represent?
> Thanks in advance

I take it you're using an outdated version of the bogus-virus-warnings.cf file. That rule was disabled back in August due to FP problems. You might want to get a fresh copy:

http://www.timj.co.uk/linux/bogus-virus-warnings.cf
RE: Low scoring spam
At 08:06 PM 2/23/2005, Robert Bartlett wrote: Well all I did was run spamd -D /path/to/message Is that a typo of spamc, or did you really try to feed a message to spamd?
RE: Low scoring spam
At 08:06 PM 2/23/2005, Robert Bartlett wrote:
> Well all I did was run spamd -D /path/to/message

Wait, even if it is a typo, it still won't work. You need to redirect the message into spamc; you can't pass it a filename. And spamc doesn't take a -D parameter, only spamd does, but spamd does not accept message input like that.

Try this instead:

    spamassassin -D < /path/to/message

*or* add -D to your spamd startup script, restart spamd and use

    spamc < /path/to/message
RE: Low scoring spam
spamd is aliased to spamassassin. I forgot to include that part in my email, but that is what I did.

Robert

At 08:06 PM 2/23/2005, Robert Bartlett wrote:
> > Well all I did was run spamd -D /path/to/message
>
> Wait, even if it is a typo, it still won't work. You need to redirect
> the message into spamc; you can't pass it a filename. And spamc doesn't
> take a -D parameter, only spamd does, but spamd does not accept message
> input like that.
>
> Try this instead:
>
>     spamassassin -D < /path/to/message
>
> *or* add -D to your spamd startup script, restart spamd and use
>
>     spamc < /path/to/message
RE: Low scoring spam
At 11:59 AM 2/24/2005, [EMAIL PROTECTED] wrote: spamd is aliases to spamassassin. Ok... May I ask why?
RE: mail scored 6.5 points lead to autolearn=spam
At 01:59 AM 2/19/2005, Philipp Snizek wrote:
> I had sorbs in airmax.cf and frenchrules.cf and have uncommented these
> rules. Why is sorbs a no-show? Is something wrong with sorbs?

No, sorbs is still in SA; there's just no RCVD_IN_SORBS rule anymore.

That rule used to show up if any sorbs list matched any IP in the email, including sorbs DUL for legitimately relayed mail. Its primary purpose was not to assign points, but to trigger a DNS query that could be used by the other sorbs rules.

To prevent further questions about why RCVD_IN_SORBS misfired on properly relayed dialup mail, it got renamed to __RCVD_IN_SORBS. This results in it running with a score of 0 and not showing up in the hits list. This way only the specific sub-lists of sorbs show up with a score.

> Whats the directive to change the score set?

There is no directive to change score sets. The score set is picked based on which features are enabled. Since the AWL learns on a no-bayes basis, that automatically causes a shift in the score set used.

The four score sets in SA are:

    Set 0 - used if neither network nor bayes tests are enabled
    Set 1 - used if network tests are enabled, but bayes is not
    Set 2 - used if bayes tests are enabled, but network tests are not
    Set 3 - used if both bayes and network tests are enabled
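The score set table above behaves like a two-bit index, which may be easier to remember. The following is a minimal Python sketch of that mapping only; `select_score_set` is a hypothetical helper written for illustration and is not a SpamAssassin function.

```python
def select_score_set(network_tests: bool, bayes: bool) -> int:
    """Map the two feature switches onto score set numbers 0 to 3.

    Bayes contributes 2 and network tests contribute 1, which
    reproduces the four cases listed in the message above.
    """
    return (2 if bayes else 0) + (1 if network_tests else 0)


# All four combinations, keyed by (network_tests, bayes).
score_sets = {
    (net, bayes): select_score_set(net, bayes)
    for net in (False, True)
    for bayes in (False, True)
}
```

Read this way, there is nothing to configure directly: turning Bayes or network tests on or off is what moves a site from one set to another.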
Re: Character Sets in Subject and To/From
At 05:42 AM 2/24/2005, Thomas Arend wrote:
> I got lots of messages with subjects of the form:
>
>     Subject: =?utf-8?q?Wholesale Rolex Watc?= =?utf-8?q?hes?=
>
> Also mail Addresses use this type of obfuscation.
>
> My Question: How are thes character set changes handled by SpamAssassin
> rules and bayesian filtering.

Normal rules and bayes see them after they've been decoded. So as far as 90% of SA is concerned, the character set changes aren't there.

Rules that specifically want to detect this stuff can do so by using the :raw modifier, e.g.:

    header LOCAL_ENCSUBJECT Subject:raw =~ /\=\?.*\?\=/i

This matches subject lines like:

    Subject: =?iso-8859-8?Q?=F2=EC_=E7=EB=EE=FA_=E4=E9=EC=E3?=
Re: spamassassin cant detect spam
Hi,

Thanks for your help. Actually, I want to set up a mail server that is used only for outgoing mail, so I can't check it from outside the network. That is why I want to test my SpamAssassin, but the test fails: I sent a mail from another email address with content taken from GTUBE, but the score is only 1.08, whereas a score of 5 is needed for spam. How can I confirm that my SpamAssassin is working? Please help me.

Thanks,
Meshbah

--- Matt Kettler [EMAIL PROTECTED] wrote:
> At 01:50 PM 2/23/2005, Meshbah Uddin Ahmed wrote:
> > i m using Debian + Exim + MailScanner + ClamAV + SpamAssassin. All r
> > successfully installed But spamassassin cant detect spam. I use
> > required_hits 5 in mailscanner.conf. For ur inf, i enable Auto
> > Whitelist SpamAssassin = yes. when i tried to send any mail, score is
> > 1 or 2. Not goes to 5.
>
> Well, are you a spammer? Why would you expect email you sent to score
> above 5?
>
> If you want to test SA, send yourself a GTUBE string. However, be
> careful about what address you use as a From: address. Some older
> versions of SA included GTUBE results in the AWL, so whatever From:
> address you use will likely be blacklisted.
>
> http://spamassassin.apache.org/gtube/
RE: spamassassin cant detect spam
Meshbah Uddin Ahmed wrote: Hi, thanks for ur help. actually i want to setup a mail server which is only for use outgoing mail. so, i cant chk it from outside of network as it is used for outgoing. so that i want to chk my spamassassin. but failure. i send a mail by using another email addr which content gets from GTUBE. but score is only 1.08. whereas for spam need to be 5 score. Try turning off Auto Whitelist Matthew.van.Eerde (at) hbinc.com 805.964.4554 x902 Hispanic Business Inc./HireDiversity.com Software Engineer perl -emap{y/a-z/l-za-k/;print}shift Jjhi pcdiwtg Ptga wprztg,
RE: spamassassin cant detect spam
At 02:13 PM 2/24/2005, [EMAIL PROTECTED] wrote: i send a mail by using another email addr which content gets from GTUBE. but score is only 1.08. whereas for spam need to be 5 score. Try turning off Auto Whitelist Even the AWL can't be the problem here.. The AWL has a factor of 0.5. Any message matching GTUBE should still score several hundred points. (ie: if the AWL average is 0, a message matching GTUBE still gets 500 points).
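The arithmetic behind that figure can be written out. This is a minimal Python sketch under two assumptions that go beyond the message: that the AWL moves a message's score toward the sender's historical mean by the configured factor, and that the GTUBE rule alone scores 1000 points. `awl_adjusted_score` is a hypothetical helper, not SpamAssassin's own code.

```python
def awl_adjusted_score(score: float, awl_mean: float, factor: float = 0.5) -> float:
    """Pull a message's score toward the sender's AWL mean.

    With factor 0.5 the result lands halfway between the raw score
    and the mean; with factor 0 the AWL has no effect at all.
    """
    return score + (awl_mean - score) * factor


# Assumed raw score for a GTUBE hit, and a sender whose AWL mean is 0.
gtube_score = 1000.0
adjusted = awl_adjusted_score(gtube_score, awl_mean=0.0)
```

Under those assumptions the adjusted score is 500, far above a threshold of 5, so a result of 1.08 suggests the GTUBE string never reached SpamAssassin intact.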
Re: Character Sets in Subject and To/From
On Thursday, 24 February 2005 19:12, Matt Kettler wrote:
> At 05:42 AM 2/24/2005, Thomas Arend wrote:
> > I got lots of messages with subjects of the form:
> >
> >     Subject: =?utf-8?q?Wholesale Rolex Watc?= =?utf-8?q?hes?=
> >
> > Also mail Addresses use this type of obfuscation.
> >
> > My Question: How are thes character set changes handled by
> > SpamAssassin rules and bayesian filtering.
>
> Normal rules and bayes see them after they've been decoded. So as far as
> 90% of SA is concerned, the character set changes aren't there.
>
> Rules that specifically want to detect this stuff can do so by using
> the :raw modifier, e.g.:
>
>     header LOCAL_ENCSUBJECT Subject:raw =~ /\=\?.*\?\=/i
>
> This matches subject lines like:
>
>     Subject: =?iso-8859-8?Q?=F2=EC_=E7=EB=EE=FA_=E4=E9=EC=E3?=

If I understand you correctly, my rolex rule is spoiled by this trick, because

    header LOCAL_ENCSUBJECT Subject: =~ /rolex/i

will not fire on these subjects.

Thomas

--
icq:133073900
http://www.t-arend.de
Re: Character Sets in Subject and To/From
At 02:33 PM 2/24/2005, Thomas Arend wrote: When I understand you right my rolex rule is spoiled by this trick. Because header LOCAL_ENCSUBJECT Subject: =~ /rolex/i will not fire on these subjects. You misunderstood me completely. That rule should fire on those subject lines just fine. SA will automatically decode the character sets and then feed the decoded text to your rule. You don't need to take any extra action to try to detect encoded text. SA handles this for you by default. SA always decodes unless you change it to Subject:raw instead of Subject:.
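The difference between the decoded and the raw view of a header can be tried outside SpamAssassin. The following minimal Python sketch uses the standard library's `email.header` module as a stand-in for SA's own decoder, which is an assumption made for illustration. The sample subject is adapted from the thread, with underscores in place of the spaces inside the encoded words, since that is how Q encoding represents a space.

```python
import re
from email.header import decode_header, make_header

# Raw header value as it appears on the wire (adapted from the thread).
raw_subject = "=?utf-8?q?Wholesale_Rolex_Watc?= =?utf-8?q?hes?="

# Decoded view: roughly what an ordinary "Subject =~" rule would see.
decoded_subject = str(make_header(decode_header(raw_subject)))

# An ordinary word rule applied to the decoded text.
matches_decoded = re.search(r"rolex", decoded_subject, re.IGNORECASE) is not None

# The encoded-word detector only makes sense against the raw value.
matches_raw_encoding = re.search(r"=\?.*\?=", raw_subject) is not None
decoded_has_encoding = re.search(r"=\?.*\?=", decoded_subject) is not None
```

The word rule fires on the decoded text, while the encoded-word pattern fires only on the raw value, which mirrors the split between `Subject` and `Subject:raw` described above.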
MEDIA: It pays to read the EULA!!!
http://www.pcpitstop.com/spycheck/eula.asp When Doug Heckman was installing a PC Pitstop program, he actually read the EULA. In it, he found a clause stating that he could get financial compensation if he e-mailed PC Pitstop. The result: a $1,000 check, and proof that people don't read EULAs (3,000 people before him didn't notice it). The goal of this was to prove that one should read all EULAs, so that one can see if an app is spyware if it is buried in the EULA. UmmeNarf? --Chris
Millions and Billions
I've been seeing a ton of stock spam this week, no URLs - no SURBL :( Bayes and Razor, etc pick up on it eventually but to speed things up, I wrote a rule. One thing that is unique about these messages is that they replace l's with |'s. They usually will have some variation on Mil|ions or Bi|lions. Here is the rule I came up with: body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i Feel free to use it, make suggestions or point out that I wasted my time writing a rule already available from SARE. Stuart Johnston
Any tools to gauge bayes accuracy?
Before I actually write this, I'll ask to see if someone has already done it.

On my IMAP server, I've got two different trash folders, one for ham and one for spam. Nothing new there. However, on the hour, I've got a script that runs sa-learn on them and records three things for each message:

- The overall spam score
- The BAYES_XX number
- Whether the user marked it as spam or ham

Originally, I was using this to fine-tune my spam threshold. However, since I've been building my bayes db for over a year now, it has become very accurate. What I want now is a script that can:

A) Find some optimum spam threshold based on the FP or FN rate. (I've already got that.)

B) Compare this with the BAYES_XX values for the various spams/hams and, if the Bayes values have a higher correlation with what the *user* considers spam/ham, suggest different scoring values for the BAYES_XX hits.

In other words, I want a script that doesn't just auto-tune a user's spam threshold, but also tunes the bayes scoring as the bayes db gets better and better.

Has anybody done something like this?

- Joe
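As a starting point for step B, one could tally how often each BAYES_XX bucket agrees with the user's verdict. The following is a minimal Python sketch under stated assumptions: the record format (bucket name, user-marked-spam flag) and the sample data are invented for illustration and do not come from Joe's script, and `spam_fraction_by_bucket` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def spam_fraction_by_bucket(
    records: Iterable[Tuple[str, bool]],
) -> Dict[str, float]:
    """For each BAYES_XX bucket, return the share of messages users marked as spam."""
    totals: Dict[str, int] = defaultdict(int)
    spam: Dict[str, int] = defaultdict(int)
    for bucket, user_marked_spam in records:
        totals[bucket] += 1
        if user_marked_spam:
            spam[bucket] += 1
    return {bucket: spam[bucket] / totals[bucket] for bucket in totals}


# Invented sample records: (BAYES bucket, did the user call it spam?)
sample = [
    ("BAYES_99", True),
    ("BAYES_99", True),
    ("BAYES_00", False),
    ("BAYES_50", True),
    ("BAYES_50", False),
]
fractions = spam_fraction_by_bucket(sample)
```

A bucket whose spam fraction stays very close to 1.0 or 0.0 over a large sample would be a candidate for a stronger score; how to turn the fractions into actual score values is left open here.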
RE: Millions and Billions
Stuart Johnston wrote:
body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i
Feel free to use it, make suggestions or point out that I wasted my time writing a rule already available from SARE.
Stuart Johnston

How about (slightly easier to read)

body L_MILLBILL /[mb]i[l|][l|]ions?/i

or even

body L_MILLBILL /[mb]i[l|]{2}ions?/i

Matthew.van.Eerde (at) hbinc.com 805.964.4554 x902
Hispanic Business Inc./HireDiversity.com Software Engineer
perl -emap{y/a-z/l-za-k/;print}shift Jjhi pcdiwtg Ptga wprztg,
Re: Millions and Billions
[EMAIL PROTECTED] wrote:
Stuart Johnston wrote:
body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i
Feel free to use it, make suggestions or point out that I wasted my time writing a rule already available from SARE.
Stuart Johnston

How about (slightly easier to read)
body L_MILLBILL /[mb]i[l|][l|]ions?/i
or even
body L_MILLBILL /[mb]i[l|]{2}ions?/i

I started with something similar to that but it will also match millions which we don't want.

Stuart Johnston
RE: Millions and Billions
Stuart Johnston wrote:
[EMAIL PROTECTED] wrote:
Stuart Johnston wrote:
body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i
body L_MILLBILL /[mb]i[l|][l|]ions?/i
I started with something similar to that but it will also match millions which we don't want.

Touché! OK, how about

body L_MILLBILL /[mb]il?\|+l?ions?/i

Also catches mi|ions, mil||ions

Matthew.van.Eerde (at) hbinc.com 805.964.4554 x902
Hispanic Business Inc./HireDiversity.com Software Engineer
perl -emap{y/a-z/l-za-k/;print}shift Jjhi pcdiwtg Ptga wprztg,
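To make the difference between the two proposals concrete, the following sketch compares the simplified [l|]{2} pattern with the revised l?\|+l? pattern. It uses Python's re module in place of Perl's engine (an assumption that holds for patterns this simple), and the sample strings are illustrative only.

```python
import re

# The "easier to read" version: any two characters out of l and |.
simplified = re.compile(r"[mb]i[l|]{2}ions?", re.IGNORECASE)
# The revised version: at least one |, optionally flanked by a real l.
revised = re.compile(r"[mb]il?\|+l?ions?", re.IGNORECASE)

def matches(rx, text):
    """True if the compiled pattern is found anywhere in text."""
    return rx.search(text) is not None

samples = ["millions", "mil|ions", "mi|lions", "mi|ions", "mil||ions"]

# Map each sample to (simplified result, revised result).
table = {s: (matches(simplified, s), matches(revised, s)) for s in samples}

for text, (simple_hit, revised_hit) in table.items():
    print(f"{text:10} simplified={simple_hit!s:5} revised={revised_hit}")
```

Requiring \|+ is what keeps the plain spelling out: without at least one pipe character the revised pattern cannot match.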
Re: Millions and Billions
On Thu, 24 Feb 2005, Stuart Johnston wrote:

[EMAIL PROTECTED] wrote:
[snip..]
How about (slightly easier to read)
body L_MILLBILL /[mb]i[l|][l|]ions?/i
or even
body L_MILLBILL /[mb]i[l|]{2}ions?/i

I started with something similar to that but it will also match millions which we don't want.

Stuart Johnston

Use negative lookahead to prevent matching on the unobfuscated version:

body L_MILLBILL /\b(?!millions?)[mb]i[l|]{2}ions?/i

The '(?!pattern)' says do-NOT match on this pattern.

--
Dave Funk                                  University of Iowa
dbfunk (at) engineering.uiowa.edu          College of Engineering
319/335-5751 FAX: 319/384-0549             1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
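The lookahead rule can be checked the same way. The sketch below runs it through Python's re module as a stand-in for Perl. One thing it shows is that the lookahead as posted names only "millions", so a plain "billions" still passes it. The second pattern is a hypothetical variant, not something proposed in the thread, that excludes both plain spellings.

```python
import re

# The rule as posted: the negative lookahead rejects only plain "million(s)".
posted = re.compile(r"\b(?!millions?)[mb]i[l|]{2}ions?", re.IGNORECASE)

# Hypothetical variant (an assumption, not from the thread): reject both
# plain "million(s)" and plain "billion(s)".
variant = re.compile(r"\b(?![mb]illions?)[mb]i[l|]{2}ions?", re.IGNORECASE)

samples = ["millions", "billions", "mil|ions", "bi|lions", "Mi||ions"]

posted_hits = {s: bool(posted.search(s)) for s in samples}
variant_hits = {s: bool(variant.search(s)) for s in samples}

for text in samples:
    print(f"{text:10} posted={posted_hits[text]!s:5} variant={variant_hits[text]}")
```

The lookahead consumes no characters. It only vetoes a match attempt at a position where the plain word starts, so obfuscated spellings at the same position still go through.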
Re: Millions and Billions
Hi,

[EMAIL PROTECTED] wrote:
Stuart Johnston wrote:
[EMAIL PROTECTED] wrote:
Stuart Johnston wrote:
body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i
body L_MILLBILL /[mb]i[l|][l|]ions?/i
I started with something similar to that but it will also match millions which we don't want.
Touché! OK, how about
body L_MILLBILL /[mb]il?\|+l?ions?/i
Also catches mi|ions, mil||ions

Not to get super fancy or anything, but try this (with negative lookahead):

body LOCAL_OBFU_ONLY_MLLNS /(?!\bmillions\b)(?:\bm|\Brn|\/V\\|\/\\\/\\|\xCE\x9C|\xD0\x9C|\xD0\xBC)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[l1I\|\xA3]|(?:\xC5[\x80-\x82]|\xC4[\xB9-\xBF]))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[l1I\|\xA3]|(?:\xC5[\x80-\x82]|\xC4[\xB9-\xBF]))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[o0\*\xB0\xBA\xD8\xF8\xD2-\xD6\xF2-\xF6]|\(\)|\[\]|\xC5[\x8C-\x91]|\xC6[\xA0-\xA1]|\xC7[\x91-\x92]|\xC7[\xBE-\xBF]|\xCE\x8C|\xCE\x98|\xCE\x9F|\xCE\xB8|\xCE\xBF|\xCF\x8C|\xD0\x9E|\xD0\xBE|\xD5\x95)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[n\xD1\xF1]|\|\\\||\xC5[\x83-\x8B]|\xCE\x9D|\xCE\xA0|\xCE\xAE|\xCE\xB7|\xD5\xB2|\xD5\xB8)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[s5]\b|[\$\xA7]|\xC5[\x9A-\xA1]|\xD0\x85|\xD1\x95|\xD5\x8F\B)/i

body LOCAL_OBFU_ONLY_BLLNS /(?!\bbillions\b)(?:\b[b8]|\B[\xDF]|\xCE\x92|\xCE\xB2|\xD0\x92|\xD0\xB2)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[l1I\|\xA3]|(?:\xC5[\x80-\x82]|\xC4[\xB9-\xBF]))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[l1I\|\xA3]|(?:\xC5[\x80-\x82]|\xC4[\xB9-\xBF]))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[o0\*\xB0\xBA\xD8\xF8\xD2-\xD6\xF2-\xF6]|\(\)|\[\]|\xC5[\x8C-\x91]|\xC6[\xA0-\xA1]|\xC7[\x91-\x92]|\xC7[\xBE-\xBF]|\xCE\x8C|\xCE\x98|\xCE\x9F|\xCE\xB8|\xCE\xBF|\xCF\x8C|\xD0\x9E|\xD0\xBE|\xD5\x95)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[n\xD1\xF1]|\|\\\||\xC5[\x83-\x8B]|\xCE\x9D|\xCE\xA0|\xCE\xAE|\xCE\xB7|\xD5\xB2|\xD5\xB8)[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[s5]\b|[\$\xA7]|\xC5[\x9A-\xA1]|\xD0\x85|\xD1\x95|\xD5\x8F\B)/i
Re: Millions and Billions
On Thursday 24 February 2005 05:42 pm, [EMAIL PROTECTED] wrote:

Stuart Johnston wrote:
[EMAIL PROTECTED] wrote:
Stuart Johnston wrote:
body L_MILLBILL /[mb]i(?:\|l|l\||\|\|)ions?/i
body L_MILLBILL /[mb]i[l|][l|]ions?/i
I started with something similar to that but it will also match millions which we don't want.
Touché! OK, how about
body L_MILLBILL /[mb]il?\|+l?ions?/i
Also catches mi|ions, mil||ions

Over a period of time, here are some of the character groups that I've seen substituted. Feel free to use them wherever.

i or l = [|ííiil1]
a = [EMAIL PROTECTED]
e = [eé3]
o = [o0]

And don't forget that a .? means either one character or no character is ignored, so [EMAIL PROTECTED] matches

prozac
p r o z a c
[EMAIL PROTECTED]
pmrwomawzmawc (not that you'd want to...)
etc.

--
The significant problems we face cannot be solved at the same level of thinking we were at when we created them
Albert Einstein
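As an illustration of the .? idea, here is a small sketch using Python's re module in place of Perl. The poster's actual pattern and the "a" character group were lost to address obfuscation in the archive, so the pattern below is a reconstruction under assumptions. In particular the [a@] class is a guess, and the sample strings are made up.

```python
import re

# Assumed reconstruction: each letter (or its substitution group) is followed
# by .? so that at most one filler character is tolerated between letters.
# [o0] comes from the post; [a@] is an assumption.
gappy_prozac = re.compile(r"p.?r.?[o0].?z.?[a@].?c", re.IGNORECASE)

samples = {
    "prozac": "no filler at all",
    "p r o z a c": "one space between letters",
    "p-r-0-z-@-c": "one filler plus substituted characters",
    "p  r  o  z  a  c": "two spaces between letters",
}

results = {text: bool(gappy_prozac.search(text)) for text in samples}

for text, note in samples.items():
    print(f"{text!r:22} {'HIT' if results[text] else 'miss':5} ({note})")
```

Because .? accepts any single character, a pattern like this is very permissive and may also hit unrelated text, so it is worth checking against a ham corpus before giving it a real score.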