Bayes DB version issue 3.1.3 = 3.1.4
Hello,

I can't remember smoking crack when copying the config files over, but anything's possible. I built out a new machine today and installed SA. We installed the same list of CPAN modules as on the 3.1.3 servers. I copied everything in /etc/mail/spamassassin from our production servers to the test server, and after starting spamd we receive errors. I have checked, and the MySQL data instance is accessible from this server. Several rules are also reported as errors. I know that someone has asked this question already, but I didn't find the answer in the thread archive.

Here are the contents of the log file:

Aug 7 21:45:59 labtest01c spamd[2693]: config: score: the non-numeric score (.8) is not valid, a numeric score is required
Aug 7 21:45:59 labtest01c spamd[2693]: config: SpamAssassin failed to parse line, SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 is not valid for score, skipping: score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677
Aug 7 21:46:01 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.
Aug 7 21:46:03 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'RAZOR2_CHECK'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DRUGS_ERECTILE has undefined dependency '__DRUGS_ERECTILE7'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SUB_ACCEPT_CCARDS has undefined dependency '__SARE_SUB_FROM_PAYPAL'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SPEC_PROLEO_M2a has dependency 'MIME_QP_LONG_LINE' with a zero score
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_XMAIL_SUSP2'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_HEAD_XAUTH_WARN'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has dependency 'X_AUTH_WARN_FAKED' with a zero score
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_MKSHRT'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_GT'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_TINY'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG50'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG55'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG65'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG75'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test VIRUS_WARNING_DOOM_BNC has undefined dependency 'VIRUS_WARNING_MYDOOM4'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test SARE_OBFU_CIALIS has undefined dependency 'SARE_OBFU_CIALIS2'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test FP_MIXED_PORN3 has undefined dependency 'FP_PENETRATION'
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server started on port 783/tcp (running version 3.1.4)
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server pid: 2693
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server successfully spawned child process, pid 2700
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server successfully spawned child process, pid 2701
Aug 7 21:46:07 labtest01c spamd[2693]: prefork: child states: II

Any help would be greatly appreciated.

Gary Wayne Smith
Re: Bayes DB version issue 3.1.3 = 3.1.4
On 8/8/2006 3:29 AM, Gary W. Smith wrote:

Hello,

I can’t remember smoking crack when copying the config files over but anything’s possible. I built out a new machine today and installed SA. We have a list of CPAN modules that were installed (same list as from the 3.1.3 servers). I copied everything in the /etc/mail/spamassassin from our productions servers to the test server and after starting we receive errors. I have checked and the MySQL data instance is accessible from this server. There are also several rules that are errors as well. I know that someone has asked this question already but I didn’t find the answer in the thread archive.

Here are the contents of the log file:

Aug 7 21:45:59 labtest01c spamd[2693]: config: score: the non-numeric score (.8) is not valid, a numeric score is required
Aug 7 21:45:59 labtest01c spamd[2693]: config: SpamAssassin failed to parse line, SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 is not valid for score, skipping: score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677

.8 requires a leading zero.

Aug 7 21:46:01 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.
Aug 7 21:46:03 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.

SQL server privilege issue?
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'RAZOR2_CHECK'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DRUGS_ERECTILE has undefined dependency '__DRUGS_ERECTILE7'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SUB_ACCEPT_CCARDS has undefined dependency '__SARE_SUB_FROM_PAYPAL'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SPEC_PROLEO_M2a has dependency 'MIME_QP_LONG_LINE' with a zero score
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_XMAIL_SUSP2'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_HEAD_XAUTH_WARN'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has dependency 'X_AUTH_WARN_FAKED' with a zero score
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_MKSHRT'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_GT'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_TINY'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG50'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG55'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG65'
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG75'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test VIRUS_WARNING_DOOM_BNC has undefined dependency 'VIRUS_WARNING_MYDOOM4'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test SARE_OBFU_CIALIS has undefined dependency 'SARE_OBFU_CIALIS2'
Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test FP_MIXED_PORN3 has undefined dependency 'FP_PENETRATION'

Not errors, just info.

Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server started on port 783/tcp (running version 3.1.4)
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server pid: 2693
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server successfully spawned child process, pid 2700
Aug 7 21:46:07 labtest01c spamd[2693]: spamd: server successfully spawned child process, pid 2701
Aug 7 21:46:07 labtest01c spamd[2693]: prefork: child states: II

Normal startup info.

Daryl
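
Daryl's first point (a score such as .8 needs a leading zero) can be checked mechanically before copying rule files to a new server. The following is a minimal Python sketch, not part of SpamAssassin: the function name fix_score_line is hypothetical, and it assumes each score directive sits on a single whitespace-separated line, as in the log excerpt above.

```python
import re

# A whole field that is a bare decimal such as ".8" or "-.5".
_BARE_DECIMAL = re.compile(r"^([+-]?)\.(\d+)$")


def fix_score_line(line: str) -> str:
    """Add leading zeros to bare-decimal values on a 'score' line.

    Lines that are not 'score' directives are returned unchanged.
    (Hypothetical helper for illustration; not a SpamAssassin API.)
    """
    fields = line.split()
    if len(fields) < 3 or fields[0] != "score":
        return line
    # fields[1] is the rule name; everything after it is a score value.
    fixed = [_BARE_DECIMAL.sub(r"\g<1>0.\2", f) for f in fields[2:]]
    return " ".join(fields[:2] + fixed)


if __name__ == "__main__":
    print(fix_score_line("score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677"))
```

Running spamassassin --lint afterwards remains the authoritative check; the sketch only covers the one pattern named in the log.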
Word Doc spam
Received in my .mac (basically a spam bin) account.
http://www.triksys.be/docspam.jpg = screenshot of the attached Word doc.
Never seen this before.
Is this new, or old news?
211.16.219.135 is in all kinds of blacklists, though.

Patrick Sneyers
Belgium

From: Robert Nicholson [EMAIL PROTECTED]
Date: 8 August 2006 05:34:32 GMT+02:00
To: users@spamassassin.apache.org
Subject: Latest Network Upgrade not spam.

Return-path: [EMAIL PROTECTED]
Received: from mac.com (smtpin18-en2 [10.13.11.246]) by ms24.mac.com (iPlanet Messaging Server 5.2 HotFix 2.08 (built Sep 22 2005)) with ESMTP id [EMAIL PROTECTED] for [EMAIL PROTECTED]; Mon, 07 Aug 2006 04:55:24 -0700 (PDT)
Received: from localhost (b135.nasicnet.com [211.16.219.135]) by mac.com (Xserve/smtpin18/MantshX 4.0) with SMTP id k77Bt0h3027089 for [EMAIL PROTECTED]; Mon, 07 Aug 2006 04:55:20 -0700 (PDT)
Date: Mon, 07 Aug 2006 20:53:23 +0900
From: Lizzie Matthews [EMAIL PROTECTED]
Subject: August Payment Summary, Invoice #52782
To: [EMAIL PROTECTED]
Message-id: [EMAIL PROTECTED]
MIME-version: 1.0
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1506
X-Mailer: Microsoft Outlook, Build 10.0.3416
Content-type: multipart/mixed; boundary="=_NextPart_000_0001_01C6BA17.2A1A0400"
Importance: Normal
X-Priority: 3 (Normal)
X-MSMail-priority: Normal
Original-recipient: rfc822;[EMAIL PROTECTED]

--=_NextPart_000_0001_01C6BA17.2A1A0400
Content-Type: multipart/mixed; boundary="=_NextPart_001_000E_01C6BA17.2A1A0400"

--=_NextPart_001_000E_01C6BA17.2A1A0400
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Past Due Invoice Attached

--=_NextPart_001_000E_01C6BA17.2A1A0400
Content-Transfer-Encoding: base64
Content-Type: application/msword; name=invoice.doc
Content-Disposition: inline; filename=invoice.doc
Re: Improved OCR Plugin with approximate matching
Hello again,

I only wanted to add a small note: I recently saw GIFs that cannot be converted using ImageMagick because they are either sloppily generated or intentionally partly corrupted. Please think about using giftopnm and jpegtopnm instead. If you have a better idea, tell me.

To use giftopnm and jpegtopnm, change the code from:

    if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) {
        open OCR, "|/usr/bin/convert - pnm:-|/usr/bin/gocr -i - /tmp/spamassassin.focr.$$";

to:

    if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) {
        if ($ctype eq "image/gif") {
            open OCR, "|/usr/bin/giftopnm - |/usr/bin/gocr -i - /tmp/spamassassin.focr.$$";
        } else {
            open OCR, "|/usr/bin/jpegtopnm - |/usr/bin/gocr -i - /tmp/spamassassin.focr.$$";
        }

Note that with ImageMagick, things can get really bad. The conversion time increased dramatically: about 30 seconds, followed by an error message from ImageMagick, for a 7 KB GIF file. So I really advise you to change the code to use different tools. These tools also complain, for example:

giftopnm: Extraneous data at end of image. Skipped to end of image
giftopnm: bogus character 0x4f, ignoring
giftopnm: bogus character 0xa7, ignoring
giftopnm: bogus character 0xc0, ignoring
giftopnm: bogus character 0x8a, ignoring
giftopnm: Unable to read Color 33 from colormap

But conversion still continues and the text gets recognized correctly.

Best regards,
Chris
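
The per-type converter choice described above, plus a guard against the 30-second stalls, could look roughly like this in Python. This is a sketch under assumptions, not the plugin's code (the plugin is Perl): the names CONVERTERS, converter_for and to_pnm are hypothetical, the tool paths are the ones quoted in the message, and the 10-second default timeout is an arbitrary illustration.

```python
import subprocess

# Hypothetical mapping from MIME type to converter command line.
# Paths are those quoted in the message; "-" means "read from stdin".
CONVERTERS = {
    "image/gif": ["/usr/bin/giftopnm", "-"],
    "image/jpeg": ["/usr/bin/jpegtopnm", "-"],
}


def converter_for(ctype):
    """Return the converter argv for a content type, or None if unsupported."""
    return CONVERTERS.get(ctype.strip().lower())


def to_pnm(image_bytes, ctype, timeout=10):
    """Convert image bytes to PNM, giving up after `timeout` seconds.

    Returns (pnm_bytes, stderr_text). pnm_bytes is None when the type is
    unsupported or the converter ran past the timeout.
    """
    argv = converter_for(ctype)
    if argv is None:
        return None, "unsupported content type"
    try:
        proc = subprocess.run(
            argv, input=image_bytes, capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return None, "converter timed out"
    return proc.stdout, proc.stderr.decode("utf-8", "replace")
```

Keeping stderr in the return value means the converter's complaints stay available to the caller instead of being lost in a pipe.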
Broken images in mails
Hello there,

As I recently mentioned in the FuzzyOcr thread, I found quite a lot of mails that contain broken or corrupted GIFs. One type makes convert compute for an extremely long time and then fail, but giftopnm converts it after printing some errors. The other type works with neither tool: both say the image is corrupted and convert nothing, yet my browser is fully able to display it. (And yes, I made sure these really are GIFs; file says so.) Here's an example:

samples # giftopnm viagra2.gif
giftopnm: EOF or error reading data portion of 194 byte DataBlock from file
samples # convert viagra2.gif pnm:-
convert: Corrupt image `viagra2.gif'.
samples # file viagra2.gif
viagra2.gif: GIF image data, version 89a, 353 x 262

But I can view it perfectly. Does anyone know what could cause this, and a tool that can reliably convert these to PNM? I also wonder whether this was intended to happen...

Best regards
Chris
Re: 0451.com
On Mon, 7 Aug 2006, Hamish wrote: Yeah, Right... And Verisign never wildcarded domains either did they? Duh! right back at you. RFC 1123 section 2.1: The syntax of a legal Internet host name was specified in RFC-952 Hostname vs DomainName The domain name system itself doesn't have any restrictions on labels: they are counted binary strings and can contain embedded nul bytes or even dots (see RFC 1035 section 5.1 for an example). Traditionally, RFC 952 host name syntax (as updated by RFC 1123) has also been used for mail domains and delegations from TLDs. The host name syntax described in RFC 1035 is informative, not normative. RFC 1912 is also informative, and it obviously misinterprets RFC 1123 which clearly allows all-numeric labels. All RFCs are not created equal and the earlier ones especially must be interpreted intelligently. Tony. -- f.a.n.finch [EMAIL PROTECTED] http://dotat.at/ FISHER: WEST OR NORTHWEST 4 OR 5 BECOMING VARIABLE 3 OR 4. FAIR. MODERATE OR GOOD.
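
Tony's point about RFC 952 syntax as updated by RFC 1123 can be made concrete with a small checker. This is a minimal Python sketch of the host name rules only (letters, digits and hyphens; a label may start with a digit; no leading or trailing hyphen; at most 63 characters per label). The function name is hypothetical, and the 253-character cap on the whole name is the commonly cited limit for the dotted text form, taken here as an assumption. It deliberately does not model the DNS itself, which, as noted above, accepts arbitrary binary labels.

```python
import re

# One label: 1-63 chars of letters, digits, hyphens; first and last
# character must be a letter or digit (RFC 1123 permits a leading digit).
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Check a dotted name against RFC 952/1123 host name syntax."""
    if name.endswith("."):
        name = name[:-1]  # a trailing dot just marks the name as fully qualified
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("0451.com", "-bad.example", "a..b"):
        print(candidate, is_valid_hostname(candidate))
```

Under these rules the all-numeric label in 0451.com is accepted, which matches the reading of RFC 1123 given in the message.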
Re: ImageInfo path
[EMAIL PROTECTED] wrote:

Hello all. Mostly a lurker here. I am trying to install the ImageInfo plugin. I followed the instructions and placed the *.pm file in the Plugins dir and the *.cf file in the SpamAssassin dir. Running spamassassin --lint gives:

[6870] warn: plugin: failed to parse plugin (from @INC): Can't locate Mail/SpamAssassin/Plugin/ImageInfo.pm in @INC (@INC contains: /usr/lib/perl5/vendor_perl/5.8.3/i586-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/5.8.3/i586-linux-thread-multi /usr/lib/perl5/5.8.3 /usr/lib/perl5/site_perl/5.8.3/i586-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl) at (eval 58) line 1.
[6870] warn: plugin: failed to create instance of plugin Mail::SpamAssassin::Plugin::ImageInfo: Can't locate object method new via package Mail::SpamAssassin::Plugin::ImageInfo at (eval 59) line 1.

I am sure it has to do with the dir structure. We use oes-linux, and the dir structure on it is /etc/mail/spamassassin. So I am asking: in which file do I change the path from /mail/spamassassin to /etc/mail/spamassassin? I have searched through the two files (*.pm and *.cf) and cannot find it.

Thanks for any help

Hi,

I had a similar problem. In my case the plugin was loaded twice, and the second attempt failed with a similar error. You can put the .pm file in ANY directory you want; just add this to your v310.pre (or whatever the .pre file is called in your installation) and adjust the path:

# ImageInfo - performs some checks over the attached images
#
loadplugin Mail::SpamAssassin::Plugin::ImageInfo /etc/mail/spamassassin/plugins/ImageInfo.pm

Then comment out the loadplugin line in the .cf file and you're fine.

Matt
Bayes errors...
I keep getting the following from SpamAssassin (running under amavisd debug-sa). Any ideas what I've done wrong this time? The database is MySQL. SpamAssassin is 3.1.4 (it also did the same with 3.1.3).

[12172] dbg: bayes: database connection established
[12172] dbg: bayes: found bayes db version 3
[12172] dbg: bayes: Using userid: 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?7?k?' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-??%m!' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?4??%' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-l???' for key 1

When it's not giving the above, it gives:

[12254] dbg: bayes: database connection established
[12254] dbg: bayes: found bayes db version 3
[12254] dbg: bayes: Using userid: 1
[12254] dbg: bayes: corpus size: nspam = 7492, nham = 100752
[12254] dbg: bayes: tok_get_all: token count: 198
[12254] dbg: bayes: tok_get_all: SQL error: Illegal mix of collations for operation ' IN '
[12254] dbg: bayes: cannot use bayes on this message; none of the tokens were found in the database
[12254] dbg: bayes: not scoring message, returning undef

TIA
Hamish.
Re: Improved OCR Plugin with approximate matching
decoder wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello there, I have improved the original OcrPlugin (found at http://wiki.apache.org/spamassassin/OcrPlugin), so it contains fuzzy matching. Like that, mistakes made by the OCR recognition or intentional obfuscations in the text don't make the recognition impossible. This is being done with a relative distance calculation between the pattern (word from a given word list) and a line in the recognized input. Also, the plugin uses dynamic scoring (more matched words means more score, this can be adjusted in the source). You can find a full description and an example in the wiki under: http://wiki.apache.org/spamassassin/FuzzyOcrPlugin Ideas for improvements or critics are always welcome :) Hi Could this plugin be extended to support png images? I receive quite a few of them... I guess it's probably just a line or two in addition to the jpg and gif Also might it be a good idea not to trust the content-type but instead use file or another 'detection utility'? As mentioned on the original ocrplugin page - gif2pnm and jpg2pnm have been abandoned because of sometimes wrong content types? Matt
Re: Bayes errors...
I'm not sure what you've done there, I didn't realise it was possible to mix collation types in the same table. Have you checked that all tables are the same type? MyISAM or Inno? If they are all the same, I'd be inclined to pull it down, rebuild from the SA supplied SQL and retrain. Did you merge 2 databases at any point? Nigel On Tue, 08 Aug 2006 11:42:11 +0100, Hamish Marson [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I keep getting the following from spamassasin (Running under amavisd debug-sa). Any ideas what I've done wrong this time? The database is mysql. SpamAssassin is 3.1.4 (It also did the same with 3.1.3). [12172] dbg: bayes: database connection established [12172] dbg: bayes: found bayes db version 3 [12172] dbg: bayes: Using userid: 1 [12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' [12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?7?k?' for key 1 [12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' [12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-??%m!' for key 1 [12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' [12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?4??%' for key 1 [12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' [12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-l???' 
for key 1 When it's not giving the above it gives [12254] dbg: bayes: database connection established [12254] dbg: bayes: found bayes db version 3 [12254] dbg: bayes: Using userid: 1 [12254] dbg: bayes: corpus size: nspam = 7492, nham = 100752 [12254] dbg: bayes: tok_get_all: token count: 198 [12254] dbg: bayes: tok_get_all: SQL error: Illegal mix of collations for operation ' IN ' [12254] dbg: bayes: cannot use bayes on this message; none of the tokens were found in the database [12254] dbg: bayes: not scoring message, returning undef TIA Hamish. -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFE2GqC/3QXwQQkZYwRAsg+AKDTrpxO1Zs/D3vMpHpH33v192LwfACdHriQ gPVGxD5aCuAImhjhUzaFR9w= =kll1 -END PGP SIGNATURE-
Re: Bayes errors...
Nigel Frankcom wrote:

I'm not sure what you've done there; I didn't realise it was possible to mix collation types in the same table. Have you checked that all tables are the same type? MyISAM or Inno? If they are all the same, I'd be inclined to pull it down, rebuild from the SA-supplied SQL and retrain. Did you merge 2 databases at any point?

Nope... I think it broke when I updated to 3.1.3 (but it was previously running 3.1.0 fine, IIRC). Maybe I'll try a rebuild of the bayes tables...

H
Re: Improved OCR Plugin with approximate matching
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Matthias Keller wrote: decoder wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello there, I have improved the original OcrPlugin (found at http://wiki.apache.org/spamassassin/OcrPlugin), so it contains fuzzy matching. Like that, mistakes made by the OCR recognition or intentional obfuscations in the text don't make the recognition impossible. This is being done with a relative distance calculation between the pattern (word from a given word list) and a line in the recognized input. Also, the plugin uses dynamic scoring (more matched words means more score, this can be adjusted in the source). You can find a full description and an example in the wiki under: http://wiki.apache.org/spamassassin/FuzzyOcrPlugin Ideas for improvements or critics are always welcome :) Hi Could this plugin be extended to support png images? I receive quite a few of them... I guess it's probably just a line or two in addition to the jpg and gif Also might it be a good idea not to trust the content-type but instead use file or another 'detection utility'? As mentioned on the original ocrplugin page - gif2pnm and jpg2pnm have been abandoned because of sometimes wrong content types? Matt That is a good idea... I will try to implement the file command somewhere to make sure we are really using the correct tool to convert. I explicitly use giftopnm and jpegtopnm here (from netpbm) because, as I mentioned in an earlier reply, I received some gifs which are corrupt, and these cause convert from imagemagick to drain CPU for 30 seconds and more without any result... so one should really avoid imagemagick here. In the latest version I am working on, I invoke giffix (from the giflib) to fix these gifs before converting them with giftopnm... Adding png support will not be hard, I will put it on the todo list. I will post a new version in the wiki and announce it here as soon as I am finished. 
:) Thanks for the input :) Chris -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFE2G+rJQIKXnJyDxURAiT0AJ0di3sBaL4D5/mHy0Y7MhXXBlASTgCfRakO lqp2m/v+vdxVJ5gZwIGZ7qo= =6Nt6 -END PGP SIGNATURE-
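
Matt's suggestion, not trusting the declared Content-Type, can also be done without shelling out to file by looking at the first bytes of the attachment. This is a minimal Python sketch; the function name sniff_image_type is hypothetical and only the three formats discussed in this thread are covered.

```python
def sniff_image_type(data: bytes):
    """Guess the image type from its leading magic bytes.

    Returns a MIME type string, or None when the signature is unknown.
    """
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data.startswith(b"\xff\xd8\xff"):  # JPEG start-of-image marker
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):  # 8-byte PNG signature
        return "image/png"
    return None


if __name__ == "__main__":
    print(sniff_image_type(b"GIF89a" + b"\x00" * 10))
```

A matching signature only says which converter to try; as the corrupt samples in the other thread show, a file can carry a valid GIF header and still fail to decode.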
Re: Bayes errors...
On Tue, 08 Aug 2006 12:02:04 +0100, Hamish Marson [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Nigel Frankcom wrote: I'm not sure what you've done there, I didn't realise it was possible to mix collation types in the same table. Have you checked that all tables are the same type? MyISAM or Inno? If they are all the same, I'd be inclined to pull it down, rebuild from the SA supplied SQL and retrain. Did you merge 2 databases at any point? Nope... I think it broke when I updates to 3.1.3 (But it was previously running 3.1.0 fine IIRC). Maybe I'll try a rebuild... on the bayes tables... H -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFE2G8r/3QXwQQkZYwRAjJdAKCN2POhd2faxG8Um6QZzkcig99A4ACghG3O zT9bwVcF0V+JALTf6TIL55c= =gFaH -END PGP SIGNATURE- It might be worth trying the MySQL Admin Tool and seeing if it can repair the tables. http://dev.mysql.com/downloads/administrator/1.1.html Nigel
Re: Bayes errors...
Nigel Frankcom wrote:

I'm not sure what you've done there; I didn't realise it was possible to mix collation types in the same table. Have you checked that all tables are the same type? MyISAM or Inno? If they are all the same, I'd be inclined to pull it down, rebuild from the SA-supplied SQL and retrain.

The tables are all MyISAM... But the collation is latin1_swedish_ci for some reason, which seems strange to me. What collation do others have? And what should it be? (I'm assuming the problem is SA using utf8 and the database using latin1_swedish_ci.) I'm looking now to see if it's possible to change the collation on the fly.

Did you merge 2 databases at any point? Nigel
Re: Re: Bayes errors...
On Tue, 08 Aug 2006 12:08:52 +0100, Hamish Marson [EMAIL PROTECTED] wrote:

Nigel Frankcom wrote:

I'm not sure what you've done there, I didn't realise it was possible to mix collation types in the same table. Have you checked that all tables are the same type? MyISAM or Inno? If they are all the same, I'd be inclined to pull it down, rebuild from the SA supplied SQL and retrain.

The tables are all MyISAM... But the collation is latin1_swedish_ci for some reason, which seems strange to me.

latin1_swedish_ci is the default for MyISAM and should be fine (mine are set that way here).

What collation do others have? And what should it be? (I'm assuming the problem is SA using utf8 the database using latin1_swedish_ci). I'm looking now to see if it's possible to change the collation on the fly.

You probably can change it on the fly, but you shouldn't have to. I've checked 3 servers here and they are all latin1_swedish_ci/MyISAM.

Did you merge 2 databases at any point?

Yes, if you merged it's possible things went awry along the way.
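
The MySQL error text already says which side carries which collation, which bears on Hamish's guess. In MySQL's coercibility terms, IMPLICIT is the collation of a column value and COERCIBLE is that of a literal supplied in the statement, so the message reads as a latin1 column being compared with a utf8 literal. The following Python sketch only pulls those pairs out of the log line; the function names are hypothetical, and it diagnoses rather than fixes anything.

```python
import re

# Matches "(collation_name,COERCIBILITY)" groups in a MySQL error message.
_COLLATION_PAIR = re.compile(r"\((\w+),(\w+)\)")


def collations_in_error(message):
    """Return [(collation, coercibility), ...] found in the error text."""
    return _COLLATION_PAIR.findall(message)


def charset_of(collation):
    """Character set prefix of a collation name, e.g. 'latin1'."""
    return collation.split("_", 1)[0]


if __name__ == "__main__":
    msg = ("Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and "
           "(utf8_general_ci,COERCIBLE) for operation '='")
    for collation, coercibility in collations_in_error(msg):
        print(charset_of(collation), collation, coercibility)
```

If the two character sets differ, as here, the connection's character set is a reasonable thing to inspect before rebuilding the tables; whether that is the actual cause on this server is not established in the thread.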
Re: Word Doc spam
--On Tuesday, August 08, 2006 10:27 AM +0200 Patrick Sneyers [EMAIL PROTECTED] wrote: Received in my .mac (basically a spam bin) account. http://www.triksys.be/docspam.jpg = screenshot of word doc attached. Neer seen this before Is this new, or old news? 211.16.219.135 is in all kinds of blacklists though. I was surprised to see one of these as well. I'd always thought that it would be nice for the Open Office people to create a simple command-line utility to convert Word files to plain text for spam checking. Or it could strip any macros for virus protection.
Re: Word Doc spam
* Kenneth Porter [EMAIL PROTECTED]:

I was surprised to see one of these as well. I'd always thought that it would be nice for the Open Office people to create a simple command-line utility to convert Word files to plain text for spam checking.

man antiword

-- 
Ralf Hildebrandt (i.A. des IT-Zentrums) [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin, Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin, Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF
send no mail to [EMAIL PROTECTED]
Re: Broken images in mails
--On Tuesday, August 08, 2006 11:51 AM +0200 decoder [EMAIL PROTECTED] wrote: as I recently mentioned in the FuzzyOcr Thread, I found quite a lot mails that contain broken or corrupted gifs. Until we have a better answer, I'd reject anything with an unrecognizable format. It might be an attempt to exploit an overflow bug in an older copy of IE. Similarly, I'm a fan of validating HTML and rejecting broken stuff, but that would reject a lot of stuff created by MS software. OTOH
Re: Improved OCR Plugin with approximate matching
Perhaps corrupted gifs should be treated as spam? decoder wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello again, I only wanted to add a small note: I recently saw gifs that cannot be converted using imagemagick because they are either sloppy generated or with intention partly corrupted. Please think about using giftopnm and jpegtopnm instead. If you have a better idea, tell me. To use giftopnm and jpegtopnm, change the code from: if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) { open OCR, "|/usr/bin/convert - pnm:-|/usr/bin/gocr -i - /tmp/spamassassin.focr.$$"; to: if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) { if ($ctype eq "image/gif") { open OCR, "|/usr/bin/giftopnm - |/usr/bin/gocr -i - /tmp/spamassassin.focr.$$"; } else { open OCR, "|/usr/bin/jpegtopnm - |/usr/bin/gocr -i - /tmp/spamassassin.focr.$$"; } Note that with imagemagick, things can get really bad. I experienced a highly increased time to convert (about 30 seconds and then an error message from imagemagick for a 7kb gif file). So I really advise you to change the code to use different tools. These will also complain, for example: giftopnm: Extraneous data at end of image. Skipped to end of image giftopnm: bogus character 0x4f, ignoring giftopnm: bogus character 0xa7, ignoring giftopnm: bogus character 0xc0, ignoring giftopnm: bogus character 0x8a, ignoring giftopnm: Unable to read Color 33 from colormap But it still continues and the text gets recognized correctly. Best regards, Chris -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.5 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFE2F0IJQIKXnJyDxURAtTdAJ4nx25dKbocHd7DW+ff1biW3GFmMACeO7t0 ZjYofyRHdknL5L3GcyMdgLo= =e1ze -END PGP SIGNATURE-
Re: Improved OCR Plugin with approximate matching
On Tue, 8 Aug 2006, decoder wrote: I only wanted to add a small note: I recently saw gifs that cannot be converted using imagemagick because they are either sloppy generated or with intention partly corrupted. Please think about using giftopnm and jpegtopnm instead. If you have a better idea, tell me. giftopnm: Extraneous data at end of image. Skipped to end of image giftopnm: bogus character 0x4f, ignoring giftopnm: bogus character 0xa7, ignoring giftopnm: bogus character 0xc0, ignoring giftopnm: bogus character 0x8a, ignoring giftopnm: Unable to read Color 33 from colormap Add a few points for the fact that it is a corrupt GIF? -- John Hardin KA7OHZICQ#15735746http://www.impsec.org/~jhardin/ [EMAIL PROTECTED]FALaholic #11174pgpk -a [EMAIL PROTECTED] key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- In 1998 more than three times as many people in the US were killed by incompetent physicians than were killed by handguns, yet the President of the A.M.A. is adopting gun safety as his platform. ---
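
John's idea of adding a few points for a corrupt GIF could be driven by the converter's own complaints. This is a minimal Python sketch, not an existing SpamAssassin rule: the marker strings are copied from the giftopnm output quoted on this list, while the function name, the 0.5 points per warning and the cap of 2.0 are made-up illustration values.

```python
# Substrings copied from giftopnm messages quoted on this list.
CORRUPTION_MARKERS = (
    "Extraneous data at end of image",
    "bogus character",
    "Unable to read Color",
    "EOF or error reading data portion",
)


def corrupt_gif_points(stderr_text, points_per_hit=0.5, cap=2.0):
    """Turn converter warnings into a bounded score contribution.

    Each stderr line containing a known marker counts as one hit.
    The weights are hypothetical and would need tuning on real mail.
    """
    hits = sum(
        1
        for line in stderr_text.splitlines()
        if any(marker in line for marker in CORRUPTION_MARKERS)
    )
    return min(cap, hits * points_per_hit)


if __name__ == "__main__":
    sample = "giftopnm: bogus character 0x4f, ignoring\n"
    print(corrupt_gif_points(sample))
```

Capping the contribution keeps a single badly damaged but legitimate image from deciding the verdict on its own.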
Re: ImageInfo path
On Tue, Aug 08, 2006 at 12:33:38PM +0200, Matthias Keller wrote:

Then comment out the loadplugin line in the .cf file and you're fine.

Generally speaking, don't put loadplugin lines in your cf files. (If people are looking at the sandbox cf and saying "but you do that": yes, for development. ;) )

--
Randomly Generated Tagline:
You can tell that I got this out from the newspaper because it looks like I cut it out with a spatula. - Jim Duncan
Re: ImageInfo path
On Tue, August 8, 2006 12:33, Matthias Keller wrote:

# ImageInfo - performs some checks over the attached images # loadplugin Mail::SpamAssassin::Plugin::ImageInfo /etc/mail/spamassassin/plugins/ImageInfo.pm

Then comment out the loadplugin line in the .cf file and you're fine.

Putting it into local.pre would be better, since it still works after SpamAssassin is updated to a new version.

Ideally, loadplugin should go in a separate .pre file per plugin, with the .pre file named the same as the .cf file; that lists them nicely in ls -l :-)

--
Benny
Re: URIBL and SURBL no longer hitting
On Monday, August 7, 2006, 1:56:41 PM, DAve DAve wrote:

In frustration I edited /etc/resolv.conf and removed 127.0.0.1, URI lookups are completing and MailScanner is blasting through the queues on both machines exceedingly fast now. No idea what could have possibly changed, dnscache is normally bulletproof. I run it on a dozen servers as a local cache, it is a standard install on all my servers and all installs share the same config. Especially since dig worked, and still works to 127.0.0.1.

Perhaps there's an incompatibility between dnscache and the way SA 3.1 does DNSBL queries. Please open a bugzilla about it:

http://issues.apache.org/SpamAssassin/

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
Jeff Chan wrote:

On Monday, August 7, 2006, 1:56:41 PM, DAve DAve wrote:

In frustration I edited /etc/resolv.conf and removed 127.0.0.1, URI lookups are completing and MailScanner is blasting through the queues on both machines exceedingly fast now. No idea what could have possibly changed, dnscache is normally bulletproof. I run it on a dozen servers as a local cache, it is a standard install on all my servers and all installs share the same config. Especially since dig worked, and still works to 127.0.0.1.

Perhaps there's an incompatibility between dnscache and the way SA 3.1 does DNSBL queries. Please open a bugzilla about it:

http://issues.apache.org/SpamAssassin/

Jeff C.

Hi,

Unlikely to be a dnscache issue. I run over 10 SA servers, all with local djb dnscaches.

Regards,
Rick
Re: ImageInfo path
Benny Pedersen wrote:

On Tue, August 8, 2006 12:33, Matthias Keller wrote:

# ImageInfo - performs some checks over the attached images # loadplugin Mail::SpamAssassin::Plugin::ImageInfo /etc/mail/spamassassin/plugins/ImageInfo.pm

Then comment out the loadplugin line in the .cf file and you're fine.

Putting it into local.pre would be better, since it still works after SpamAssassin is updated to a new version.

Correct me if I'm wrong, but the doc says it will NOT be overwritten in an upgrade...!?

# This file was installed during the installation of SpamAssassin 3.1.0,
# and contains plugin loading commands for the new plugins added in that
# release. It will not be overwritten during future SpamAssassin installs,
# so you can modify it to enable some disabled-by-default plugins below,
# if you so wish.

Matt
RE: Bayes errors...
This is because your database is in UTF8 format. As a result SA cannot read it (though it can write it). Drop the database, recreate it and the tables in latin1, and it will work just fine. You will have to retrain after that, though.

-----Original Message-----
From: Hamish Marson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 3:42 AM
To: users@spamassassin.apache.org
Subject: Bayes errors...

I keep getting the following from SpamAssassin (running under amavisd debug-sa). Any ideas what I've done wrong this time? The database is MySQL. SpamAssassin is 3.1.4 (it also did the same with 3.1.3).

[12172] dbg: bayes: database connection established
[12172] dbg: bayes: found bayes db version 3
[12172] dbg: bayes: Using userid: 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?7?k?' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-??%m!' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-?4??%' for key 1
[12172] dbg: bayes: tok_get: SQL error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
[12172] dbg: bayes: _put_token: SQL error: Duplicate entry '1-l???' for key 1

When it's not giving the above, it gives:

[12254] dbg: bayes: database connection established
[12254] dbg: bayes: found bayes db version 3
[12254] dbg: bayes: Using userid: 1
[12254] dbg: bayes: corpus size: nspam = 7492, nham = 100752
[12254] dbg: bayes: tok_get_all: token count: 198
[12254] dbg: bayes: tok_get_all: SQL error: Illegal mix of collations for operation ' IN '
[12254] dbg: bayes: cannot use bayes on this message; none of the tokens were found in the database
[12254] dbg: bayes: not scoring message, returning undef

TIA
Hamish.
Re: Am I wasting my time with SpamCop?
On Saturday, August 5, 2006, 12:46:20 PM, Benny Pedersen wrote: spamcop.com is the windows client for spamcop.net ? No, IIRC it's something totally different that's squatting a similar domain name, probably on purpose. Jeff C. -- Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: Am I wasting my time with SpamCop?
On Thursday, August 3, 2006, 7:40:57 AM, Andrzej Filip wrote:

Make a *clear* distinction between the three basic ways of using spamcop.net

Correct: spamcop.net has multiple functions.

1) email blocking at MTA level [may be controversial because of zero+ tolerance]

Not recommended. Too many FPs to block outright using the SpamCop BL at the MTA level.

2) scoring by SpamAssassin [score may be decreased or zeroed]

Excellent use for it, since SA gives it an appropriately low score for the FP level in the SpamCop BL. That way you get the benefits of the fairly aggressive correct hits, but not much aggravation from the FPs.

3) spam *reporting* (automation of sending LARTs) [*I recommend it*]

Also recommended here. Reporting spam using SpamCop gets some spams blocked using the SpamCop BL. But it also gets them blacklisted using the SURBL SC list, which is very effective and has additional processing and whitelisting so it doesn't FP like SpamCop's own IP BL does:

http://www.surbl.org/lists.html#sc

Therefore, please report spams using SpamCop.

It's worth mentioning that despite the munging SpamCop does to try to not be a confirmation loop for spammers, using SpamCop may result in some more spam due to that effect. There is also a mole option you can set in SpamCop that does not report, just blacklists.

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
Jeff Chan wrote:

On Monday, August 7, 2006, 1:56:41 PM, DAve DAve wrote:

In frustration I edited /etc/resolv.conf and removed 127.0.0.1, URI lookups are completing and MailScanner is blasting through the queues on both machines exceedingly fast now. No idea what could have possibly changed, dnscache is normally bulletproof. I run it on a dozen servers as a local cache, it is a standard install on all my servers and all installs share the same config. Especially since dig worked, and still works to 127.0.0.1.

Perhaps there's an incompatibility between dnscache and the way SA 3.1 does DNSBL queries. Please open a bugzilla about it:

http://issues.apache.org/SpamAssassin/

Jeff C.

I had no logging running on dnscache before, so I don't *know* what was happening. I re-enabled logging and the issue went away. To be specific, I changed my run file from

exec setuidgid Gdnslog multilog -*

to

exec setuidgid Gdnslog multilog t ./main

which should make no difference. Oddly, though, restarting dnscache several times didn't help previously.

I can open a bug report, and help troubleshoot, if you believe it smart to do so. But at this time I really don't think it is an SA issue.

DAve

--
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
RE: Bayes DB version issue 3.1.3 = 3.1.4
Daryl,

Thanks for the info. I will update the .8.

As for the database, which is the primary concern, the user account is correct. I have logged into the database from that server using the same credentials from the local.cf file. I had thought that we might have restricted by subnet, so I did indeed try that last night.

[EMAIL PROTECTED] spamassassin]# mysql -u xxx -h xx.xx.xx.xx -D spamassassin -p
Enter password:
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6649341 to server version: 4.1.7-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show tables;
+------------------------+
| Tables_in_spamassassin |
+------------------------+
| awl                    |
| bayes_expire           |
| bayes_global_vars      |
| bayes_seen             |
| bayes_token            |
| bayes_vars             |
| userpref               |
+------------------------+
7 rows in set (0.00 sec)

mysql> select * from bayes_global_vars;
+----------+-------+
| variable | value |
+----------+-------+
| VERSION  | 3     |
+----------+-------+
1 row in set (0.00 sec)

mysql>

-----Original Message-----
From: Daryl C. W. O'Shea [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 12:38 AM
To: Gary W. Smith
Cc: users@spamassassin.apache.org
Subject: Re: Bayes DB version issue 3.1.3 = 3.1.4

On 8/8/2006 3:29 AM, Gary W. Smith wrote:

Hello,

I can't remember smoking crack when copying the config files over but anything's possible. I built out a new machine today and installed SA. We have a list of CPAN modules that were installed (same list as from the 3.1.3 servers). I copied everything in the /etc/mail/spamassassin from our production servers to the test server and after starting we receive errors. I have checked and the MySQL data instance is accessible from this server. There are also several rules that are errors as well. I know that someone has asked this question already but I didn't find the answer in the thread archive.
Here are the contents of the log file: Aug 7 21:45:59 labtest01c spamd[2693]: config: score: the non-numeric score (.8) is not valid, a numeric score is required Aug 7 21:45:59 labtest01c spamd[2693]: config: SpamAssassin failed to parse line, SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 is not valid for score, skipping: score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 .8 requires a leading zero. Aug 7 21:46:01 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135. Aug 7 21:46:03 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135. SQL server privilege issue? Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'RAZOR2_CHECK' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DRUGS_ERECTILE has undefined dependency '__DRUGS_ERECTILE7' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SUB_ACCEPT_CCARDS has undefined dependency '__SARE_SUB_FROM_PAYPAL' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SPEC_PROLEO_M2a has dependency 'MIME_QP_LONG_LINE' with a zero score Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_XMAIL_SUSP2' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_HEAD_XAUTH_WARN' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has dependency 'X_AUTH_WARN_FAKED' with a zero score Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_MKSHRT' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_GT' 
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_RD_SAFE has undefined dependency 'SARE_RD_SAFE_TINY' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG50' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG55' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG65' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_MSGID_LONG45 has undefined dependency '__SARE_MSGID_LONG75' Aug 7 21:46:06 labtest01c spamd[2693]: rules: meta test VIRUS_WARNING_DOOM_BNC has
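The ".8 requires a leading zero" point quoted in this message lends itself to a small check over rule files. The following Python sketch is an illustration only: `fix_score_line` is a hypothetical helper, and it assumes score lines have the form `score RULE_NAME value [value ...]`, as in the log excerpt. It also collapses runs of whitespace on the lines it rewrites.

```python
import re

# Matches a number written without a leading digit, such as ".8" or "-.5".
BARE_DECIMAL = re.compile(r"^(-?)\.(\d+)$")


def fix_score_line(line):
    """Return the line with bare decimals in a 'score' directive given a leading zero.

    Lines that are not score directives are returned unchanged.
    """
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        return line
    fixed = [BARE_DECIMAL.sub(r"\g<1>0.\2", value) for value in parts[2:]]
    return " ".join(parts[:2] + fixed)


if __name__ == "__main__":
    print(fix_score_line("score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677"))
```

Running `spamassassin --lint` afterwards remains the authoritative check; the sketch only targets this one class of error.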
Re: Am I wasting my time with SpamCop?
On Wednesday, August 2, 2006, 2:01:44 PM, Michele Blacknight.ie wrote: Steven W. Orr wrote: Hold on there Bullwinkle! I have been religiously using spamcop in the hopes that the reports that are sent out get used by at least some of the ISPs. Am I wrong about this? We're an ISP and we take every single spamcop report (or other email abuse report) seriously and investigate all of them. Same here. We take all abuse reports seriously and investigate them, including SpamCop reports. Jeff C. -- Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
On Tuesday, August 8, 2006, 7:53:45 AM, Rick Macdougall wrote:

Jeff Chan wrote:

On Monday, August 7, 2006, 1:56:41 PM, DAve DAve wrote:

In frustration I edited /etc/resolv.conf and removed 127.0.0.1, URI lookups are completing and MailScanner is blasting through the queues on both machines exceedingly fast now. No idea what could have possibly changed, dnscache is normally bulletproof. I run it on a dozen servers as a local cache, it is a standard install on all my servers and all installs share the same config. Especially since dig worked, and still works to 127.0.0.1.

Perhaps there's an incompatibility between dnscache and the way SA 3.1 does DNSBL queries. Please open a bugzilla about it:

http://issues.apache.org/SpamAssassin/

Jeff C.

Hi,

Unlikely to be a dnscache issue. I run over 10 SA servers, all with local djb dnscaches.

Aha, but do you use Linux or FreeBSD? I can't remember the details but I remember a FreeBSD/SA issue recently.

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Multiple image spams: best practices?
Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images? Jeff C. -- Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
On Tuesday, August 8, 2006, 8:05:04 AM, DAve DAve wrote: I had no logging running on dnscache before so I don't *know* what was happening. I re-enabled logging and the issue went away. To be specific I changed my run file from exec setuidgid Gdnslog multilog -* to exec setuidgid Gdnslog multilog t ./main Which should make no difference. Oddly though restarting dnscache several times didn't help previously. I can open a bug report, and help troubleshoot, if you believe it smart to do so. But at this time I really don't think it is an SA issue. Hmm, well only file a bug report if it's a specific SA interaction. But it would still be nice to know what's causing it, even if it's not SA or an interaction with it. Jeff C. -- Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
Jeff Chan wrote: On Tuesday, August 8, 2006, 7:53:45 AM, Rick Macdougall wrote: Unlikely to be a dnscache issue. I run over 10 SA servers, all with local djb dnscaches. Aha, but do you use Linux or FreeBSD? I can't remember the details but I remember a FreeBSD/SA issue recently. Hi, Both. FreeBSD v4.8 - v6.0, Slackware, Centos and Fedora. No problems on any of them. Regards, Rick
Looking for a good Ebay whitelist
All, I have been having FPs from Ebay in AU and DE, as well as [EMAIL PROTECTED] Does anybody have a good whitelist for these?
Re: Multiple image spams: best practices?
On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote: Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images? Any technique for single image stock spams would be welcomed too! Jeff C. -- Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: URIBL and SURBL no longer hitting
Jeff Chan wrote: On Tuesday, August 8, 2006, 8:05:04 AM, DAve DAve wrote: I had no logging running on dnscache before so I don't *know* what was happening. I re-enabled logging and the issue went away. To be specific I changed my run file from exec setuidgid Gdnslog multilog -* to exec setuidgid Gdnslog multilog t ./main Which should make no difference. Oddly though restarting dnscache several times didn't help previously. I can open a bug report, and help troubleshoot, if you believe it smart to do so. But at this time I really don't think it is an SA issue. Hmm, well only file a bug report if it's a specific SA interaction. But it would still be nice to know what's causing it, even if it's not SA or an interaction with it. Jeff C. If it happens again I'll have some logs, provided I catch it in time, dnscache makes logs like bunnies make more bunnies. Until then I'm inclined to think it was a resource issue or anomaly on my system rather than an issue with SA or dnscache. I run dnscache on all my web/mail/SA/ftp servers on FreeBSD, Linux, and Solaris. Never had the slightest issue with any software making dns queries through it. DAve -- Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: Multiple image spams: best practices?
Jeff Chan wrote:

On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote:

Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images?

Any technique for single image stock spams would be welcomed too!

Well, the OCR technique is quite effective in my tests. So far it works for JPEGs and GIFs, and I'm extending it to PNGs. I will soon release a new version in the wiki. Any specific reason why you don't want to use OCR?

Chris
Re: Multiple image spams: best practices?
On Tue, 2006-08-08 at 08:22 -0700, Jeff Chan wrote: On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote: Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images? Any technique for single image stock spams would be welcomed too! I haven't had a single stock spam get through since adding the ImageInfo plugin. (Note this is not the same as the OCR plugin.) -Bill
Re: Multiple image spams: best practices?
On Tuesday, August 8, 2006, 8:26:18 AM, decoder decoder wrote:

Jeff Chan wrote:

On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote:

Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images?

Any technique for single image stock spams would be welcomed too!

Well the OCR technique is quite effective in my tests, so far it works for jpegs and gifs, and I'm extending it to pngs. I will soon release a new version in the wiki. Any specific reason why you don't want to use OCR?

I assume it's CPU intensive, or at least adds some overhead to already busy SA servers. Maybe it's a wrong assumption?

Jeff C.
--
Jeff Chan mailto:[EMAIL PROTECTED] http://www.surbl.org/
Re: Multiple image spams: best practices?
Jeff Chan wrote:

On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote:

Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images?

Any technique for single image stock spams would be welcomed too!

Jeff C.

Well, I've been using the ImageInfo plugin for some days now with quite good results (it doesn't catch everything, though), and I've been using FuzzyOCR for a few hours with positive results too.

I guess these are the only real countermeasures against this type of spam at the moment, because as long as you don't look at the picture itself you can't really distinguish a stock spam from a normal inline picture in a text message.

Just my 2c
Matt
Re: Multiple image spams: best practices?
Jeff Chan wrote:

On Tuesday, August 8, 2006, 8:26:18 AM, decoder decoder wrote:

Jeff Chan wrote:

On Tuesday, August 8, 2006, 8:08:04 AM, Jeff Chan wrote:

Aside from the experimental OCR some folks are trying, what SA techniques are folks having good luck with for stopping those stock spams that are multiple, vertical images?

Any technique for single image stock spams would be welcomed too!

Well the OCR technique is quite effective in my tests, so far it works for jpegs and gifs, and I'm extending it to pngs. I will soon release a new version in the wiki. Any specific reason why you don't want to use OCR?

I assume it's CPU intensive or at least adds some overhead to already busy SA servers. Maybe it's a wrong assumption?

Jeff C.

I noticed that the OCR process itself does not seem to consume much CPU, but ImageMagick often consumed a lot of resources (so I have now replaced it). Anyway, sure, it adds some overhead, but if you take into account that the whole test only starts when an image is found, then this doesn't seem too bad to me. Not every mail I get contains an image; let's say every tenth, maybe. That is acceptable for me at least.

Chris
Re: Looking for a good Ebay whitelist
On Tue, 8 Aug 2006, wrote:

I have been having FPs from Ebay in AU and DE, as well as [EMAIL PROTECTED] Does anybody have a good whitelist for these?

Because so many people try to forge messages from eBay but what comes from their own servers is almost definitely not spam, eBay seems like an ideal example of an organization that could benefit from SPF. And sure enough:

$ host -t TXT ebay.com
ebay.com descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all
ebay.com descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all

$ host -t TXT ebay.com.au
ebay.com.au descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all
ebay.com.au descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all

$ host -t TXT ebay.de
ebay.de descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all
ebay.de descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all

So it seems like SPF is probably something good to rely on in this case.

I don't fully understand the SPF plug-in, but perhaps all you need to do is add the appropriate eBay domains to new def_whitelist_from_spf rules like the ones in 60_whitelist_spf.cf.

This page:

http://pages.ebay.com/help/confidence/isgw-account-theft-spoof.html

has a list of eBay's US and international web sites, so presumably the list of valid e-mail domains ([EMAIL PROTECTED], [EMAIL PROTECTED], etc.) can be easily and correctly derived from that list.

- Logan
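To make the structure of the v=spf1 records above easier to see, here is a minimal Python sketch that splits one record into its terms. It is not a full SPF evaluator: `parse_spf` is a hypothetical helper, it only separates `include:` and `ip4:` terms and the final `all` term, and everything else (such as `mx`) is passed through untouched.

```python
def parse_spf(record):
    """Split a v=spf1 TXT record into includes, ip4 entries, the 'all' term and the rest."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF v1 record")
    result = {"includes": [], "ip4": [], "other": [], "all": None}
    for term in terms[1:]:
        if term.startswith("include:"):
            result["includes"].append(term[len("include:"):])
        elif term.startswith("ip4:"):
            result["ip4"].append(term[len("ip4:"):])
        elif term.lstrip("+-~?") == "all":
            # Keep the qualifier ("~all" is a soft fail, "-all" a hard fail).
            result["all"] = term
        else:
            result["other"].append(term)
    return result


if __name__ == "__main__":
    record = ("v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com "
              "include:p._spf.ebay.com include:c._spf.ebay.com ~all")
    print(parse_spf(record))
```

The `~all` at the end means a sender outside the listed hosts is only a soft fail, which is worth keeping in mind before relying on SPF alone for whitelisting.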
Re: URIBL and SURBL no longer hitting
DAve wrote: [snip] If it happens again I'll have some logs, provided I catch it in time, dnscache makes logs like bunnies make more bunnies. Until then I'm inclined to think it was a resource issue or anomaly on my system rather than an issue with SA or dnscache. I run dnscache on all my web/mail/SA/ftp servers on FreeBSD, Linux, and Solaris. Never had the slightest issue with any software making dns queries through it. DAve Dave, you might need to update the 'root/servers/@' file. IIRC, a couple of root servers have changed in the past few years. - dhawal
RE: Looking for a good Ebay whitelist
The following are what I have deemed to be frequently used official eBay SMTP servers. This list might be used for whitelisting and/or negative scoring:

66.135.195.180-181
66.135.195.254
66.135.197.7-29
66.135.197.164
66.135.207.155
66.135.209.198-221
66.135.215.231-240
216.113.168.128
216.113.168.139
216.113.184.201-203
216.113.188.96
216.113.188.112
216.113.188.202

But I make no guarantees about this list. Please correct me if there are any errors or omissions. Use at your own risk.

Rob McEwen
PowerView Systems
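Entries such as 66.135.197.7-29 in the list above are not something most tools accept directly. The Python sketch below expands them into individual addresses, on the assumption (not stated in the message) that the number after the dash is the last value of the final octet. `expand_range` is a hypothetical helper name.

```python
import ipaddress


def expand_range(entry):
    """Expand '66.135.197.7-29' into a list of IPv4Address objects.

    A plain address yields a one-element list. The part after the dash is
    assumed to be the final value of the last octet.
    """
    if "-" not in entry:
        return [ipaddress.IPv4Address(entry)]
    start_text, last_octet = entry.split("-")
    start = ipaddress.IPv4Address(start_text)
    prefix = start_text.rsplit(".", 1)[0]
    end = ipaddress.IPv4Address(f"{prefix}.{last_octet}")
    if end < start:
        raise ValueError(f"range ends before it starts: {entry}")
    return [ipaddress.IPv4Address(n) for n in range(int(start), int(end) + 1)]


if __name__ == "__main__":
    for entry in ("66.135.195.180-181", "66.135.195.254"):
        print(entry, "->", [str(ip) for ip in expand_range(entry)])
```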
Re: URIBL and SURBL no longer hitting
Dhawal Doshy wrote:

DAve wrote:

[snip]

If it happens again I'll have some logs, provided I catch it in time, dnscache makes logs like bunnies make more bunnies. Until then I'm inclined to think it was a resource issue or anomaly on my system rather than an issue with SA or dnscache. I run dnscache on all my web/mail/SA/ftp servers on FreeBSD, Linux, and Solaris. Never had the slightest issue with any software making dns queries through it.

DAve

Dave, you might need to update the 'root/servers/@' file. IIRC, a couple of root servers have changed in the past few years.

- dhawal

We replace the @ file with one of our own on every server. It contains just our DNS servers and our own caches.

I know what you're thinking. I checked my DNS servers first and found them with plenty of resources; that is why I took dnscache out of /etc/resolv.conf yesterday, and then discovered the problem.

DAve

--
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: ImageInfo path
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 hi, Theo Van Dinter wrote, On 8/8/06 7:04 AM: On Tue, Aug 08, 2006 at 12:33:38PM +0200, Matthias Keller wrote: Then comment-out the loadplugin line in the .cf file and you're fine. Generally speaking, don't put loadplugin lines in your cf files. (if people are looking at the sandbox cf and saying but you do that, yes, for development. ;) ) well, there _is_ 72_active.cf, which contains: ##{ loadplugin_Mail::SpamAssassin::Plugin::SendmailID tryplugin Mail::SpamAssassin::Plugin::SendmailID SendmailID.pm ##} loadplugin_Mail::SpamAssassin::Plugin::SendmailID which, despite: % grep SendMail /var/MailServer/Conf/SA/Local/local/init.pre loadplugin Mail::SpamAssassin::Plugin::SendmailID SendmailID.pm and, % ls /var/MailServer/Conf/SA/Local/local/SendmailID.pm /var/MailServer/Conf/SA/Local/SendmailID.pm on: % spamassassin --lint \ --debug \ --siteconfigpath=/var/MailServer/Conf/SA/Dist \ --configpath=/var/MailServer/Conf/SA/Local \ --prefs-file=local.cf \ --nocreate-prefs reports ( | grep -i sendmail): [17634] dbg: plugin: fixed relative path: /var/MailServer/Conf/SA/Local/SendmailID.pm [17634] dbg: plugin: loading Mail::SpamAssassin::Plugin::SendmailID from /var/MailServer/Conf/SA/Local/SendmailID.pm [17634] dbg: plugin: registered Mail::SpamAssassin::Plugin::SendmailID=HASH(0x1fe4ca4) [17634] dbg: plugin: fixed relative path: /var/MailServer/Conf/SA/Dist/SendmailID.pm [17634] dbg: plugin: loading Mail::SpamAssassin::Plugin::SendmailID from /var/MailServer/Conf/SA/Dist/SendmailID.pm [17634] warn: Subroutine new redefined at /var/MailServer/Conf/SA/Dist/SendmailID.pm line 19. [17634] warn: Subroutine check_sendmail_id redefined at /var/MailServer/Conf/SA/Dist/SendmailID.pm line 32. [17634] warn: Subroutine time2smid redefined at /var/MailServer/Conf/SA/Dist/SendmailID.pm line 67. 
[17634] dbg: plugin: did not register Mail::SpamAssassin::Plugin::SendmailID=HASH(0x1fe43f8), already registered
[17634] dbg: plugin: fixed relative path: /var/MailServer/Conf/SA/Dist/updates_spamassassin_org/SendmailID.pm
[17634] dbg: plugin: loading Mail::SpamAssassin::Plugin::SendmailID from /var/MailServer/Conf/SA/Dist/updates_spamassassin_org/SendmailID.pm
[17634] dbg: plugin: failed to parse tryplugin /var/MailServer/Conf/SA/Dist/updates_spamassassin_org/SendmailID.pm: Can't locate /var/MailServer/Conf/SA/Dist/updates_spamassassin_org/SendmailID.pm in @INC (@INC contains: lib /usr/local/perl_libs/sitelib/darwin-thread-multi-2level /usr/local/perl_libs/sitelib /usr/local/perl_libs/privlib/darwin-thread-multi-2level /usr/local/perl_libs/privlib /usr/local/perl_libs/vendorlib/darwin-thread-multi-2level /usr/local/perl_libs/vendorlib) at /usr/local/perl_libs/sitelib/Mail/SpamAssassin/PluginHandler.pm line 96.

which includes both a SUCCESS and a FAILURE.

this seems unique to SendMail.pm. am i config'ing wrong?

richard

--
ASCII Ribbon Campaign against HTML email, vCards, micro$oft attachments
[GPG] OpenMacNews at gmail dot com
fingerprint: 50C9 1C46 2F8F DE42 2EDB D460 95F7 DDBD 3671 08C6
RE: Looking for a good Ebay whitelist
On Tue, 8 Aug 2006, Rob McEwen wrote:

> The following are what I have deemed as frequently used official e-bay smtp servers. This list might be used for whitelisting or/and negative scoring:
>
>     66.135.195.180-181
>     66.135.195.254
>     66.135.197.7-29
>     66.135.197.164
>     66.135.207.155
>     66.135.209.198-221
>     66.135.215.231-240
>     216.113.168.128
>     216.113.168.139
>     216.113.184.201-203
>     216.113.188.96
>     216.113.188.112
>     216.113.188.202
>
> But I make no guarantees about this list. Please correct me if there are any errors or omissions. Use at your own risk.

By looking up their SPF records, you get a much larger list:

    s._spf.ebay.com descriptive text v=spf1 ip4:66.135.209.192/27 ip4:66.135.197.0/27 ip4:64.4.240.64/27 ip4:64.4.244.64/27 ~all
    m._spf.ebay.com descriptive text v=spf1 ip4:66.135.215.224/27 ip4:216.33.244.96/27 ip4:216.33.244.84 ~all
    p._spf.ebay.com descriptive text v=spf1 ip4:67.72.99.26 ip4:206.165.246.83 ip4:206.165.246.84 ip4:206.165.246.85 ip4:206.165.246.86 ip4:64.127.115.252 ip4:194.64.234.129/27 include:p2._spf.ebay.com ~all
    p2._spf.ebay.com descriptive text v=spf1 ip4:65.110.161.77 ip4:204.13.11.49 ip4:204.13.11.51 ~all
    c._spf.ebay.com descriptive text v=spf1 ip4:12.155.144.75 ip4:62.22.61.131 ip4:63.104.149.126 ip4:64.68.79.253 ip4:64.94.204.222 ip4:66.135.215.134 ip4:67.72.12.29 include:c2._spf.ebay.com ~all
    c2._spf.ebay.com descriptive text v=spf1 ip4:80.93.9.10 ip4:195.234.136.12 ip4:203.49.69.114 ip4:209.63.28.11 ip4:210.80.80.136 ip4:212.110.10.2 ip4:212.147.136.123 include:c3._spf.ebay.com ~all
    c3._spf.ebay.com descriptive text v=spf1 ip4:213.219.8.227 ip4:216.113.168.128 ip4:216.113.175.128 ip4:216.177.178.3 ip4:217.149.33.234 ip4:220.248.6.124 ip4:67.72.12.30 include:c4._spf.ebay.com ~all
    c4._spf.ebay.com descriptive text v=spf1 ip4:216.113.188.112 ip4:80.66.137.58 ip4:212.208.64.34 ip4:216.113.188.96 ip4:216.33.244.6 ip4:216.33.244.7 ~all

Grabbing the IP addresses out of that looks like this:

    12.155.144.75       62.22.61.131        63.104.149.126      64.127.115.252
    64.4.240.64/27      64.4.244.64/27      64.68.79.253        64.94.204.222
    65.110.161.77       66.135.197.0/27     66.135.209.192/27   66.135.215.134
    66.135.215.224/27   67.72.12.29         67.72.12.30         67.72.99.26
    80.66.137.58        80.93.9.10          194.64.234.129/27   195.234.136.12
    203.49.69.114       204.13.11.49        204.13.11.51        206.165.246.83
    206.165.246.84      206.165.246.85      206.165.246.86      209.63.28.11
    210.80.80.136       212.110.10.2        212.147.136.123     212.208.64.34
    213.219.8.227       216.113.168.128     216.113.175.128     216.113.188.112
    216.113.188.96      216.177.178.3       216.33.244.6        216.33.244.7
    216.33.244.84       216.33.244.96/27    217.149.33.234      220.248.6.124

Of course, it is probably better to use whatever they publish through DNS rather than your own fixed copy of the list, which is bound to go out of date, especially since a list that long is likely to change pretty quickly.

Also, in my previous message, I suggested maybe the original poster should add some def_whitelist_from_spf configuration lines. perldoc Mail::SpamAssassin::Plugin::SPF seems to indicate that whitelist_from_spf (no def) would be better.

Sort of on the same subject, is there any kind of network whitelist of domains that both (a) can be trusted to themselves not send out spam and (b) have valid SPF records? SPF strikes me as a useful way of authenticating that messages were not forged. But a spammer could get a server, register a domain, and register valid SPF records. So you need both (a) and (b) to be sure a message isn't spam. With a whitelist of domains that use SPF and don't themselves send spam, you could give a huge negative score to messages from that domain. A distributed database would make it possible to make this list pretty extensive (but never, of course, exhaustive). Of course, the data in the distributed database would have to be trustworthy...

- Logan
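The "grabbing the IP addresses" step above can be sketched in Python. This is only an illustration: the `SPF_RECORDS` dictionary is a hand-copied subset of the TXT records quoted in the message (a real tool would query DNS), and `collect_ip4` is a hypothetical helper name, not part of SpamAssassin or any SPF library. It only understands bare `ip4:` and `include:` terms.

```python
import ipaddress

# Offline copy of two of the TXT records quoted above (an assumption for
# the sake of a self-contained example; a real tool would look them up).
SPF_RECORDS = {
    "p._spf.ebay.com": (
        "v=spf1 ip4:67.72.99.26 ip4:194.64.234.129/27 "
        "include:p2._spf.ebay.com ~all"
    ),
    "p2._spf.ebay.com": (
        "v=spf1 ip4:65.110.161.77 ip4:204.13.11.49 ip4:204.13.11.51 ~all"
    ),
}


def collect_ip4(name, records, seen=None):
    """Return the set of ip4 networks for `name`, following include: terms."""
    seen = set() if seen is None else seen
    if name in seen or name not in records:
        return set()  # avoid include loops and unknown names
    seen.add(name)
    networks = set()
    for term in records[name].split():
        if term.startswith("ip4:"):
            # strict=False accepts entries such as 194.64.234.129/27,
            # whose host bits are set, and normalizes them to the network.
            networks.add(ipaddress.ip_network(term[4:], strict=False))
        elif term.startswith("include:"):
            networks |= collect_ip4(term[8:], records, seen)
    return networks


for net in sorted(collect_ip4("p._spf.ebay.com", SPF_RECORDS)):
    print(net)
```

Storing the results as network objects rather than strings makes it straightforward to test whether a given relay address falls inside one of the published ranges.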
Re: URIBL and SURBL no longer hitting
On Tue, 8 Aug 2006, DAve wrote:

> Dhawal Doshy wrote:
>> Dave, you might need to update the 'root/servers/@' file. IIRC, a couple of root servers have changed in the past few years.
>
> We replace the @ file with one of our own on every server. It contains just our dns servers and our own caches.

Silly question, and veering off topic, but if you take away the list of root servers, how do your nameservers find things? If you want to find a node in a tree (or in a directed acyclic graph) it helps to start at the root (or roots). If your local DNS server doesn't have any way of finding the root, how can it find the nodes it needs to find?

I suppose it's possible your organization's DNS servers and caches are giving authoritative responses for the . domain. Is that what you're saying?

- Logan
Re: URIBL and SURBL no longer hitting
Logan Shaw wrote:

> On Tue, 8 Aug 2006, DAve wrote:
>> Dhawal Doshy wrote:
>>> Dave, you might need to update the 'root/servers/@' file. IIRC, a couple of root servers have changed in the past few years.
>>
>> We replace the @ file with one of our own on every server. It contains just our dns servers and our own caches.
>
> Silly question, and veering off topic, but if you take away the list of root servers, how do your nameservers find things? If you want to find a node in a tree (or in a directed acyclic graph) it helps to start at the root (or roots). If your local DNS server doesn't have any way of finding the root, how can it find the nodes it needs to find?
>
> I suppose it's possible your organization's DNS servers and caches are giving authoritative responses for the . domain. Is that what you're saying?
>
> - Logan

It depends on why you are using dnscache. I am talking about running dnscache only for certain services on the box such as SA URIDNSBL lookups, Webalizer lookup on Apache logs, RBL checks at SMTP connect, etc. I simply want to retain and reuse the results of querying my own DNS servers without making a network connection outside my PIX (mailScanners inside, DNS servers outside).

Do you have the list of root servers in your mail server's /etc/resolv.conf?

DAve

-- 
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: Looking for a good Ebay whitelist
On Tue, 8 Aug 2006, Rob McEwen wrote:

> The following are what I have deemed as frequently used official e-bay smtp servers. This list might be used for whitelisting or/and negative scoring:

Seems like ebay is signing messages with DomainKeys, I'm getting DK_VERIFIED in my log for mail from [EMAIL PROTECTED] and [EMAIL PROTECTED] and similar.

The Mail::SpamAssassin::Plugin::DKIM offers whitelist_from_dkim, it should not be difficult to port it to Mail::SpamAssassin::Plugin::DomainKeys, making a whitelist_from_dk. Anyone?

Btw, the following patch is needed for Mail::DomainKeys 0.82 (the author has been notified):

    --- DomainKeys/Signature.pm~  Wed Jun 21 06:25:26 2006
    +++ DomainKeys/Signature.pm   Thu Aug  3 20:01:52 2006
    @@ -46,5 +46,5 @@
             /^d=([A-Za-z0-9\-\.]+)$/ and
                 $self->{'DOMN'} = lc $1;
    -        /^h=(\S+)$/ and
    +        /^h=(.*)$/s and
                 $self->{'HDRS'} = lc $1;
             /^q=(dns)$/i and
    @@ -269,5 +269,5 @@
         if (wantarray and $self->{'HDRS'}) {
    -        my @list = split /:/, $self->{'HDRS'};
    +        my @list = split /[ \t]*:[ \t]*/, $self->{'HDRS'};
             return @list;
         }

Mark
Re: Broken images in mails
On Tuesday 08 August 2006 01:51, decoder wrote:

> But I can view it perfectly. Does anyone know what this could be caused by and a tool which can reliably convert these to pnm? Another question that I would have in mind is, if that was intended to happen...
>
> Best regards
> Chris

Are you sure it's perfect? I've seen many of these where they are intentionally corrupting the last portion (bottom edge) of the image so as to avoid simple size or hashing techniques. The ones I saw were the same image visually, but the bottom edge was intentionally corrupted beginning at different offsets.

-- 
John Andersen
Re: Looking for a good Ebay whitelist
Today (08.08.2006, 18:52), Mark Martinec wrote:

> On Tue, 8 Aug 2006, Rob McEwen wrote:
>> The following are what I have deemed as frequently used official e-bay smtp servers. This list might be used for whitelisting or/and negative scoring:
>
> Seems like ebay is signing messages with DomainKeys

Mark, yes, really.

--snip
DomainKey-Signature: a=rsa-sha1; s=dk; d=ebay.de; c=nofws; q=dns;
  h=message-id:from:to:subject:mime-version:content-type:
  content-transfer-encoding:x-ebay-mailtracker;
  b=XVo26T4Vu0KfRwpbRa928JXSTP1INRdhJfnZm+zZjO/+eF0EsA1ep22j79xsDvxno
  r4rw2VzlTxgQppQrOC19TFr3M3MY/3iRmO7UnyQtj0oImISsFBxNrfv9WgzNlENPBHs
  HeQ+u8oAp31/6PbDsaH6Ne4ABnAbr+7TFaOnW5A=
--snap

-- 
Kind regards, Jim Knuth [EMAIL PROTECTED] ICQ #277289867
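The eBay signature quoted above shows why the two changes in Mark's patch matter: its h= tag is folded, so there is whitespace after one of the colons. The following Python sketch mirrors the before/after regular expressions from the patch on that value. It is an illustration of the parsing problem only, not the Mail::DomainKeys code itself, and the variable names are made up for the example.

```python
import re

# The h= value from the quoted signature; header folding left a space
# after "content-type:".
h_value = (
    "message-id:from:to:subject:mime-version:content-type: "
    "content-transfer-encoding:x-ebay-mailtracker"
)
tag = "h=" + h_value

# Before the patch: \S+ cannot span the embedded whitespace, so the
# whole tag fails to match and the header list is never stored.
old_match = re.fullmatch(r"h=(\S+)", tag)

# After the patch: .* with "dot matches newline" accepts folded values.
new_match = re.fullmatch(r"h=(.*)", tag, flags=re.DOTALL)

# Before the patch: a plain split on ":" keeps the stray leading space.
naive_headers = h_value.split(":")

# After the patch: whitespace around each colon is swallowed by the split.
clean_headers = re.split(r"[ \t]*:[ \t]*", h_value)

print(old_match is None)
print(naive_headers[6] == " content-transfer-encoding")
print(clean_headers[6] == "content-transfer-encoding")
```

With the stray space left in place, the header name would not match the real Content-Transfer-Encoding header, so verification of an otherwise valid signature could fail.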
updates.spamassassin.org.cf overrides local.cf?
I noticed that if I have updates_spamassassin_org.cf in place in my rules dir, my local.cf rule changes are set back to default. I tried to post, but that doesn't seem to be an option today. If anyone is interested it's here.

http://pixelhammer.com/local-cf.txt

DAve
RE: updates.spamassassin.org.cf overrides local.cf?
> I noticed that if I have updates_spamassassin_org.cf in place in my rules dir, my local.cf rule changes are set back to default. I tried to post, but that doesn't seem to be an option today. If anyone is interested it's here.
>
> http://pixelhammer.com/local-cf.txt

Actually, I can understand that if you're using your site config folder for your updates folder. CF files get read in alpha order so it reads local.cf, then updates_spamassassin_org.cf, which then re-includes all the rule files in the update.

Updates get put in /var/spamassassin here by default. If you really want them with your site config, then how about a subfolder so the files don't get read in the wrong order?

Bret
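Bret's point about read order can be modelled with a toy Python sketch. The assumptions are loud ones: real SpamAssassin configuration loading is more involved than this, the score values are invented for illustration, and `effective_scores` is a hypothetical helper. The only behaviour modelled is "files in a directory are read alphabetically, and a later score line replaces an earlier one".

```python
def effective_scores(directories):
    """Apply 'score NAME VALUE' lines; a later assignment replaces an earlier one."""
    scores = {}
    for files in directories:        # directories are read in the order given
        for name in sorted(files):   # files inside a directory: alphabetical
            for line in files[name]:
                parts = line.split()
                if len(parts) == 3 and parts[0] == "score":
                    scores[parts[1]] = float(parts[2])
    return scores


# Invented values: 2.0 stands in for a shipped default, 0.8 for a local override.
UPDATES = {"updates_spamassassin_org.cf": ["score SUBJ_HAS_UNIQ_ID 2.0"]}
LOCAL = {"local.cf": ["score SUBJ_HAS_UNIQ_ID 0.8"]}

# Everything in one directory: local.cf sorts before updates_..., so it loses.
mixed = effective_scores([{**LOCAL, **UPDATES}])

# Updates kept in their own directory and read first: local.cf wins.
separate = effective_scores([UPDATES, LOCAL])

print(mixed["SUBJ_HAS_UNIQ_ID"])     # 2.0
print(separate["SUBJ_HAS_UNIQ_ID"])  # 0.8
```

This is why keeping the update channel outside the site config directory, or at least in a subfolder, restores the expected precedence of local.cf.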
RE: updates.spamassassin.org.cf overrides local.cf?
DAve wrote:

> I noticed that if I have updates_spamassassin_org.cf in place in my rules dir, my local.cf rule changes are set back to default. I tried to post, but that doesn't seem to be an option today. If anyone is interested it's here.
>
> http://pixelhammer.com/local-cf.txt

updates_spamassassin_org.cf should not be in your local rules directory. It should be in /var/lib/spamassassin/3.001003 (or whichever version). Put it back where it's supposed to be and the problem will go away.

-- 
Bowie
Blocking based on ALL IPs in the header
Just thought y'all would be interested to know that I just spent about 45 minutes trying to convince an I.T. guy at one of the largest regional banks in my area that a spam filter should ONLY check the IP address of the sending mail server against RBLs, NOT every single IP contained within the header.

I told him that often, dynamically assigned IPs will show up in blacklists even if they've never sent spam, and I explained that on any given day, a person's own computer can get reassigned a blacklisted IP which was previously used by a spammer or by a worm-infected computer, even if that computer has never had a worm and the user has never sent a spam.

I also explained that he doesn't have to worry about what might happen if he didn't check the other IPs in the header, because if that person's computer were spewing out spam, he'd still be able to block it if any happened to head his way.

My client who couldn't send to this bank uses **my** server for sending mail and they are only allowed to do so based on authentication. But the messages are getting blocked because that bank's spam filter is checking every IP in the header and my client's IP is blacklisted.

Unbelievable.

Rob McEwen
PowerView Systems
[EMAIL PROTECTED]
(478) 475-9032
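The difference between the two policies Rob describes can be sketched in a few lines of Python. Everything here is hypothetical: the addresses come from documentation ranges, `dnsbl.example.org` is a placeholder zone, and the `listed` set stands in for real DNS lookups. The reversed-octet query name is the usual DNSBL convention.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the usual DNSBL lookup name: reversed octets plus the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def ips_to_check(received_ips, last_hop_only=True):
    """received_ips is ordered newest hop first (the relay that connected to us)."""
    return received_ips[:1] if last_hop_only else list(received_ips)


# Newest first: the authenticated relay, then the client's dynamic address.
hops = ["192.0.2.10", "198.51.100.23"]
listed = {"198.51.100.23"}  # stand-in for "this address is on some blacklist"

blocked_sane = any(ip in listed for ip in ips_to_check(hops))
blocked_bank = any(ip in listed for ip in ips_to_check(hops, last_hop_only=False))

print(dnsbl_query_name(hops[0], "dnsbl.example.org"))
print(blocked_sane, blocked_bank)
```

Checking only the connecting relay lets the authenticated mail through, while checking every hop rejects it because of an address the sender was merely assigned.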
Re: Looking for a good Ebay whitelist
At 09:52 08-08-2006, Mark Martinec wrote: Seems like ebay is signing messages with DomainKeys, I'm getting DK_VERIFIED in my log for mail from [EMAIL PROTECTED] and [EMAIL PROTECTED] and similar. Ebay.com and a few other high profile domains have been signing their mail with DK. Note that they still have the testing flag set. Regards, -sm
Re: updates.spamassassin.org.cf overrides local.cf?
Bowie Bailey wrote:

> DAve wrote:
>> I noticed that if I have updates_spamassassin_org.cf in place in my rules dir, my local.cf rule changes are set back to default. I tried to post, but that doesn't seem to be an option today. If anyone is interested it's here.
>>
>> http://pixelhammer.com/local-cf.txt
>
> updates_spamassassin_org.cf should not be in your local rules directory. It should be in /var/lib/spamassassin/3.001003 (or whichever version). Put it back where it's supposed to be and the problem will go away.

Yep, I suspected as much. Now I have SA in three places, four if you count plugins.

    /usr/local/etc/mail/spamassassin
    /usr/local/share/spamassassin
    /var/lib/spamassassin/3.001001/updates_spamassassin_org
    /usr/local/lib/perl5/site_perl/5.8.8/Mail/SpamAssassin/Plugin

I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer).
Re: updates.spamassassin.org.cf overrides local.cf?
On Tue, Aug 08, 2006 at 03:48:49PM -0400, DAve wrote:

> Yep, I suspected as much. Now I have SA in three places, four if you count plugins.

More if you count the modules and the commandline tools.

> I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer).

Well -- you could if you wanted to, but then you have to do some work to deal with it. You can't expect a tool which works one way to do something else without doing anything.

-- 
Randomly Generated Tagline: We may not have got everything right, but at least we knew the century was going to end. - Douglas Adams
RE: updates.spamassassin.org.cf overrides local.cf?
Yep, I suspected as much. Now I have SA in three places, four if you count plugins. More if you count the modules and the commandline tools. I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer). Well -- you could if you wanted to, but then you have to do some work to deal with it. You can't expect a tool which works one way to do something else without doing anything. You could, if you set the updates directory to /usr/share/etc/mail/spamassassin/updates or something like that to keep it in an updates folder under your site config... I'm not exactly sure what the thinking was in moving the updates to /var/lib instead of keeping them with /usr/share with the original rules. I wonder why sa-update doesn't just create a version folder under /share/spamassassin and use that... Certainly would be less to keep track of and purge when you install a new version. Bret
problems, problems
Hello,

I was kind of shocked when I discovered that there is no SpamAssassin manual or tutorial. For me, it's unimaginable that the world's leading open source spam detection software is missing such an important piece of documentation. The wiki pages are more bits and pieces than coherent documentation, and they often don't explain things in principle but give you finished configuration files for procmail & Co. But what if I don't use procmail? (I use Courier maildrop.)

At the moment, I run spamassassin with no arguments as an ordinary user on every message I receive and decide what to do with the message according to the X-Spam-Flag: header line. But I have some problems with this.

First, SpamAssassin seems to do autolearning. What does this mean? Does it learn that messages which it already considers spam are spam, and messages which it already considers ham are ham? Wouldn't this mean that SpamAssassin is just doing self-affirmation?

Second, I often have a message of the following form in my mail log:

    courierlocal: […] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?

I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.

Thanks for your help.

Best wishes, Wolfgang
modifications done by Courier MTA confusing SpamAssassin?
Hello, I use Courier MTA. Courier MTA replaces certain mailformed mails with mails which contain some explaination and the original mail as an attachment. The attachment of the mail you're just reading contains such a mail produced by Courier MTA. Do those modifications done by Courier MTA confuse SpamAssassin's spam detection algorithm? Does SpamAssassin look at attachments at all? If yes, are they taken as seriously as message bodies? And what about training the bayesian filter? Should I feed such Courier-MTA-modified mails to sa-learn or should I better not do this? Best wishes, Wolfgang ---BeginMessage--- CORRUPTED MESSAGE This is the Courier Mail Server 0.47 on v791.vanager.de. I received the following message for delivery to your address. This message contains several internal formatting errors. This is often caused by viruses that attempt to infect remote systems. Instead of blocking this message, it has been converted as a safe, text-only attachment that can be safely read with a text editor. This sometimes also happens when the sender's mail software has a bug that creates improperly-formatted messages. Although these kinds of formatting errors may often be ignored by other mail servers, this server detects and intercepts improperly-coded messages in order to prevent viruses from taking advantage of bugs in E-mail programs: - The headers in this message contain improperly-formatted binary content. See URL:ftp://ftp.isi.edu/in-notes/rfc2047.txt for more information. - Received: from 85.119.157.121 (softdnserr [:::58.121.220.188]) by v791.vanager.de with esmtp; Fri, 24 Mar 2006 15:32:31 +0100 id 01FB8004.442402FF.29A0 Received: from [72.199.47.228] by 85.119.157.121 with ESMTP id 98D4CC88F5D; Fri, 24 Mar 2006 17:31:41 +0300 Message-ID: [EMAIL PROTECTED] From: ¾È½Éµå¶óÀ̺ê [EMAIL PROTECTED] Reply-To: ¾È½Éµå¶óÀ̺ê [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: ÀÌÁ¨ À̵¿Ä«¸Þ¶ó ´Ü¼Ó °ÆÁ¤¿¡¼ ¹þ¾î ³ª¼¼¿ä~! 
Date: Fri, 24 Mar 06 17:31:41 GMT X-Mailer: Microsoft Outlook Express 5.00.2615.200 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary=FA8E.C62592.35EAD.C. X-Priority: 3 X-MSMail-Priority: Normal --FA8E.C62592.35EAD.C. Content-Type: text/plain; Content-Transfer-Encoding: quoted-printable table border=3D0 cellpadding=3D0 cellspacing=3D0 oncontextmenu=3Dr= eturn false ondragstart=3Dreturn false onselectstart=3Dreturn false a= lign=3Dcenter trtdA HREF=3Dhttp://%69n%6fm%61.%69%62%62%75%6e%2e%6f%72= %67/sensor/?pcode=3DicdilnsIMG style=3DCURSOR: hand alt=3D src=3Dh= ttp://%20%77w%77.%77h%6f%6e%69%2e%62%69%7a/prod_img/sensor/images/mail.jpg= border=3D0/A/td/tr/table TABLE border=3D0 cellpadding=3D0 cellspacing=3D0 align=3Dcenter= TRTD A href=3Dhttp://%69n%6fm%61.%69%62%62%75%6e%2e%6f%72%67/common2/mail_lis= t.htmlIMG style=3DCURSOR: hand alt=3D src=3Dhttp://%20= %77w%77.%77h%6f%6e%69%2e%62%69%7a/prod_img/common2/images/reject.gif bord= er=3D0/A/TD/TR/TABLE --FA8E.C62592.35EAD.C.-- ---End Message---
Re: modifications done by Courier MTA confusing SpamAssassin?
Wolfgang Jeltsch wrote:

> Hello, I use Courier MTA. Courier MTA replaces certain malformed mails with mails which contain some explanation and the original mail as an attachment. The attachment of the mail you're just reading contains such a mail produced by Courier MTA.
>
> Do those modifications done by Courier MTA confuse SpamAssassin's spam detection algorithm? Does SpamAssassin look at attachments at all? If yes, are they taken as seriously as message bodies? And what about training the bayesian filter? Should I feed such Courier-MTA-modified mails to sa-learn or should I better not do this?
>
> Best wishes, Wolfgang

Even your email with attachments made SA barf. I don't think sa-learn will help.

-- 
Michael Scheidell, CTO
SECNAP Network Security / www.secnap.com
[EMAIL PROTECTED] / 1+561-999-5000, x 1131
Re: Improved OCR Plugin with approximate matching
decoder wrote:

> Hello there,
>
> I have improved the original OcrPlugin (found at http://wiki.apache.org/spamassassin/OcrPlugin), so it contains fuzzy matching. That way, mistakes made by the OCR recognition or intentional obfuscations in the text don't make the recognition impossible. This is done with a relative distance calculation between the pattern (a word from a given word list) and a line in the recognized input. Also, the plugin uses dynamic scoring (more matched words means a higher score; this can be adjusted in the source).
>
> You can find a full description and an example in the wiki under: http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>
> Ideas for improvements or criticism are always welcome :)
>
> Best regards, Chris

See http://wiki.apache.org/spamassassin/FuzzyOcrPlugin

Major changes: Replaced imagemagick with netpbm, support png, invoked giffix for broken gifs, detect image format with magic bytes and not by content-type, added various configuration options.

Feedback is welcome :)

Chris
Re: modifications done by Courier MTA confusing SpamAssassin?
On Tuesday, 8 August 2006 22:51, Michael Scheidell wrote:

> Wolfgang Jeltsch wrote:
>> Hello, I use Courier MTA. Courier MTA replaces certain malformed mails with mails which contain some explanation and the original mail as an attachment. The attachment of the mail you're just reading contains such a mail produced by Courier MTA.
>>
>> Do those modifications done by Courier MTA confuse SpamAssassin's spam detection algorithm? Does SpamAssassin look at attachments at all? If yes, are they taken as seriously as message bodies? And what about training the bayesian filter? Should I feed such Courier-MTA-modified mails to sa-learn or should I better not do this?
>>
>> Best wishes, Wolfgang
>
> Even your email with attachments made SA barf. I don't think sa-learn will help.

Could you please elaborate a bit?

Best wishes, Wolfgang
RE: modifications done by Courier MTA confusing SpamAssassin?
Wolfgang Jeltsch wrote:

> Hello, I use Courier MTA. Courier MTA replaces certain malformed mails with mails which contain some explanation and the original mail as an attachment. The attachment of the mail you're just reading contains such a mail produced by Courier MTA.
>
> Do those modifications done by Courier MTA confuse SpamAssassin's spam detection algorithm? Does SpamAssassin look at attachments at all? If yes, are they taken as seriously as message bodies? And what about training the bayesian filter? Should I feed such Courier-MTA-modified mails to sa-learn or should I better not do this?

Yes, those modified emails will confuse SA. They will also confuse your users. The best option is to tell Courier to leave the emails alone. In your /etc/courier/bofh file, add this line:

    opt BOFHBADMIME=accept

I use Courier as well and SA works great for me. The main thing you will want to do is start up the spamd daemon and use spamc instead of spamassassin in maildrop.

I think the main reason there is no SpamAssassin manual is that there are so many ways to use it. SpamAssassin is a fairly simple program. The hard part is usually making it work with your mail system. There is a book out (and I'm sure the author will speak up before too long).

If you have any specific questions about interfacing SA with Courier, I'll be glad to help out.

-- 
Bowie
subject was meant to be new version, please test ;) -nt-
decoder wrote:

> decoder wrote:
>> Hello there,
>>
>> I have improved the original OcrPlugin (found at http://wiki.apache.org/spamassassin/OcrPlugin), so it contains fuzzy matching. That way, mistakes made by the OCR recognition or intentional obfuscations in the text don't make the recognition impossible. This is done with a relative distance calculation between the pattern (a word from a given word list) and a line in the recognized input. Also, the plugin uses dynamic scoring (more matched words means a higher score; this can be adjusted in the source).
>>
>> You can find a full description and an example in the wiki under: http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>>
>> Ideas for improvements or criticism are always welcome :)
>>
>> Best regards, Chris
>
> See http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>
> Major changes: Replaced imagemagick with netpbm, support png, invoked giffix for broken gifs, detect image format with magic bytes and not by content-type, added various configuration options.
>
> Feedback is welcome :)
>
> Chris
Re: Looking for a good Ebay whitelist
SARE maintains a whitelist. I don't know if those particular sites are on it or not. If you can provide the appropriate info for a whitelist_from_recvd line they could probably be added. Loren
RE: Bayes DB version issue 3.1.3 = 3.1.4
Okay, I have a little more information now. I ran the same command that SQL.pm would run, and it appears to be a collation issue. Can we force the collation with 3.1.4 to a specific type? In my case the database is in latin1 because 3.1.3 choked on UTF8. This was on RHEL4 (which defaults to UTF8). The kernel is 2.6.9. I'm trying to get this to run on rPath Linux, which is on 2.6.16. I would suspect that they have implemented more libraries in UTF8 now than back on kernel 2.6.9.

Anyway, here is the command I issued to catch this point:

    echo "SELECT value FROM bayes_global_vars WHERE variable = 'VERSION';" | mysql -u user -D database -h 10.0.13.13 -ppassword
    ERROR 1267 (HY000) at line 1: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

Any help would be greatly appreciated.

Gary Wayne Smith

-----Original Message-----
From: Gary W. Smith [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 8:06 AM
To: Daryl C. W. O'Shea
Cc: users@spamassassin.apache.org
Subject: RE: Bayes DB version issue 3.1.3 = 3.1.4

Daryl,

Thanks for the info. I will update the .8. As for the database, which is the primary concern, the user account is correct. I have logged into the database from that server using the same credentials from the local.cf file. I had thought that we might have restricted by subnet so I did indeed try that last night.

    [EMAIL PROTECTED] spamassassin]# mysql -u xxx -h xx.xx.xx.xx -D spamassassin -p
    Enter password:
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 6649341 to server version: 4.1.7-log

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

    mysql> show tables;
    +------------------------+
    | Tables_in_spamassassin |
    +------------------------+
    | awl                    |
    | bayes_expire           |
    | bayes_global_vars      |
    | bayes_seen             |
    | bayes_token            |
    | bayes_vars             |
    | userpref               |
    +------------------------+
    7 rows in set (0.00 sec)

    mysql> select * from bayes_global_vars;
    +----------+-------+
    | variable | value |
    +----------+-------+
    | VERSION  | 3     |
    +----------+-------+
    1 row in set (0.00 sec)

    mysql>

-----Original Message-----
From: Daryl C. W. O'Shea [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 12:38 AM
To: Gary W. Smith
Cc: users@spamassassin.apache.org
Subject: Re: Bayes DB version issue 3.1.3 = 3.1.4

On 8/8/2006 3:29 AM, Gary W. Smith wrote:

> Hello, I can't remember smoking crack when copying the config files over but anything's possible. I built out a new machine today and installed SA. We have a list of CPAN modules that were installed (same list as from the 3.1.3 servers). I copied everything in the /etc/mail/spamassassin from our productions servers to the test server and after starting we receive errors. I have checked and the MySQL data instance is accessible from this server. There are also several rules that are errors as well. I know that someone has asked this question already but I didn't find the answer in the thread archive. Here are the contents of the log file:
>
> Aug 7 21:45:59 labtest01c spamd[2693]: config: score: the non-numeric score (.8) is not valid, a numeric score is required
> Aug 7 21:45:59 labtest01c spamd[2693]: config: SpamAssassin failed to parse line, SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 is not valid for score, skipping: score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677

.8 requires a leading zero.

> Aug 7 21:46:01 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.
> Aug 7 21:46:03 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.

SQL server privilege issue?
Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'RAZOR2_CHECK' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DRUGS_ERECTILE has undefined dependency '__DRUGS_ERECTILE7' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SUB_ACCEPT_CCARDS has undefined dependency '__SARE_SUB_FROM_PAYPAL' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SPEC_PROLEO_M2a has dependency 'MIME_QP_LONG_LINE' with a zero score Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_XMAIL_SUSP2' Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency 'SARE_HEAD_XAUTH_WARN' Aug 7
Re: modifications done by Courier MTA confusing SpamAssassin?
On Tuesday, 8 August 2006 23:04, Bowie Bailey wrote:

> Wolfgang Jeltsch wrote:
>> Hello, I use Courier MTA. Courier MTA replaces certain malformed mails with mails which contain some explanation and the original mail as an attachment. The attachment of the mail you're just reading contains such a mail produced by Courier MTA.
>>
>> Do those modifications done by Courier MTA confuse SpamAssassin's spam detection algorithm? Does SpamAssassin look at attachments at all? If yes, are they taken as seriously as message bodies? And what about training the bayesian filter? Should I feed such Courier-MTA-modified mails to sa-learn or should I better not do this?
>
> Yes, those modified emails will confuse SA. They will also confuse your users. The best option is to tell Courier to leave the emails alone. In your /etc/courier/bofh file, add this line:
>
>     opt BOFHBADMIME=accept

Thanks for this tip. I didn't know that it's possible to stop Courier MTA rewriting those mails.

> I use Courier as well and SA works great for me. The main thing you will want to do is start up the spamd daemon and use spamc instead of spamassassin in maildrop.

I decided against this but have forgotten why I did so. Maybe because of security issues. Since my server serves very few users, I see no resource problems in using spamassassin instead of spamc/spamd. But could using spamc/spamd resolve the locking problem I described?

> [...]
>
> If you have any specific questions about interfacing SA with Courier, I'll be glad to help out.

Thanks a lot!

Best wishes, Wolfgang
RE: updates.spamassassin.org.cf overrides local.cf?
On Tue, 8 Aug 2006, Bret Miller wrote:

> I'm not exactly sure what the thinking was in moving the updates to /var/lib instead of keeping them with /usr/share with the original rules. I wonder why sa-update doesn't just create a version folder under /share/spamassassin and use that...

Because it's bad form, at least according to some people, to write mutable data into the software install directory.

This isn't as common as it used to be, but at one time, some people would mount /usr as a separate filesystem and make it read-only. Since all the data was written into /var, there was no reason for /usr to be read-write if you weren't installing or updating software.

Similarly, some sites have a great big shared /usr/local that is NFS-mounted across many machines. The clients aren't supposed to write there because they don't own it and could potentially screw it up for other systems sharing the same /usr/local. In fact, in some cases, /usr/local can be mounted from any one of several servers and updated copies are pushed from the master /usr/local to the other servers' /usr/local.

The general idea is that software and its data should be separated so that you can run multiple instances of the software without having multiple copies of the constant parts of it, or so that you can put one in a read-only location and the other in a read-write location.

Now, is this directly applicable to SpamAssassin? Well, on a dedicated mail server that uses SpamAssassin as a plugin, it's probably not. You are going to have exactly one set of rules and exactly one SA install. But one could envision several users on a machine sharing the same install of it. Maybe the users all fetch mail from other servers and run their own instances of SpamAssassin through procmail or something.

Also, even if it isn't relevant for SpamAssassin in particular, it's part of the idiom for installing software on Unix in general, so it might be logical to install it that way just for the sake of consistency.
- Logan
Re: problems, problems
On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:

> I was kind of shocked when I discovered that there is no SpamAssassin manual or tutorial. For me, it's unimaginable that the world's leading open source spam detection software is missing such an important piece of documentation.

Well, it's not entirely true that there isn't a manual. The various components do have manuals. Here are the most commonly useful ones:

    perldoc Mail::SpamAssassin
    perldoc Mail::SpamAssassin::Conf
    man spamassassin
    man sa-learn
    man sa-update

And some other ones:

    perldoc Mail::SpamAssassin::Plugin
    perldoc Mail::SpamAssassin::Bayes
    perldoc Mail::SpamAssassin::BayesStore
    perldoc Mail::SpamAssassin::Plugin::Hashcash

Not all the modules that should have documentation do have documentation (for instance, Mail::SpamAssassin::BayesStore::DBM doesn't have any), but there is at least some information. You can root around in the Mail/SpamAssassin directory (it should be somewhere inside your site_perl directory) to find more modules that might have documentation. There may be a more elegant way, but this is one way of seeing a list of modules which have documentation:

    cd ./site_perl/./Mail/SpamAssassin
    find . -name '*.pm' -print | xargs grep -l '^=head'

> The wiki pages are more bits and pieces than a coherent documentation and often don't explain things in principle but give you finished configuration files for procmail & Co. But what if I don't use procmail?

Well, SpamAssassin doesn't deliver mail, so this question, which is about delivery methods, isn't really relevant.

> First, SpamAssassin seems to do autolearning. What does this mean? Does it learn that messages which it already considers spam are spam, and messages which it already considers ham are ham? Wouldn't this mean that SpamAssassin is just doing self-affirmation?

The Bayes database needs to be fed training data in order to be effective. It needs to see several (preferably hundreds and hundreds) of known spam and known ham messages. sa-learn is the command that is used to do this manually. Autolearning means doing the same thing as sa-learn, but automatically. Basically, the other rules work well enough that they can identify obvious spam and ham. Those messages can be used to train the Bayes database.

> Second, I often have a message of the following form in my mail log:
>
>     courierlocal: […] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>
> What's the problem here, and how can I get rid of it?

Without any more information than that, I would say that either something is still using the Bayes database in your home directory, or it has finished but the lock file hasn't been removed. I haven't tried using SpamAssassin with anything from Courier, so I'm not really familiar with how it's normally invoked.

- Logan
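The manual training described above could be scripted roughly like this. It is only a sketch: the function name train_bayes and the SA_LEARN variable are made up for this example, while --spam, --ham and --dump magic are documented sa-learn options.

```shell
#!/bin/sh
# Sketch only: train_bayes and SA_LEARN are hypothetical names.
# SA_LEARN lets you point at a different sa-learn binary (or a stub).
: "${SA_LEARN:=sa-learn}"

# Usage: train_bayes SPAM_DIR HAM_DIR
# Feeds hand-sorted spam and ham to the Bayes database, then prints
# the database counters so you can see how much has been learned.
train_bayes() {
    "$SA_LEARN" --spam "$1" &&
    "$SA_LEARN" --ham "$2" &&
    "$SA_LEARN" --dump magic
}
```

Run it as the same user that runs spamassassin, so that the same ~/.spamassassin/bayes_* files are trained, for example `train_bayes ~/sorted/spam ~/sorted/ham` (both directory names are placeholders).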
Re: Looking for a good Ebay whitelist
From: Logan Shaw [EMAIL PROTECTED]

> On Tue, 8 Aug 2006, wrote:
>
>> I have been having FPs from Ebay in AU and DE, as well as [EMAIL PROTECTED] Does anybody have a good whitelist for these?
>
> Because so many people try to forge messages from eBay but what comes from their own servers is almost definitely not spam, eBay seems like an ideal example of an organization that could benefit from SPF. And sure enough:
>
>     $ host -t TXT ebay.com
>     ebay.com descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all
>     ebay.com descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all
>     $ host -t TXT ebay.com.au
>     ebay.com.au descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all
>     ebay.com.au descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all
>     $ host -t TXT ebay.de
>     ebay.de descriptive text v=spf1 mx include:s._spf.ebay.com include:m._spf.ebay.com include:p._spf.ebay.com include:c._spf.ebay.com ~all
>     ebay.de descriptive text spf2.0/pra mx include:s._sid.ebay.com include:m._sid.ebay.com include:p._sid.ebay.com include:c._sid.ebay.com ~all
>
> So it seems like SPF is probably something good to rely on in this case. I don't fully understand the SPF plug-in, but perhaps all you need to do is add the appropriate ebay domains to new def_whitelist_from_spf rules like the ones in 60_whitelist_spf.cf.
>
> This page:
>
>     http://pages.ebay.com/help/confidence/isgw-account-theft-spoof.html
>
> has a list of eBay's US and international web sites, so presumably the list of valid e-mail domains ([EMAIL PROTECTED], [EMAIL PROTECTED], etc.) can be easily and correctly derived from that list.

SMOMR - Simple Matter Of Meta Rules. If SPF is bad and says it is from ebay add spam points.

{^_^}
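To see at a glance which hosts those TXT records delegate to, the include: mechanisms can be pulled out with standard tools. This is just a sketch; spf_includes is a hypothetical helper name, not part of SpamAssassin or of host.

```shell
#!/bin/sh
# Sketch only: spf_includes is a hypothetical helper name.
# Reads SPF/Sender ID TXT record text on stdin and prints the target of
# every "include:" mechanism, one per line.
spf_includes() {
    tr ' ' '\n' | sed -n 's/^include://p'
}
```

For example, `host -t TXT ebay.com | spf_includes` would list the s/m/p/c `_spf` and `_sid` names shown in the output above.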
Re: problems, problems
man spamassassin is the key to the whole thing beyond the INSTALL files. Then you have things like man Mail::SpamAssassin and its kith and kin like man Mail::SpamAssassin::Conf. These will generally be more up to date than any documentation file that exists. And of course the original man spamassassin results point to some of the others that are important.

{^_^}

----- Original Message -----
From: Wolfgang Jeltsch [EMAIL PROTECTED]

Hello,

I was kind of shocked when I discovered that there is no SpamAssassin manual or tutorial. For me, it's unimaginable that the world's leading open source spam detection software is missing such an important piece of documentation.

The wiki pages are more bits and pieces than a coherent documentation and often don't explain things in principle but give you finished configuration files for procmail & Co. But what if I don't use procmail? (I use Courier maildrop.)

At the moment, I run spamassassin with no arguments as an ordinary user on every message I receive and decide what to do with the message according to the X-Spam-Flag: header line. But I have some problems with this.

First, SpamAssassin seems to do autolearning. What does this mean? Does it learn that messages which it already considers spam are spam, and messages which it already considers ham are ham? Wouldn't this mean that SpamAssassin is just doing self-affirmation?

Second, I often have a message of the following form in my mail log:

    courierlocal: […] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?

I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.

Thanks for your help.

Best wishes,
Wolfgang
Re: Broken images in mails
On Tue, 8 Aug 2006, John Andersen wrote:

> Are you sure it's perfect? I've seen many of these where they are intentionally corrupting the last portion (bottom edge) of the image so as to avoid simple size or hashing techniques. The ones I saw were the same image visually, but the bottom edge was intentionally corrupted beginning at different offsets.

Adding a point for corrupted images is sounding better and better.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 If someone has a gun and is trying to kill you, it would be reasonable to shoot back with your own gun.
 -- the Dalai Lama, May 15, 2001
---
Re: Word Doc spam
From: Ralf Hildebrandt [EMAIL PROTECTED] * Kenneth Porter [EMAIL PROTECTED]: I was surprised to see one of these as well. I'd always thought that it would be nice for the Open Office people to create a simple command-line utility to convert Word files to plain text for spam checking. man antiword No manual entry for antiword {^_-}
RE: Word Doc spam
> From: Ralf Hildebrandt [EMAIL PROTECTED]
>> * Kenneth Porter [EMAIL PROTECTED]:
>>> I was surprised to see one of these as well. I'd always thought that it would be nice for the Open Office people to create a simple command-line utility to convert Word files to plain text for spam checking.
>> man antiword
> No manual entry for antiword {^_-}

FWIW, we're starting to see these .doc attachments come through as well. I guess they figure if we're gonna block images, .doc files are the next best thing?

Bret
Re: Broken images in mails
John D. Hardin wrote:

> On Tue, 8 Aug 2006, John Andersen wrote:
>
>> Are you sure it's perfect? I've seen many of these where they are intentionally corrupting the last portion (bottom edge) of the image so as to avoid simple size or hashing techniques. The ones I saw were the same image visually, but the bottom edge was intentionally corrupted beginning at different offsets.
>
> Adding a point for corrupted images is sounding better and better.

Definitely a good idea... I will try to add this feature in the next release of FuzzyOcr (v.2.1) then.

I am also thinking about scanning all attachments, no matter if the content type specifies image or not (in the current version 2.0, only attachments that have image in their content type are scanned with format auto-detection), because for example Outlook always displays the image, no matter if the content type is what/ever or image/blah... :(

Chris
Some Ebay stats
This is interesting. This is a list of relays with the From field matching '@ebay.' 202.64.65.129.in-addr.arpa domain name pointer gabriel.its.calpoly.edu. 204.64.65.129.in-addr.arpa domain name pointer email-gateway-michael.its.calpoly.edu. 10.193.98.140.in-addr.arpa domain name pointer ruebert.ieee.org. 23.193.98.140.in-addr.arpa domain name pointer engine.ieee.org. 55.1.41.198.in-addr.arpa domain name pointer mx01.nic.name. 56.1.41.198.in-addr.arpa domain name pointer mx02.nic.name. 34.3.41.198.in-addr.arpa domain name pointer mx04.nic.name. 35.3.41.198.in-addr.arpa domain name pointer mx05.nic.name. 199.132.22.203.in-addr.arpa domain name pointer wm-06.dcsi.net.au. 51.11.13.204.in-addr.arpa domain name pointer bellerophon.decipherinc.com. 173.52.190.206.in-addr.arpa domain name pointer smtp104.biz.mail.re2.yahoo.com. 45.66.191.209.in-addr.arpa domain name pointer mailforward101.store.mud.yahoo.com. 47.66.191.209.in-addr.arpa domain name pointer mailforward103.store.mud.yahoo.com. 48.66.191.209.in-addr.arpa domain name pointer mailforward104.store.mud.yahoo.com. 21.98.109.210.in-addr.arpa is an alias for 21.0-255.98.109.210.in-addr.arpa. 21.0-255.98.109.210.in-addr.arpa domain name pointer mail-kr1.bigfoot.com. 228.216.115.211.in-addr.arpa is an alias for 228.0-255.216.115.211.in-addr.arpa. 228.0-255.216.115.211.in-addr.arpa domain name pointer mail-kr.bigfoot.com. 150.180.144.216.in-addr.arpa is an alias for 216.144.180.150.rev.k12system.com. 216.144.180.150.rev.k12system.com domain name pointer boboshrimps.dok.org. 51.145.200.216.in-addr.arpa domain name pointer sitemail.everyone.net. 105.160.220.216.in-addr.arpa domain name pointer dickory.paonline.com. 84.244.33.216.in-addr.arpa domain name pointer sem.ebay.com. 123.161.121.59.in-addr.arpa domain name pointer 59-121-161-123.dynamic.hinet.net. 106.255.123.61.in-addr.arpa domain name pointer 61123255106.cidr.odn.ne.jp. 3.186.209.63.in-addr.arpa domain name pointer mx1.iviewer.com. 
163.171.251.63.in-addr.arpa domain name pointer m1.dnsix.com. 19.14.80.63.in-addr.arpa domain name pointer mailhost.liveworld.com. 202.166.71.64.in-addr.arpa is an alias for 202.subnet192.166.71.64.in-addr.arpa. 202.subnet192.166.71.64.in-addr.arpa domain name pointer spf5.us4.outblaze.com. 138.45.48.65.in-addr.arpa domain name pointer reverse.138.45.48.65.static.ldmi.com. 5.21.134.66.in-addr.arpa domain name pointer h-66-134-21-5.hstqtx02.covad.net. 180.195.135.66.in-addr.arpa domain name pointer data.ebay.com. 11.197.135.66.in-addr.arpa domain name pointer mxpool05.ebay.com. 12.197.135.66.in-addr.arpa domain name pointer mxpool06.ebay.com. 13.197.135.66.in-addr.arpa domain name pointer mxpool07.ebay.com. 14.197.135.66.in-addr.arpa domain name pointer mxpool08.ebay.com. 15.197.135.66.in-addr.arpa domain name pointer mxpool09.ebay.com. 16.197.135.66.in-addr.arpa domain name pointer mxpool10.ebay.com. 17.197.135.66.in-addr.arpa domain name pointer mxpool11.ebay.com. 18.197.135.66.in-addr.arpa domain name pointer mxpool12.ebay.com. 19.197.135.66.in-addr.arpa domain name pointer mxpool13.ebay.com. 20.197.135.66.in-addr.arpa domain name pointer mxpool14.ebay.com. 21.197.135.66.in-addr.arpa domain name pointer mxpool15.ebay.com. 22.197.135.66.in-addr.arpa domain name pointer mxpool16.ebay.com. 23.197.135.66.in-addr.arpa domain name pointer mxpool17.ebay.com. 24.197.135.66.in-addr.arpa domain name pointer mxpool18.ebay.com. 25.197.135.66.in-addr.arpa domain name pointer mxpool19.ebay.com. 26.197.135.66.in-addr.arpa domain name pointer mxpool20.ebay.com. 27.197.135.66.in-addr.arpa domain name pointer mxpool21.ebay.com. 28.197.135.66.in-addr.arpa domain name pointer mxpool22.ebay.com. 29.197.135.66.in-addr.arpa domain name pointer mxpool23.ebay.com. 8.197.135.66.in-addr.arpa domain name pointer mxpool02.ebay.com. 198.209.135.66.in-addr.arpa domain name pointer mxsmfpool01.ebay.com. 199.209.135.66.in-addr.arpa domain name pointer mxsmfpool02.ebay.com. 
200.209.135.66.in-addr.arpa domain name pointer mxsmfpool03.ebay.com. 201.209.135.66.in-addr.arpa domain name pointer mxsmfpool04.ebay.com. 202.209.135.66.in-addr.arpa domain name pointer mxsmfpool05.ebay.com. 203.209.135.66.in-addr.arpa domain name pointer mxsmfpool06.ebay.com. 204.209.135.66.in-addr.arpa domain name pointer mxsmfpool07.ebay.com. 205.209.135.66.in-addr.arpa domain name pointer mxsmfpool08.ebay.com. 206.209.135.66.in-addr.arpa domain name pointer mxsmfpool09.ebay.com. 207.209.135.66.in-addr.arpa domain name pointer mxsmfpool10.ebay.com. 208.209.135.66.in-addr.arpa domain name pointer mxsmfpool11.ebay.com. 209.209.135.66.in-addr.arpa domain name pointer mxsmfpool12.ebay.com. 210.209.135.66.in-addr.arpa domain name pointer mxsmfpool13.ebay.com. 211.209.135.66.in-addr.arpa domain name pointer mxsmfpool14.ebay.com. 212.209.135.66.in-addr.arpa domain name pointer mxsmfpool15.ebay.com. 213.209.135.66.in-addr.arpa domain name pointer mxsmfpool16.ebay.com. 214.209.135.66.in-addr.arpa domain name pointer
Re: Looking for a good Ebay whitelist
On Tue, 8 Aug 2006, jdow wrote:

> From: Logan Shaw [EMAIL PROTECTED]
>
>> On Tue, 8 Aug 2006, wrote:
>>
>>> I have been having FPs from Ebay in AU and DE, as well as [EMAIL PROTECTED] Does anybody have a good whitelist for these?
>>
>> So it seems like SPF is probably something good to rely on in this case. I don't fully understand the SPF plug-in, but perhaps all you need to do is add the appropriate ebay domains to new def_whitelist_from_spf rules like the ones in 60_whitelist_spf.cf.
>
> SMOMR - Simple Matter Of Meta Rules. If SPF is bad and says it is from ebay add spam points.

I thought in this case the problem was false positives. Presumably forged ebay mails are such a nuisance that all kinds of rules (bayes?) hit, causing real ebay mails to get flagged every now and then.

But since all the info about which servers to whitelist is right there in the SPF records that eBay supplies, and since we know that eBay itself doesn't send spam, it seems like some whitelist_from_spf rules (which, as I understand it, whitelist an address or address pattern, but only on condition that it passes SPF checks) should be all that's necessary. Something like this:

    whitelist_from_spf [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
    whitelist_from_spf [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
    whitelist_from_spf [EMAIL PROTECTED] [EMAIL PROTECTED]
    whitelist_from_spf [EMAIL PROTECTED]

Note that [EMAIL PROTECTED] is a judgement call: it's already in 60_whitelist_spf.cf, but there it has a def_whitelist_from_spf rule, which gets a lower score than whitelist_from_spf.

- Logan
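Since the actual addresses are redacted in the archive, here is a hedged sketch of how such lines could be generated for a list of sender domains. The helper name spf_whitelist_lines is hypothetical, and example.com / example.org are placeholders, not eBay's real sending domains; the `*@domain` glob form is the same one whitelist_from accepts.

```shell
#!/bin/sh
# Sketch only: spf_whitelist_lines is a hypothetical helper, and the
# domains you pass in are your own choice (placeholders in the usage note).
# Prints one whitelist_from_spf line per domain given as an argument.
spf_whitelist_lines() {
    for domain in "$@"; do
        printf 'whitelist_from_spf *@%s\n' "$domain"
    done
}
```

The output could be appended to local.cf, for instance `spf_whitelist_lines example.com example.org >> local.cf`, followed by `spamassassin --lint` to check that the configuration still parses.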
RE: problems, problems
> Hello,
>
> I was kind of shocked when I discovered that there is no SpamAssassin manual or tutorial. For me, it's unimaginable that the world's leading open source spam detection software is missing such an important piece of documentation.

http://spamassassin.apache.org/doc.html

There are a large number of ways SpamAssassin can be incorporated into someone's system. Besides what is provided on the SpamAssassin site and the documentation provided with SpamAssassin itself, there are many HOWTOs out there that deal with particular setups. Google is your friend.

> The wiki pages are more bits and pieces than a coherent documentation and often don't explain things in principle but give you finished configuration files for procmail & Co. But what if I don't use procmail? (I use Courier maildrop.)
>
> At the moment, I run spamassassin with no arguments as an ordinary user on every message I receive and decide what to do with the message according to the X-Spam-Flag: header line. But I have some problems with this.
>
> First, SpamAssassin seems to do autolearning. What does this mean? Does it learn that messages which it already considers spam are spam, and messages which it already considers ham are ham? Wouldn't this mean that SpamAssassin is just doing self-affirmation?

Bayes builds a database of the tokens in obvious spam, and in obvious ham. When a message is received, its tokens are compared to the database to help push the score one way or the other (or not). It's not self-affirmation because Bayes itself does not influence whether something is autolearned or not. The Bayes score tweak happens afterwards. It's more akin to learning from experience.

> Second, I often have a message of the following form in my mail log:
>
>     courierlocal: […] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>
> What's the problem here, and how can I get rid of it?

I would first try setting

    lock_method flock

in local.cf and if that does not help, try

    bayes_learn_to_journal 1

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html#learning_options

Better yet, move Bayes to MySQL. This HOWTO is geared towards amavisd-new, but could be used for any other user and would be good for site-wide use; simply substitute the user name:

http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html

> I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.
>
> Thanks for your help.
>
> Best wishes,
> Wolfgang

Gary V
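The two settings suggested above can be added by hand, or with a small helper like the following sketch. ensure_setting is a hypothetical name, and the path to local.cf depends on the installation, so it is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Sketch only: ensure_setting is a hypothetical helper name.
# Usage: ensure_setting FILE KEY VALUE
# Appends "KEY VALUE" to a SpamAssassin config file unless an active
# (uncommented) line for KEY is already present.
ensure_setting() {
    file=$1 key=$2 value=$3
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]" "$file" 2>/dev/null; then
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `ensure_setting local.cf lock_method flock` first, and only if the lock errors persist, `ensure_setting local.cf bayes_learn_to_journal 1`. Running `spamassassin --lint` afterwards checks that the file still parses.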
Re: updates.spamassassin.org.cf overrides local.cf?
Bret Miller wrote: Yep, I suspected as much. Now I have SA in three places, four if you count plugins. More if you count the modules and the commandline tools. I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer). Well -- you could if you wanted to, but then you have to do some work to deal with it. You can't expect a tool which works one way to do something else without doing anything. You could, if you set the updates directory to /usr/share/etc/mail/spamassassin/updates or something like that to keep it in an updates folder under your site config... I'm not exactly sure what the thinking was in moving the updates to /var/lib instead of keeping them with /usr/share with the original rules. I wonder why sa-update doesn't just create a version folder under /share/spamassassin and use that... Certainly would be less to keep track of and purge when you install a new version. Bret I opened a bug for this, #5036 with a severity of minor. Only because sa-update offers to allow a different install path for updates and provides a flag to do so. I don't think it needs fixed, but a warning that I was acting like a luser using my site rules dir would have been nice. DAve -- Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
RE: HTML-tests good or bad?
| From: jdow [mailto:[EMAIL PROTECTED] | From: Chris Santerre [EMAIL PROTECTED] | | ... | | --Chris | | (If I spelt everything correct.I'm sorry.) | ^What's this spelt stuff? It sounds nasty. It's what's left over from making beer - I've seen it used to make bread. Pretty tasty, actually.
Re: updates.spamassassin.org.cf overrides local.cf?
Bret Miller wrote: Yep, I suspected as much. Now I have SA in three places, four if you count plugins. More if you count the modules and the commandline tools. I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer). Well -- you could if you wanted to, but then you have to do some work to deal with it. You can't expect a tool which works one way to do something else without doing anything. You could, if you set the updates directory to /usr/share/etc/mail/spamassassin/updates or something like that to keep it in an updates folder under your site config... I'm not exactly sure what the thinking was in moving the updates to /var/lib instead of keeping them with /usr/share with the original rules. I wonder why sa-update doesn't just create a version folder under /share/spamassassin and use that... Certainly would be less to keep track of and purge when you install a new version. Updates are variable... they go in /var. Anywhere else wouldn't be following FHS. http://www.pathname.com/fhs/ Daryl
Re: Word Doc spam
From: Ralf Hildebrandt [EMAIL PROTECTED] man antiword No manual entry for antiword Looks really useful and straightforward, thanks Ralf! In the FreeBSD ports collection it comes under: textproc/antiword or fetch it from its home site: http://www.winfield.demon.nl/ Mark
RE: Bayes DB version issue 3.1.3 = 3.1.4
I've created a new database in UTF8 format. I will see how this works out. I might try to copy the data from the Latin database to the UTF8 database, but in past experience this hasn't worked that great. I might also make a backup as well and try that.

-----Original Message-----
From: Gary W. Smith [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 2:23 PM
To: users@spamassassin.apache.org
Subject: RE: Bayes DB version issue 3.1.3 = 3.1.4

Okay, I have a little more information now. I ran the same command that SQL.pm would run. It appears to be a collation issue. Can we force the collation with 3.1.4 to a specific type? In my case the database is in latin because 3.1.3 choked on UTF8. This was on RHEL4 (which defaults to UTF8). The kernel is 2.6.9. I'm trying to get this to run on rPath Linux, which is on 2.6.16. I would suspect that they have implemented more libraries in UTF8 now than back on kernel 2.6.9.

Anyway, here is the command I issued to catch this point:

    echo "SELECT value FROM bayes_global_vars WHERE variable = 'VERSION';" | mysql -u user -D database -h 10.0.13.13 -ppassword
    ERROR 1267 (HY000) at line 1: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

Any help would be greatly appreciated.

Gary Wayne Smith

-----Original Message-----
From: Gary W. Smith [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 8:06 AM
To: Daryl C. W. O'Shea
Cc: users@spamassassin.apache.org
Subject: RE: Bayes DB version issue 3.1.3 = 3.1.4

Daryl,

Thanks for the info. I will update the .8. As for the database, which is the primary concern, the user account is correct. I have logged into the database from that server using the same credentials from the local.cf file. I had thought that we might have restricted by subnet, so I did indeed try that last night.
    [EMAIL PROTECTED] spamassassin]# mysql -u xxx -h xx.xx.xx.xx -D spamassassin -p
    Enter password:
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 6649341 to server version: 4.1.7-log

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

    mysql> show tables;
    +------------------------+
    | Tables_in_spamassassin |
    +------------------------+
    | awl                    |
    | bayes_expire           |
    | bayes_global_vars      |
    | bayes_seen             |
    | bayes_token            |
    | bayes_vars             |
    | userpref               |
    +------------------------+
    7 rows in set (0.00 sec)

    mysql> select * from bayes_global_vars;
    +----------+-------+
    | variable | value |
    +----------+-------+
    | VERSION  | 3     |
    +----------+-------+
    1 row in set (0.00 sec)

    mysql>

-----Original Message-----
From: Daryl C. W. O'Shea [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 12:38 AM
To: Gary W. Smith
Cc: users@spamassassin.apache.org
Subject: Re: Bayes DB version issue 3.1.3 = 3.1.4

On 8/8/2006 3:29 AM, Gary W. Smith wrote:

> Hello,
>
> I can't remember smoking crack when copying the config files over but anything's possible. I built out a new machine today and installed SA. We have a list of CPAN modules that were installed (same list as from the 3.1.3 servers). I copied everything in the /etc/mail/spamassassin from our production servers to the test server and after starting we receive errors. I have checked and the MySQL data instance is accessible from this server. There are also several rules that are errors as well.
>
> I know that someone has asked this question already but I didn't find the answer in the thread archive.
>
> Here are the contents of the log file:
>
> Aug 7 21:45:59 labtest01c spamd[2693]: config: score: the non-numeric score (.8) is not valid, a numeric score is required
> Aug 7 21:45:59 labtest01c spamd[2693]: config: SpamAssassin failed to parse line, SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677 is not valid for score, skipping: score SUBJ_HAS_UNIQ_ID .8 0.212 0.682 1.677

.8 requires a leading zero.
> Aug 7 21:46:01 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.
> Aug 7 21:46:03 labtest01c spamd[2693]: bayes: database version 0 is different than we understand (3), aborting! at /usr/lib/perl5/site_perl/5.8.7/Mail/SpamAssassin/BayesStore/SQL.pm line 135.

SQL server privilege issue?

> Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'RAZOR2_CHECK'
> Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'
> Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test DRUGS_ERECTILE has undefined dependency '__DRUGS_ERECTILE7'
> Aug 7 21:46:05 labtest01c spamd[2693]: rules: meta test SARE_SUB_ACCEPT_CCARDS
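For the ".8 requires a leading zero" complaint earlier in this message, a rules file with many such scores could be patched mechanically. The following is a sketch under assumptions: fix_score_zeros is a hypothetical helper name, and it only handles positive fractions on lines that start with `score`.

```shell
#!/bin/sh
# Sketch only: fix_score_zeros is a hypothetical helper name.
# Reads rule text on stdin and, on "score" lines only, rewrites bare
# fractions such as ".8" to "0.8". All other lines pass through unchanged.
# Negative values written as "-.8" are not handled.
fix_score_zeros() {
    sed -E '/^[[:space:]]*score[[:space:]]/ s/([[:space:]])\.([0-9])/\10.\2/g'
}
```

Typical use would be `fix_score_zeros < rules.cf > rules.cf.new` (file names are placeholders), then comparing the two files with diff before replacing the original and running `spamassassin --lint`.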
Re: Broken images in mails
On Wed, 9 Aug 2006, decoder wrote:

> John D. Hardin wrote:
>
>> Adding a point for corrupted images is sounding better and better.
>
> Definitely a good idea... I will try to add this feature in the next release of FuzzyOcr (v.2.1) then.

I'd suggest a better place would be the imageinfo plugin - corrupt/clean has little to do with whether or not the image contains text, and what that text is.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 If someone has a gun and is trying to kill you, it would be reasonable to shoot back with your own gun.
 -- the Dalai Lama, May 15, 2001
---
Re: Looking for a good Ebay whitelist
From: Logan Shaw [EMAIL PROTECTED] On Tue, 8 Aug 2006, jdow wrote: From: Logan Shaw [EMAIL PROTECTED] On Tue, 8 Aug 2006, wrote: I have been having FPs from Ebay in AU and DE, as well as [EMAIL PROTECTED] Does anybody have a good whitelist for these? So it seems like SPF is probably something good to rely on in this case. I don't fully understand the SPF plug-in, but perhaps all you need to do is add the appropriate ebay domains to new def_whitelist_from_spf rules like the ones in 60_whitelist_spf.cf. SMOMR - Simple Matter Of Meta Rules. If SPF is bad and says it is from ebay add spam points. I thought in this case the problem was false positives. Presumably forged ebay mails are such a nuisance that all kinds of rules (bayes?) hit, causing real ebay rules to get flagged every now and then. If SPF is good and says from ebay subtract some points in a meta rule. That gets you going while any whitelist that uses spf gets built. (Either that or do it yourself. {^_-}) {^_^}
Re: Word Doc spam
--On Wednesday, August 09, 2006 1:01 AM +0200 Mark Martinec [EMAIL PROTECTED] wrote: In the FreeBSD ports collection it comes under: textproc/antiword or fetch it from its home site: http://www.winfield.demon.nl/ Cool. What's involved in integrating this into SA? Can the image plugin machinery be easily adapted to invoke this?
Re: Broken images in mails
--On Wednesday, August 09, 2006 12:18 AM +0200 decoder [EMAIL PROTECTED] wrote: I am also thinking about scanning all attachments, no matter if the content type specifies image or not (in the current version 2.0, only attachments that have image in their content type are scanned with format auto-detection) because for example outlook always displays the image, no matter if the content type is what/ever or image/blah... :( Do any legitimate senders do this? Perhaps we can throw extra points at misleading content types.
Re: Looking for a good Ebay whitelist
jdow wrote: If SPF is good and says from ebay subtract some points in a meta rule. That gets you going while any whitelist that uses spf gets built. (Either that or do it yourself. {^_-}) You might as well do it yourself, since a single whitelist_from_spf seems a lot simpler (and faster in every way) than creating header rules against the EnvelopeFrom pseudo header (not From since it could say eBay while the envelope says something else that passes an SPF check) and then writing metas against those and SPF_PASS (not neutral or anything else!). Daryl
Re: problems, problems
On Tuesday, 8 August 2006 23:54, Logan Shaw wrote:

> On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
>
>> [...]
>>
>> Second, I often have a message of the following form in my mail log:
>>
>>     courierlocal: [...] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>>
>> What's the problem here, and how can I get rid of it?
>
> Without any more information than that, I would say that something is either still using the Bayes database in your home directory or it is finished but the lock file hasn't been removed. I haven't tried using SpamAssassin with Courier anything, so I'm not really familiar with how it's normally invoked.

What I do currently is to just pipe each message through spamassassin as an ordinary user before the mail is delivered. How do you normally invoke SpamAssassin in conjunction with mail software other than the Courier tools?

> - Logan

Best wishes,
Wolfgang
RE: RE: Bayes DB version issue 3.1.3 = 3.1.4
Nigel,

I ended up taking the approach you listed a little earlier. The problem is that I now have two separate bayes databases: one for RH/3.1.3 and one for rPath/3.1.4. This isn't so much a resource problem as a redundancy problem (as I replicate the databases to our DR location, etc).

So I imported the data and started testing. For some reason it was taking upwards of 70 seconds per message. This was when starting SA right after installing. After a reboot it did drop down to the .5-1.5 second range though. I was getting worried.

I now have two 3.1.4 machines up and running. I will swap out two of my 4 other 3.1.3 machines and upgrade those in a couple of days, after it has run for a while.

Gary Wayne Smith

-----Original Message-----
From: Nigel Frankcom [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 08, 2006 7:03 PM
To: Gary W. Smith
Subject: Re: RE: Bayes DB version issue 3.1.3 = 3.1.4

Hi Gary,

A dump from the SA db should reimport; you may have to kill the latin line in the dump and replace it with UTF8. Beyond that it should be a straightforward dump and reload?

Let me know how it goes?

Kind regards

Nigel

On Tue, 8 Aug 2006 16:12:21 -0700, Gary W. Smith [EMAIL PROTECTED] wrote:

> I've created a new database in UTF8 format. I will see how this works out. I might try to copy the data from the Latin database to the UTF8 database, but in past experience this hasn't worked that great. I might also make a backup as well and try that.
>
> -----Original Message-----
> From: Gary W. Smith [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 08, 2006 2:23 PM
> To: users@spamassassin.apache.org
> Subject: RE: Bayes DB version issue 3.1.3 = 3.1.4
>
> Okay, I have a little more information now. I run the same command that sql.pm would run. It appears to be a collation issue. Can we force the collation with 3.1.4 to a specific type? In my case the database is in latin because 3.1.3 choked on UTF8. This was on RHEL4 (which defaults to UTF8). The kernel is 2.6.9. I'm trying to get this to run on rPath Linux which is on 2.6.16.
I would suspect that they have implemented more libraries in UTF8 now than back on kernel 2.6.9. Anyway, here is the command I issued to catch this point: echo SELECT value FROM bayes_global_vars WHERE variable = 'VERSION'; | mysql -u user -D database -h 10.0.13.13 -ppassword ERROR 1267 (HY000) at line 1: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' Any help would be greatly appreciated. Gary Wayne Smith -Original Message- From: Gary W. Smith [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 08, 2006 8:06 AM To: Daryl C. W. O'Shea Cc: users@spamassassin.apache.org Subject: RE: Bayes DB version issue 3.1.3 = 3.1.4 Daryl, Thanks for the info. I will update the .8. As for the database, which is the primary concern, the user account is correct. I have logged into the database from that server using the same credentials from the local.cf file. I had thought that we might have restricted by subnet so I did indeed try that last night. [EMAIL PROTECTED] spamassassin]# mysql -u xxx -h xx.xx.xx.xx -D spamassassin -p Enter password: Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 6649341 to server version: 4.1.7-log Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql show tables; ++ | Tables_in_spamassassin | ++ | awl| | bayes_expire | | bayes_global_vars | | bayes_seen | | bayes_token| | bayes_vars | | userpref | ++ 7 rows in set (0.00 sec) mysql select * from bayes_global_vars; +--+---+ | variable | value | +--+---+ | VERSION | 3 | +--+---+ 1 row in set (0.00 sec) mysql -Original Message- From: Daryl C. W. O'Shea [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 08, 2006 12:38 AM To: Gary W. Smith Cc: users@spamassassin.apache.org Subject: Re: Bayes DB version issue 3.1.3 = 3.1.4 On 8/8/2006 3:29 AM, Gary W. 
Smith wrote: Hello, I can't remember smoking crack when copying the config files over but anything's possible. I built out a new machine today and installed SA. We have a list of CPAN modules that were installed (same list as from the 3.1.3 servers). I copied everything in the /etc/mail/spamassassin from our productions servers to the test server and after starting we receive errors. I have checked and the MySQL data instance is accessible from this server. There are also several rules that are errors as well. I know that someone has asked this question already but I didn't find the answer
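
For anyone hitting the same collation error, Nigel's suggestion in this thread (dump the database, swap the latin charset line for UTF8, reload) might look roughly like the sketch below. The file names and the sample dump content are hypothetical stand-ins, not taken from the thread, and the charset string in a real dump may be written differently, so inspect the dump before converting it.

```shell
#!/bin/sh
# Sketch only: rewrite the latin1 table charset declaration in a MySQL dump to utf8.
# All file names and the sample dump below are hypothetical.

workdir=$(mktemp -d)
dump_in="$workdir/bayes_latin1.sql"
dump_out="$workdir/bayes_utf8.sql"

# Stand-in for a real dump of the Bayes database (simplified, one table only).
cat > "$dump_in" <<'EOF'
CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY (variable)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO bayes_global_vars VALUES ('VERSION','3');
EOF

# Replace only the table-level charset declaration; row data is left untouched.
sed 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/' "$dump_in" > "$dump_out"

# Show how many lines now declare utf8, as a quick sanity check.
grep -c 'CHARSET=utf8' "$dump_out"
```

The converted file would then be loaded into the new UTF8 database with the usual mysql input redirect. This only changes how the tables are declared; whether the token data itself survives the round trip is something to verify on a test copy first, as Gary notes that copying between the Latin and UTF8 databases has not always gone well.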
Re: updates.spamassassin.org.cf overrides local.cf?
Daryl C. W. O'Shea wrote: Bret Miller wrote: Yep, I suspected as much. Now I have SA in three places, four if you count plugins. More if you count the modules and the commandline tools. I just had the crazy idea that I could keep rules in one place. Is this beginning to look unwieldy to anyone else? (rhetorical, don't answer). Well -- you could if you wanted to, but then you have to do some work to deal with it. You can't expect a tool which works one way to do something else without doing anything. You could, if you set the updates directory to /usr/share/etc/mail/spamassassin/updates or something like that to keep it in an updates folder under your site config... I'm not exactly sure what the thinking was in moving the updates to /var/lib instead of keeping them with /usr/share with the original rules. I wonder why sa-update doesn't just create a version folder under /share/spamassassin and use that... Certainly would be less to keep track of and purge when you install a new version. Updates are variable... they go in /var. Anywhere else wouldn't be following FHS. http://www.pathname.com/fhs/ Daryl The point is really moot. What files are in what directories doesn't really matter. It seems the idea is that anyone reading all the documentation, and the wiki, should be able to discern what will go where, in what order, why, and when. DAve -- Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: Memory requirements
On Mon, 7 Aug 2006 20:35:56 -0700 (PDT) John D. Hardin [EMAIL PROTECTED] wrote:

On Mon, 7 Aug 2006, James Lay wrote:

Anyone happen to know the memory requirements of SpamAssassin? I have 3.0.4 running on 128 Megs ok... will upgrading to 3.1.4 plus the SARE rules tank it? Or am I safe? Thanks all!

I'm running 3.1.3 with a bunch of SARE and local rules on my hosted server, which only has 96MB of RAM and 196MB of swap. It's also running BIND serving as authoritative for a few domains, and apache serving static content, but no databases or other fancy stuff. I have it configured to only spawn one child and run all scans sequentially, as I don't really care if it takes a couple of minutes to score a message. It works reliably, though there's little margin for adding much else. If there was any less memory I would not be able to run SA. How much swap do you have? And what else is running on the server?

John,

I have almost 500 megs of swap. And Postfix and SpamAssassin are the only things running on it. Thanks!

James
Re: Memory requirements
On Mon, 7 Aug 2006 20:46:05 -0700 jdow [EMAIL PROTECTED] wrote:

From: James Lay [EMAIL PROTECTED]

Hey all! Anyone happen to know the memory requirements of SpamAssassin? I have 3.0.4 running on 128 Megs ok... will upgrading to 3.1.4 plus the SARE rules tank it? Or am I safe? Thanks all!

Perhaps. Do not run anything else with a significant memory footprint on the system at the same time. Do not use X, of course. Minimize the number of children spawned to one.

{^_^} Joanne

Thank you Joanne :)

James
Re: Memory requirements
James Lay wrote:

On Mon, 7 Aug 2006 20:35:56 -0700 (PDT) "John D. Hardin" [EMAIL PROTECTED] wrote:

On Mon, 7 Aug 2006, James Lay wrote:

Anyone happen to know the memory requirements of SpamAssassin? I have 3.0.4 running on 128 Megs ok... will upgrading to 3.1.4 plus the SARE rules tank it? Or am I safe? Thanks all!

I'm running 3.1.3 with a bunch of SARE and local rules on my hosted server, which only has 96MB of RAM and 196MB of swap. It's also running BIND serving as authoritative for a few domains, and apache serving static content, but no databases or other fancy stuff. I have it configured to only spawn one child and run all scans sequentially, as I don't really care if it takes a couple of minutes to score a message. It works reliably, though there's little margin for adding much else. If there was any less memory I would not be able to run SA. How much swap do you have? And what else is running on the server?

John,

I have almost 500 megs of swap. And Postfix and SpamAssassin are the only things running on it. Thanks!

James

I run with 4 gigs of RAM and a separate MySQL server for Bayes.
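
The advice in this thread to limit spamd to a single child corresponds to spamd's -m / --max-children option. A minimal sketch, assuming spamd is launched by hand rather than from an init script; the variable names are only illustrative, and the command line is printed rather than executed:

```shell
#!/bin/sh
# Sketch only: build a spamd command line for a low-memory host.
# MAX_CHILDREN and SPAMD_ARGS are illustrative names, not SpamAssassin settings.

MAX_CHILDREN=1   # one child, so messages are scanned one at a time

# --daemonize (-d) detaches spamd; --max-children (-m) caps the number of scanner children.
SPAMD_ARGS="--daemonize --max-children=$MAX_CHILDREN"

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "spamd $SPAMD_ARGS"
```

If your distribution starts spamd from an init script, the same flags would go wherever that script reads its options. One child trades throughput for memory, which matches the "I don't really care if it takes a couple of minutes" trade-off described above.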
RE: Blocking based on ALL IPs in the header
FOLLOW-UP:

This bank is using GFI for spam filtering: http://www.gfi.com/

Looking at GFI's manual, it seems that GFI treats ALL IPs in the header the same, and any one that is blacklisted is treated just the same as if the sending mail server's IP were blacklisted... with NO option to **only** check the sending server's IP. I've posted a message on GFI's forum to clarify and so far I've seen no response.

CHECK IT OUT: http://forums.gfi.com/Checking_ALL_IPs_in_header_against_blacklists/m_900736438/tm.htm

Does anyone here have any knowledge of this software? This is almost like The Twilight Zone... Either (1) I have gone insane, or (2) GFI has made a critical error in the fundamentals of their architecture. Please read that post above and let me know which is the case.

Thanks!

Rob McEwen
PowerView Systems
[EMAIL PROTECTED]
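
To make the distinction concrete, here is a small sketch of "every IP in the headers" versus "only the sending server's IP". It says nothing about how GFI actually works internally; the headers are made up, the host names are hypothetical, and the addresses come from the reserved documentation ranges. It assumes the topmost Received header is the one added by the receiving server, which is the usual case.

```shell
#!/bin/sh
# Sketch only: compare "all IPs in Received headers" with "the connecting IP only".
# The headers below are invented; addresses are from documentation ranges.

headers=$(cat <<'EOF'
Received: from mail.example.net (mail.example.net [192.0.2.10]) by mx.bank.example with ESMTP
Received: from dialup.example.org (dialup.example.org [198.51.100.77]) by mail.example.net with ESMTP
From: sender@example.org
Subject: test
EOF
)

# Every bracketed IPv4 address found in any Received header, top to bottom.
all_ips=$(printf '%s\n' "$headers" \
  | grep '^Received:' \
  | grep -oE '\[[0-9]+(\.[0-9]+){3}\]' \
  | sed 's/^\[//; s/]$//')

# Only the address from the topmost Received header: the host that actually
# connected to the receiving server.
sending_ip=$(printf '%s\n' "$all_ips" | head -n 1)

echo "all IPs:    $(printf '%s' "$all_ips" | tr '\n' ' ')"
echo "sending IP: $sending_ip"
```

A filter that checks every address in all_ips against a blacklist would penalize this message if the originating dial-up address is listed, even though the server that delivered it (sending_ip) is clean. That is the behavior being questioned above.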