no tokens ? How can that be ?
I came across a situation that seems non-intuitive: two emails this morning were spam, but hit BAYES_00. So they were (presumably) learned as ham somewhere along the way. So far so good... Running 'sa-learn --forget ./message.txt' gets me: Forgot tokens 0 from message(s) (1 message(s) examined). What kind of situation can cause this? I was under the impression that BAYES_00 meant the message was explicitly learned as ham, so there must be related tokens. Thanks, Michael Grey
Bayes lost info when 'upgrading' ?
We went through the process of converting a V2 BDB to a V3 BDB and then to MySQL. Running the old and new systems side by side, we see some substantial inconsistencies in the Bayes scoring... Any ideas why? There are obviously fewer tokens now than before the sync, but the docs make no mention of data being lost... Thanks, Michael Grey

Old system's BDB info:
0.000 0 2 0 non-token data: bayes db version
0.000 0 3541311 0 non-token data: nspam
0.000 0 1707362 0 non-token data: nham
0.000 0 343897 0 non-token data: ntokens
0.000 0 1157674321 0 non-token data: oldest atime
0.000 0 1157749228 0 non-token data: newest atime
0.000 0 1157749183 0 non-token data: last journal sync atime
0.000 0 1157717569 0 non-token data: last expiry atime
0.000 0 43200 0 non-token data: last expire atime delta
0.000 0 145970 0 non-token data: last expire reduction count

New system's BDB info after --sync:
0.000 0 3 0 non-token data: bayes db version
0.000 0 3541670 0 non-token data: nspam
0.000 0 1707603 0 non-token data: nham
0.000 0 263855 0 non-token data: ntokens
0.000 0 1157707085 0 non-token data: oldest atime
0.000 0 1157755124 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
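To put numbers on "fewer tokens", the two listings can be compared counter by counter. This is only a sketch: it assumes the listings are in the `sa-learn --dump magic` layout (four leading fields, then a `non-token data:` label), and the names `parse_magic`, `OLD_DUMP` and `NEW_DUMP` are made up for the illustration. Only the first six rows of each listing are copied in.

```python
# Compare the "non-token data" counters of two Bayes database listings.
# Values are copied from the post above; the layout is assumed to be that of
# `sa-learn --dump magic`, where the third field holds the counter value.

OLD_DUMP = """\
0.000 0 2 0 non-token data: bayes db version
0.000 0 3541311 0 non-token data: nspam
0.000 0 1707362 0 non-token data: nham
0.000 0 343897 0 non-token data: ntokens
0.000 0 1157674321 0 non-token data: oldest atime
0.000 0 1157749228 0 non-token data: newest atime
"""

NEW_DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 3541670 0 non-token data: nspam
0.000 0 1707603 0 non-token data: nham
0.000 0 263855 0 non-token data: ntokens
0.000 0 1157707085 0 non-token data: oldest atime
0.000 0 1157755124 0 non-token data: newest atime
"""


def parse_magic(dump):
    """Return {label: value} for every 'non-token data' row in a listing."""
    counters = {}
    for line in dump.splitlines():
        head, sep, label = line.partition("non-token data:")
        if not sep:
            continue  # not a counter row
        fields = head.split()
        counters[label.strip()] = int(fields[2])
    return counters


old = parse_magic(OLD_DUMP)
new = parse_magic(NEW_DUMP)

# Difference (new minus old) for every counter present in both listings.
delta = {name: new[name] - old[name] for name in old if name in new}

for name, change in delta.items():
    print(f"{name:>18}: {old[name]:>11} -> {new[name]:>11} ({change:+d})")
```

On these figures the new database holds 80042 fewer tokens (roughly 23% fewer), while nspam and nham are slightly higher. The oldest atime is also later on the new system, which would be consistent with the oldest tokens being gone, though the listings alone do not say why.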
BAYES_00
Forgive what may be a newbie question: if a message hits BAYES_00, does that explicitly mean the email has been learned as NOT SPAM? If that is not the case (or not the ONLY case), what other conditions may cause this? (Presuming the DB is available, healthy, etc.) Thanks... Michael Grey
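For context, the BAYES_* rules report which probability range the classifier computed for a message from its tokens; they are not a record that this particular message was ever learned. The sketch below maps a probability to a rule name. Two ranges are confirmed by reports elsewhere in this archive (BAYES_50 is 40 to 60%, BAYES_95 is 95 to 99%); the remaining ranges and the boundary handling are written from memory of the stock rules and should be treated as assumptions to check against your own rule files. `bayes_rule` is a hypothetical helper name.

```python
# Map a Bayes probability to the BAYES_* rule that would be expected to fire.
# Ranges other than BAYES_50 and BAYES_95 are assumptions (recalled from the
# stock rules, not taken from this thread); verify them locally.
BAYES_BUCKETS = [
    (0.00, 0.01, "BAYES_00"),
    (0.01, 0.05, "BAYES_05"),
    (0.05, 0.20, "BAYES_20"),
    (0.20, 0.40, "BAYES_40"),
    (0.40, 0.60, "BAYES_50"),
    (0.60, 0.80, "BAYES_60"),
    (0.80, 0.95, "BAYES_80"),
    (0.95, 0.99, "BAYES_95"),
    (0.99, 1.00, "BAYES_99"),
]


def bayes_rule(probability):
    """Return the rule name whose range contains the given probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for low, high, name in BAYES_BUCKETS:
        # Assumed boundary handling: lower bound exclusive (except at 0),
        # upper bound inclusive.
        if (low == 0.0 or probability > low) and probability <= high:
            return name
    raise AssertionError("unreachable: buckets cover the whole range")


for p in (0.0, 0.5000, 0.9694):
    print(p, bayes_rule(p))
```

Read this way, a spam message can land in BAYES_00 simply because most of its tokens were previously seen in ham, without that exact message having been trained.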
RE: Fuzzy OCR false positives from Screenshots...
You will have to ask the cell company about the first issue... In regards to the second, many large companies have outside companies do work for them in the areas of marketing and other aspects. So this also will happen regardless. Let me clarify: this is an OUTSIDE relay to INSIDE... A FuzzyOCR whitelist with (very privately held) keywords would help. Any other ideas?

Michael Grey

-----Original Message-----
From: John D. Hardin [mailto:[EMAIL PROTECTED]
Sent: Friday, September 08, 2006 10:10 AM
To: Michael Grey
Cc: users@spamassassin.apache.org
Subject: Re: Fuzzy OCR false positives from Screenshots...

On Fri, 8 Sep 2006, Michael Grey wrote:
> However, there have been two occasions in the last 24 hrs where screenshots
> embedded into the emails caused false positives.
>
> One was an 'account summary' from a cell company, the other was some internal
> marketing info.
>
> Are there other approaches to getting certain images white listed if they
> contain, say, our specific company name ?

Don't run SA against internal email.

And what the heck is a cell-phone company doing sending you screenshots?

--
 John Hardin KA7OHZ  ICQ#15735746  http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]  FALaholic #11174  pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 If someone has a gun and is trying to kill you, it would be reasonable to shoot back with your own gun.
 -- the Dalai Lama, May 15, 2001
---
 9 days until The 219th anniversary of the signing of the U.S. Constitution
Fuzzy OCR false positives from Screenshots...
We are testing a new configuration using FuzzyOCR, and have found it to work very well overall. However, there have been two occasions in the last 24 hrs where screenshots embedded in emails caused false positives. One was an 'account summary' from a cell company, the other was some internal marketing info. Are there other approaches to getting certain images whitelisted if they contain, say, our specific company name? Any other ideas on how to deal with this? Many thanks! Michael Grey
RE: source SENDER authentication ? (as opposed to SPF HOST authentication)
Yes, I tend to agree with this... it is the reason why many POP servers reply to VRFY with 'You can try...' instead of a yes or no. Unfortunately I am not the one driving this requirement ;) I like Michel Vaillancourt's idea, if it has to be done. I appreciate everyone's feedback on this question.

Michael Grey

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 30, 2006 10:44 AM
To: Gino Cerullo
Cc: users@spamassassin.apache.org
Subject: Re: source SENDER authentication ? (as opposed to SPF HOST authentication)

Gino Cerullo writes:
> On 30-Aug-06, at 1:10 PM, Michael Grey wrote:
>
> > Are there any SA methods that allow verification of the 'sender' of
> > an email ?
> >
> > I am aware of SPF which can confirm that a host at ip address
> > x.x.x.x is authorized to send mail as from domain "A", but how
> > about a means to confirm that '[EMAIL PROTECTED]' actually is a
> > real user before accepting mail from him ?
>
> I don't believe SA can do that as it's a content filter. Some MTAs
> can do this and this is where you want those kinds of verifications to
> happen, before DATA. The problem is that if you do it for every
> address you will get false positives, especially from sources like
> mailing lists, news & info subscriptions, etc., and you'll find
> yourself whitelisting a lot.
>
> I actually do this using Postfix but I use a table of 'frequently
> forged domains' whose addresses are verified before they are allowed
> to pass on to the content filters.

It's also worth noting that doing this is counterproductive in an overall strategy sense, since it drives the spammers to simply use known-valid third-party addresses -- such as random addrs from their target address list -- as the forged source of the spam.

The end result for us end users is a massive increase in "spam blowback", which is what we've seen since those MTAs implemented it. :(

--j.
source SENDER authentication ? (as opposed to SPF HOST authentication)
Are there any SA methods that allow verification of the ‘sender’ of an email ? I am aware of SPF which can confirm that a host at ip address x.x.x.x is authorized to send mail as from domain “A”, but how about a means to confirm that ‘[EMAIL PROTECTED]’ actually is a real user before accepting mail from him ? Thanks Michael Grey
RE: FuzzyOCR Install - Issues processing ONLY Gif images.
I did have libungif installed, but the rpm doesn't add some of the needed support that libungif-progs provides. That did the trick. Thanks!

Michael Grey

-----Original Message-----
From: Tim Litwiller [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 29, 2006 8:29 PM
To: users@spamassassin.apache.org
Subject: Re: FuzzyOCR Install - Issues processing ONLY Gif images.

try changing your time out from 10 seconds to 15 or 20 and verify that giffix is installed and working correctly. libungif-utils rpm on fedora

Michael Grey wrote:
> Installed FuzzyOCR and believe all the dependencies.
>
> Using the sample images I get a Pipe Error ONLY on gif images;
> resulting in no hits on FUZZY_OCR.
>
> Pipe Command "/usr/bin/giftopnm -"
>
> Giftopnm exists in that path.
>
> Running giftopnm on the command line seems to work with no errors,
> spitting out a binary file to stdout as expected.
>
> Any ideas of what might be missing ? ( Fedora Core 4 ).
>
> Thanks...
>
> Michael Grey
>
> - log / reports -
>
> Corrupted-gif.eml
>
> pts rule name description
> 0.1 HTML_MESSAGE BODY: HTML included in message
> 3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
>     [score: 0.9694]
> 1.5 FUZZY_OCR_WRONG_CTYPE BODY: Mail contains an image with wrong content-type set
>     Image has format "GIF" but content-type is "image/jpeg"
>
> [2006-08-29 19:20:00] Debug mode: Image has format "GIF" but content-type is "image/jpeg"
> [2006-08-29 19:20:01] Debug mode: Image is single non-interlaced...
> [2006-08-29 19:20:01] Unexpected error in pipe to external programs.
> Please check that all helper programs are installed and in the correct path.
> (Pipe Command "/usr/bin/giftopnm -", Pipe exit code 1 (""), Temporary file: "/tmp/.spamassassin23614sXR9Dltmp")
> [2006-08-29 19:20:01] Debug mode: FuzzyOcr ending successfully...
>
> bash-3.00$
>
> animated-gif.eml
>
> pts rule name description
> 0.7 DATE_IN_PAST_06_12 Date: is 6 to 12 hours before Received: date
> 0.1 HTML_MESSAGE BODY: HTML included in message
> 0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
>     [score: 0.5000]
>
> [2006-08-29 19:22:12] Debug mode: Analyzing file with content-type "image/gif"
> [2006-08-29 19:22:12] Debug mode: Image is single non-interlaced...
> [2006-08-29 19:22:12] Unexpected error in pipe to external programs.
> Please check that all helper programs are installed and in the correct path.
> (Pipe Command "/usr/bin/giftopnm -", Pipe exit code 1 (""), Temporary file: "/tmp/.spamassassin23644bPPq3jtmp")
> [2006-08-29 19:22:12] Debug mode: FuzzyOcr ending successfully...
FuzzyOCR Install - Issues processing ONLY Gif images.
Installed FuzzyOCR and believe all the dependencies. Using the sample images I get a Pipe Error ONLY on gif images, resulting in no hits on FUZZY_OCR.

Pipe Command "/usr/bin/giftopnm -"

Giftopnm exists in that path. Running giftopnm on the command line seems to work with no errors, spitting out a binary file to stdout as expected. Any ideas of what might be missing? (Fedora Core 4.) Thanks... Michael Grey

- log / reports -

Corrupted-gif.eml

pts rule name description
0.1 HTML_MESSAGE BODY: HTML included in message
3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
    [score: 0.9694]
1.5 FUZZY_OCR_WRONG_CTYPE BODY: Mail contains an image with wrong content-type set
    Image has format "GIF" but content-type is "image/jpeg"

[2006-08-29 19:20:00] Debug mode: Image has format "GIF" but content-type is "image/jpeg"
[2006-08-29 19:20:01] Debug mode: Image is single non-interlaced...
[2006-08-29 19:20:01] Unexpected error in pipe to external programs.
Please check that all helper programs are installed and in the correct path.
(Pipe Command "/usr/bin/giftopnm -", Pipe exit code 1 (""), Temporary file: "/tmp/.spamassassin23614sXR9Dltmp")
[2006-08-29 19:20:01] Debug mode: FuzzyOcr ending successfully...

bash-3.00$

animated-gif.eml

pts rule name description
0.7 DATE_IN_PAST_06_12 Date: is 6 to 12 hours before Received: date
0.1 HTML_MESSAGE BODY: HTML included in message
0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
    [score: 0.5000]

[2006-08-29 19:22:12] Debug mode: Analyzing file with content-type "image/gif"
[2006-08-29 19:22:12] Debug mode: Image is single non-interlaced...
[2006-08-29 19:22:12] Unexpected error in pipe to external programs.
Please check that all helper programs are installed and in the correct path.
(Pipe Command "/usr/bin/giftopnm -", Pipe exit code 1 (""), Temporary file: "/tmp/.spamassassin23644bPPq3jtmp")
[2006-08-29 19:22:12] Debug mode: FuzzyOcr ending successfully...
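The log's advice to "check that all helper programs are installed" can be turned into a quick check. The sketch below only looks for the two helpers named in this thread, giftopnm (from the pipe command) and giffix (suggested in the reply); the full list FuzzyOCR needs is not given here, so `HELPERS` is an assumption to adjust. It also only searches `PATH`, whereas the plugin may be configured with absolute paths.

```python
import shutil

# Helper programs mentioned in this thread; the complete set required by the
# plugin is not listed in the posts, so extend this list as needed.
HELPERS = ["giftopnm", "giffix"]


def missing_helpers(names):
    """Return the helper names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_helpers(HELPERS)
if missing:
    print("not found on PATH:", ", ".join(missing))
else:
    print("all listed helpers found on PATH")
```

A helper that is present but failing (as giftopnm's exit code 1 suggests here) would still pass this check, so it narrows the search rather than settling it.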
RE: Adding 'SA scores' to all incoming mails
Why is it 'better'? I didn't say it was... It is simply one of the possible approaches to getting the full headers.

Michael Grey

-----Original Message-----
From: John D. Hardin [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 24, 2006 3:14 PM
To: Michael Grey
Cc: users@spamassassin.apache.org
Subject: RE: Adding 'SA scores' to all incoming mails

On Thu, 24 Aug 2006, Michael Grey wrote:
> In this example, all emails get an additional header :
>
> X-Spam-score-breakdown calvin score 6.77/4.5
>
> "add_header all score-breakdown calvin score _HITS_/_REQD_ "

And that's better than this:

X-Spam-Status: No, score=3.5 required=5.0 tests=BAYES_50,FROM_EXCESS_QP,
 FROM_SUBDOMAIN,HTML_COMMENTS,HTML_EMBED_IMG_04,HTML_MESSAGE,
 HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,SARE_UNSUB38D,SPF_PASS,
 SUBJECT_EXCESS_QP autolearn=disabled version=3.1.3

how? I'm not saying it shouldn't be done, but that the scores and rule hits are *already there* so why paste them in yet again?

> On Thu, 24 Aug 2006, list wrote:
>
> > I'd like SA to make a extra line/section under all my mails where it
> > tells what score the mail got (or maybe even which rules scored on the
> > mail) is there such a setting?
> >
> > it would help me to finetune my SA.
>
> You mean, actually paste the score into the body or attach it as
> another MIME body part?
>
> Are the X-Spam-* headers not sufficient?

--
 John Hardin KA7OHZ  ICQ#15735746  http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]  FALaholic #11174  pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 Taking my gun away because I *might* shoot someone is like cutting my tongue out because I *might* yell "Fire!" in a crowded theater.
 -- Peter Venetoklis
---
 26 days until Talk Like a Pirate day
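Since the score and rule hits are already in X-Spam-Status, one way to use them for tuning is to read that header back out. The following is a rough sketch that parses the header value quoted in the message above; `parse_spam_status` is a hypothetical helper, and the parsing assumes the plain `key=value` layout shown there (the bracketed amavisd variant that appears elsewhere in this archive is not handled).

```python
import re

# Header value as quoted above, including the whitespace left by line folding.
HEADER_VALUE = (
    "No, score=3.5 required=5.0 tests=BAYES_50,FROM_EXCESS_QP, "
    "FROM_SUBDOMAIN,HTML_COMMENTS,HTML_EMBED_IMG_04,HTML_MESSAGE, "
    "HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,SARE_UNSUB38D,SPF_PASS, "
    "SUBJECT_EXCESS_QP autolearn=disabled version=3.1.3"
)


def parse_spam_status(value):
    """Split an X-Spam-Status value into verdict, score, threshold and tests."""
    verdict, _, rest = value.partition(",")
    # Folding leaves whitespace after commas inside the tests list; remove it
    # so the whole list stays attached to its "tests=" key.
    rest = re.sub(r",\s+", ",", rest)
    fields = dict(re.findall(r"(\w+)=(\S+)", rest))
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": [t for t in fields.get("tests", "").split(",") if t],
    }


status = parse_spam_status(HEADER_VALUE)
print(status["score"], "/", status["required"])
print(len(status["tests"]), "rules hit:", ", ".join(status["tests"]))
```

Run over a mailbox, the same function could tally which rules fire most often, which is the kind of feedback the original poster was after.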
RE: Adding 'SA scores' to all incoming mails
Check the docs for 'add_header' in local.cf or user_prefs. The key words here are 'add_header all ' followed by the text and variables you want to have displayed; the rule scoring is another 'variable' that can be sourced.

In this example, all emails get an additional header :

X-Spam-score-breakdown calvin score 6.77/4.5

"add_header all score-breakdown calvin score _HITS_/_REQD_ "

Good luck...

Michael Grey

-----Original Message-----
From: John D. Hardin [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 24, 2006 2:32 PM
To: list
Cc: users@spamassassin.apache.org
Subject: Re: Adding 'SA scores' to all incoming mails

On Thu, 24 Aug 2006, list wrote:
> I'd like SA to make a extra line/section under all my mails where it
> tells what score the mail got (or maybe even which rules scored on the
> mail) is there such a setting?
>
> it would help me to finetune my SA.

You mean, actually paste the score into the body or attach it as another MIME body part?

Are the X-Spam-* headers not sufficient?

--
 John Hardin KA7OHZ  ICQ#15735746  http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]  FALaholic #11174  pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 Taking my gun away because I *might* shoot someone is like cutting my tongue out because I *might* yell "Fire!" in a crowded theater.
 -- Peter Venetoklis
---
 26 days until Talk Like a Pirate day
RE: SPF Scoring... SPF_NEUTRAL
Sorry, I was too philosophical in my question... to rephrase: in the standard SA config, should I expect to see an SPF_* rule hit returned when the SPF return value is 'none'?

Thanks
Mike

-----Original Message-----
From: Gino Cerullo [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 23, 2006 9:54 AM
To: users@spamassassin.apache.org
Subject: Re: SPF Scoring... SPF_NEUTRAL

On 23-Aug-06, at 12:45 PM, Michael Grey wrote:
> Since this is not a production system, we have had to do some MX
> magic on a remote domain to push mail through this new system... that
> domain doesn't have SPF enabled (curse you Network Solutions !)
>
> So the big question is really this : Should "NONE" get an SPF score ?

That is a matter of internal policy on your part. If you want to penalize domains for not having an SPF record you could give it a positive score. On the other hand, if you wish to reward them for not having an SPF record, give them a negative score.

I believe the general consensus is to leave it alone, especially since SPF is still quite new and still technically in an experimental stage.

--
Gino Cerullo
Pixel Point Studios
21 Chesham Drive
Toronto, ON M3M 1W6
416-247-7740
RE: SPF Scoring... SPF_NEUTRAL
Since this is not a production system, we have had to do some MX magic on a remote domain to push mail through this new system... that domain doesn't have SPF enabled (curse you Network Solutions!)

So the big question is really this: should "NONE" get an SPF score?

Thanks
Mike

-----Original Message-----
From: Noel Jones [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 23, 2006 9:17 AM
To: Michael Grey
Cc: users@spamassassin.apache.org
Subject: Re: SPF Scoring... SPF_NEUTRAL

On 8/23/06, Michael Grey <[EMAIL PROTECTED]> wrote:
> Has anyone experienced SPF_* rules not actually being scored ?
>
> In the debug I see that it comes back as 'result: none' - shouldn't this
> come back as SPF_NEUTRAL ?

When the domain does not publish SPF records you get "result: none". Test with a domain that does publish SPF records.

--
Noel Jones
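The distinction between "none" and "neutral" can be made concrete with a small lookup. This sketch pairs SPF results with the eight rule names scored in the original post's local.cf and nothing else; whether a given SpamAssassin release ships an additional rule for "none" is not something this thread establishes, so the function simply reports that none of those eight would be expected to fire. `expected_rule` is a hypothetical name.

```python
# SPF result -> rule name, limited to the rules scored in the original post.
# "none" (domain publishes no SPF record) deliberately has no entry.
ENVELOPE_RULES = {
    "pass": "SPF_PASS",
    "fail": "SPF_FAIL",
    "softfail": "SPF_SOFTFAIL",
    "neutral": "SPF_NEUTRAL",
}

HELO_RULES = {
    "pass": "SPF_HELO_PASS",
    "fail": "SPF_HELO_FAIL",
    "softfail": "SPF_HELO_SOFTFAIL",
    "neutral": "SPF_HELO_NEUTRAL",
}


def expected_rule(result, helo=False):
    """Return the rule expected for an SPF result, or None if no listed rule applies."""
    table = HELO_RULES if helo else ENVELOPE_RULES
    return table.get(result.strip().lower())


# Both checks in the debug output came back as "result: none".
for scope, helo in (("HELO", True), ("EnvelopeFrom", False)):
    print(scope, "->", expected_rule("none", helo=helo))
```

With both results at "none", raising the scores of these eight rules would not be expected to change the message's total, which matches what the original post observed.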
SPF Scoring... SPF_NEUTRAL
Has anyone experienced SPF_* rules not actually being scored? In the debug I see that it comes back as 'result: none'; shouldn't this come back as SPF_NEUTRAL?

We are setting up SA with amavisd, and when running amavis in debug mode (amavisd -u amavis -g amavis debug-sa) I can see it hit the SPF checks; it comes back with:

--- debug output ---
[2456] dbg: spf: checking HELO (helo=mail.yuki.com, ip=22.110.92.38)
[2456] dbg: spf: query for /22.110.92.38/mail.yuki.com: result: none, comment: SPF: domain of sender mail.yuki.com does not designate mailers
[2456] dbg: spf: checking EnvelopeFrom (helo=mail.yuki.com, ip=22.110.92.38, [EMAIL PROTECTED])
[2456] dbg: spf: query for [EMAIL PROTECTED]/22.110.92.38/mail.yuki.com: result: none, comment: SPF: domain of sender [EMAIL PROTECTED] does not designate mailers

In SA local.cf I have arbitrarily tweaked the scores way up to ensure that any hit is substantial enough to be noticed...

--- local.cf ---
score SPF_PASS 10
score SPF_HELO_PASS 10
score SPF_FAIL 12
score SPF_HELO_FAIL 13
score SPF_HELO_NEUTRAL 13
score SPF_HELO_SOFTFAIL 12
score SPF_NEUTRAL 12
score SPF_SOFTFAIL 12

However, the header result in the email is:

--- email header ---
X-Spam-Status: No, score=2.047 tagged_above=-999 required=4.5 tests=[BAYES_50=0.001, RCVD_IN_SORBS_DUL=2.046]
X-Spam-Score: 2.047
X-Spam-Level: **

Still no hits... Other score changes in local.cf are effective, so if I modify RCVD_IN_SORBS_DUL that change will be apparent in the email header. Any ideas? Many thanks.

Michael Grey
FW: Bayes SQL Errors
Ryan, I just did this myself for the first time in the last week. Be sure that all your operations on the DB are done as the user who is going to be accessing it, i.e. spamassassin, spamuser, etc.

Not knowing the history of your install: in your SpamAssassin local.cf file you should have these lines, COMMENTED OUT for now... You want SpamAssassin to use the Berkeley DB for the moment.

# bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
# bayes_sql_dsn DBI:mysql:bayes:localhost:3306
# bayes_sql_username spamassassin
# bayes_sql_password spampassword
# bayes_sql_override_username spamassassin

First be sure that your BDB is actually version 3.x by doing 'sa-learn --sync'. This will ensure that the BDB format is 3.x compatible.

Next, do a 'sa-learn --backup >backup.txt'.

Create the bayes DB in MySQL, and then apply the tables using the templates (that you obviously have).

In mysql (as root) be sure to do:
- grant all privileges on bayes.* to [EMAIL PROTECTED] identified by 'spampassword'

Uncomment the bayes_* lines in SpamAssassin's local.cf, then su back to spamassassin (or whatever user is going to be accessing the DB) and run 'sa-learn --restore ./backup.txt'. This places all the entries from backup.txt into the MySQL DB. It should take a few minutes.

From your errors, it looks like the import process into your DB got messed up. As root, go into mysql>, drop the bayes DB, and start again.

Good luck...

Michael Grey

-----Original Message-----
From: Ryan Kather [mailto:[EMAIL PROTECTED]
Sent: Monday, August 21, 2006 1:29 PM
To: