Re: How do I search and capture text for use in a rule?

2021-05-07 Thread Steve Dondley

On 2021-05-07 10:33 AM, Henrik K wrote:

On Fri, May 07, 2021 at 10:19:49AM -0400, Steve Dondley wrote:
I want to extract the first part of an email address from the
"Delivered-To" header and use it within a custom rule.

Example pseudo code:

my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;

body __LOCAL_AWKWARD_INTRO /hi $first_part/i


How can I do this in my .cf file?


With a silly kludge, a full rule that matches the complete raw email with a
single regex.  Example in stock rules:

full __FROM_NAME_IN_MSG /^From:\s+([^<]\S+\s\S+)\s(?=.{1,2048}^\1\r?$)/sm


So something like (untested)

full __LOCAL_AWKWARD_INTRO /^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)/sm



Thanks. I don't quite understand the {1,2048} bit. That looks like a 
look ahead assertion up to 2048 characters? What is magical about 2048? 
What if the "Delivered-To" header is more than 2048 characters away from 
the salutation, which doesn't seem unlikely.
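
To make the question concrete, here is a minimal Python sketch of how the bounded lookahead behaves. Python's re module stands in for Perl here (re.S and re.M play the role of the /sm flags), the message text and the make_message helper are hypothetical, and nothing below says why 2048 was chosen. It only shows that the salutation must start within 2048 characters after the captured local part, otherwise the rule does not match.

```python
import re

# The untested pattern from the reply above, with Python flags instead of /sm.
# re.S lets '.' cross newlines; re.M lets '^' match at the start of any line.
PATTERN = re.compile(
    r"^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)",
    re.S | re.M,
)


def make_message(padding: int) -> str:
    """Build a hypothetical raw message with `padding` filler characters
    between the Delivered-To header and the salutation."""
    return (
        "Delivered-To: <steve@example.com>\n"
        + "X-Filler: " + "x" * padding + "\n"
        + "\n"
        + "Hi steve, we have a great offer for you\n"
    )


# Salutation well inside the 2048-character window: the rule matches.
near = PATTERN.search(make_message(100))
# Salutation more than 2048 characters away: the lookahead cannot reach it.
far = PATTERN.search(make_message(5000))

print(near.group(1) if near else None)
print(far)
```

So the concern is valid for this sketch: a message whose headers push the salutation past the window would not be hit.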


How do I search and capture text for use in a rule?

2021-05-07 Thread Steve Dondley
I want to extract the first part of an email address from the
"Delivered-To" header and use it within a custom rule.


Example pseudo code:

my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;

body __LOCAL_AWKWARD_INTRO /hi $first_part/i


How can I do this in my .cf file?


Re: More fake order spam

2021-04-27 Thread Steve Dondley

On 2021-04-27 03:03 PM, Dave Wreski wrote:
Invalid List-ID. You can then use that with other weirdness in a meta.
header __LIST_ID_DOMAIN_IN_BRACKETS List-id =~ /<([\w-]+)(\.[\w-]+)+>/
meta   LIST_ID_IMPROPER_FORMAT __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS

score  LIST_ID_IMPROPER_FORMAT 0.001
describe LIST_ID_IMPROPER_FORMAT List-id has improper format


You lost me here. The spam has this:

List-Id: MzY3NDAxMi01Nzg2LTU= 



That's not legit? It's in brackets.


It's matching on the text before the brackets.


I meant to say that it's not matching __LIST_ID_DOMAIN_IN_BRACKETS
because of the text before the brackets, so the LIST_ID_IMPROPER_FORMAT
meta rule is triggered.


OK, gotcha. But now I gotta ask: I see the host tacked onto the random 
bit of text in the brackets, but why is it significant that the part 
outside the brackets doesn't exactly match the part inside? How does 
that let us know the email is bogus?


Re: More fake order spam

2021-04-27 Thread Steve Dondley

On 2021-04-27 02:23 PM, Reindl Harald wrote:

On 27.04.21 at 19:57, Steve Dondley wrote:

On 2021-04-27 01:19 PM, Dave Wreski wrote:

Investigate adding the SEM_FRESH rules - this domain was created less
than five days ago.
https://spameatingmonkey.com/services


OK, how do I get those rules installed?


Why don't you just click on the link? There is a sample there for
copy&paste monkeys, and someone running a public mailserver is supposed
to know how local .cf files work.


I did. That's why I wrote: "I don't see anything similar for SEM rules. 
I see the page you linked to says to drop this into the config:"


Re: More fake order spam

2021-04-27 Thread Steve Dondley

On 2021-04-27 01:19 PM, Dave Wreski wrote:

-2.5 RCVD_IN_HOSTKARMA_W    RBL: Sender listed in HOSTKARMA-WHITE
                            [185.41.28.7 listed in hostkarma.junkemailfilter.com]


We've reduced this score to -1 locally.


-1.0 BAYES_00   BODY: Bayes spam probability is 0 to 1%


Needs to be trained, obviously. Bayes is best for this body content.

Looks like it's coming from some kind of bulk mail service which is 
whitelisted. Even after training with bayes, it will still be a false 
negative.


Any ideas on the best way to tackle these kinds of fake order spam?


Investigate adding the SEM_FRESH rules - this domain was created less
than five days ago.
https://spameatingmonkey.com/services


OK, how do I get those rules installed? I've only installed KAM rules 
using a channel. I don't see anything similar for SEM rules. I see the 
page you linked to says to drop this into the config:


# SEM-FRESH
urirhssub SEM_FRESH fresh.spameatingmonkey.net. A 2
body SEM_FRESH eval:check_uridnsbl('SEM_FRESH')
describe SEM_FRESH Contains a domain registered less than 5 days ago
tflags SEM_FRESH net
score SEM_FRESH 0.5

I've never seen anything like this before. Looks like this is the 
documentation for that: 
https://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_URIDNSBL.html


Should I be adding other services besides this one for urirhssub lookups?



Invalid List-ID. You can then use that with other weirdness in a meta.
header __LIST_ID_DOMAIN_IN_BRACKETS List-id =~ /<([\w-]+)(\.[\w-]+)+>/
meta   LIST_ID_IMPROPER_FORMAT __HAS_LIST_ID && !__LIST_ID_DOMAIN_IN_BRACKETS

score  LIST_ID_IMPROPER_FORMAT 0.001
describe LIST_ID_IMPROPER_FORMAT List-id has improper format


You lost me here. The spam has this:

List-Id: MzY3NDAxMi01Nzg2LTU= 

That's not legit? It's in brackets.



Investigate configuring dcc. We also created a meta that matches DCC 
and URIBLs.


Yes, on my todo list.



I believe the new Esp module that works to identify bad sendgrid
accounts also has support for sendinblue accounts, but to what extent?
X-Mailer: Sendinblue


To start, I wrote this rule that I think will probably work well because
it doesn't make sense for order information to come from a mailing list.


# fake order spam
header __LOCAL_FAKE_ORDER_SUBJ   Subject =~ /your.order/i
header __LOCAL_FAKE_ORDER_1      X-Mailer =~ /Sendinblue/i
header __LOCAL_FAKE_ORDER_2      List-Id =~ /./

# Subject mentions an order AND the mail came via Sendinblue or carries a List-Id.
meta  LOCAL_FAKE_ORDER  __LOCAL_FAKE_ORDER_SUBJ && (__LOCAL_FAKE_ORDER_1 + __LOCAL_FAKE_ORDER_2 >= 1)

score LOCAL_FAKE_ORDER 3.0





I believe later versions of SA also have more geolocation support - do
you have a need to receive mail from France?
$ whois 185.41.28.7
...
route:  185.41.28.0/22
descr:  SENDINBLUE-185-41-28-0-22
origin: AS200484

Regards,
Dave


Re: More fake order spam

2021-04-27 Thread Steve Dondley

On 2021-04-27 01:12 PM, Greg Troxel wrote:

As always, if you have a problem stemming from a dns-based or similar
reputation list, you need to report problems to those lists.

If you aren't running greylisting with aggressive delays for SBL/XBL and
moderate for dialup, do that too.


What does "aggressive delays for SBL/XBL and moderate for dialup" mean, 
exactly? Do you mean greylist long enough to give the blocklists time to 
label the spam as spam?


And what does "moderate for dialup" mean?


More fake order spam

2021-04-27 Thread Steve Dondley

Got this: https://pastebin.com/Gfz951dh

Spam report:

Content analysis details:   (-2.3 points, 5.0 required)

 pts rule name                     description
---- ----------------------------- ------------------------------------------
-2.5 RCVD_IN_HOSTKARMA_W           RBL: Sender listed in HOSTKARMA-WHITE
                                   [185.41.28.7 listed in hostkarma.junkemailfilter.com]
-1.0 BAYES_00                      BODY: Bayes spam probability is 0 to 1%
                                   [score: 0.]
-0.0 SPF_HELO_PASS                 SPF: HELO matches SPF record
 0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level
                                   mail domains are different
-0.0 SPF_PASS                      SPF: sender matches SPF record
 0.1 HTML_MESSAGE                  BODY: HTML included in message
-0.1 DKIM_VALID                    Message has at least one valid DKIM or DK
                                   signature
 0.1 DKIM_SIGNED                   Message has a DKIM or DK signature, not
                                   necessarily valid
-0.1 DKIM_VALID_AU                 Message has a valid DKIM or DK signature
                                   from author's domain
-1.0 MAILING_LIST_MULTI            Multiple indicators imply a widely-seen list
                                   manager
 2.0 LOCAL_SPAM_TLD                Domain originates a lot of spam


Looks like it's coming from some kind of bulk mail service which is 
whitelisted. Even after training with bayes, it will still be a false 
negative.


Any ideas on the best way to tackle these kinds of fake order spam?


Re: Getting "config: registryboundaries: no tlds defined, need to run sa-update" message when running mass-check

2021-04-25 Thread Steve Dondley

On 2021-04-25 01:47 PM, Henrik K wrote:

On Sun, Apr 25, 2021 at 01:28:31PM -0400, Steve Dondley wrote:


> mass-check -c parameter expects to find every config file in that single
> directory.  Now it's missing spamassassin updates and specifically
> 20_aux_tlds.cf from there.  You could copy it to /etc/spamassassin
> temporarily, but I'd rather make a completely separate directory that should
> include only the relevant *.pre and *.cf files you need for the scan.

OK, thanks. So I created a directory: /root/spam_rules

I copied over every .cf and .pre file from /etc/spamassassin into that dir
as well as every .cf and .pre file inside /var/lib/spamassassin/3.004004


Don't blindly copy all .cf files from /etc/spamassassin, there's no point
using AWL or bayes etc from that config.


OK. I'm setting up a test machine to duplicate a live machine. Not sure 
if that makes a difference or not.




I ran mass-check with "-c=~/root/spam_rules" and now get a ton of these
errors:


config: configuration file "/root/spam_rules/20_advance_fee.cf" requires
version 3.004004 of SpamAssassin, but this is code version 3.004006. Maybe
you need to use the -C switch, or remove the old config files? Skipping this
file at /root/spamassassin-3.4/masses/../lib/Mail/SpamAssassin/Conf/Parser.pm
line 414.


svn checkout http://svn.apache.org/repos/asf/spamassassin/trunk spamassassin-trunk


I have the trunk downloaded via svn, but I have no idea how to find the 
revision for 3.4.4 and roll back to it.


I ended up just downloading the 3.4.4 version from metacpan. After 
downloading and using this version, the errors have gone away.
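
For reading the error above: the numeric versions in it appear to pack major.minor.patch into three-digit groups, which is consistent with "spamassassin -V" reporting 3.4.4 while the update directory is named 3.004004. A small Python sketch under that assumption (decode_sa_version is a hypothetical helper, not part of SpamAssassin):

```python
def decode_sa_version(numeric: str) -> str:
    """Turn a packed version such as '3.004004' into dotted form.

    Assumes the fractional part holds minor and patch as two
    zero-padded three-digit groups, as the thread's numbers suggest.
    """
    major, _, frac = numeric.partition(".")
    frac = frac.ljust(6, "0")          # tolerate a short fractional part
    minor = int(frac[:3])
    patch = int(frac[3:6])
    return f"{major}.{minor}.{patch}"


# The two versions named in the error message.
rules_version = decode_sa_version("3.004004")
code_version = decode_sa_version("3.004006")
print(rules_version, code_version)
```

Read this way, the error says the copied rule files were built for 3.4.4 while the mass-check code belonged to a 3.4.6 tree, which matches the fix of switching to the 3.4.4 sources.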


Re: Getting "config: registryboundaries: no tlds defined, need to run sa-update" message when running mass-check

2021-04-25 Thread Steve Dondley





spamassassin -V reports: "SpamAssassin version 3.4.4"

I imagine I have to checkout an older 3.4.4 point version from SVN and
use the mass-check command from that. It's been ages since I've used
SVN.

How can I get to the older version via SVN?


I solved this by downloading version 3.4.4 of SA from metacpan and then 
dropping the masses/ dir with the mass-check tool from SVN into the 
3.4.4 version.


Re: Getting "config: registryboundaries: no tlds defined, need to run sa-update" message when running mass-check

2021-04-25 Thread Steve Dondley



> On Apr 25, 2021, at 1:31 PM, Axb  wrote:
> 
> What are you trying to do?
> run masscheck for your rules or for the SA project?

I’m experimenting with writing my own rules. My machines are using SA 3.4.4 so 
I want to use the 3.4.4 rules.

Re: Getting "config: registryboundaries: no tlds defined, need to run sa-update" message when running mass-check

2021-04-25 Thread Steve Dondley



mass-check -c parameter expects to find every config file in that single
directory.  Now it's missing spamassassin updates and specifically
20_aux_tlds.cf from there.  You could copy it to /etc/spamassassin
temporarily, but I'd rather make a completely separate directory that should
include only the relevant *.pre and *.cf files you need for the scan.


OK, thanks. So I created a directory: /root/spam_rules

I copied over every .cf and .pre file from /etc/spamassassin into that 
dir as well as every .cf and .pre file inside 
/var/lib/spamassassin/3.004004


I ran mass-check with "-c=~/root/spam_rules" and now get a ton of these 
errors:



config: configuration file "/root/spam_rules/20_advance_fee.cf" requires 
version 3.004004 of SpamAssassin, but this is code version 3.004006. 
Maybe you need to use the -C switch, or remove the old config files? 
Skipping this file at 
/root/spamassassin-3.4/masses/../lib/Mail/SpamAssassin/Conf/Parser.pm 
line 414.
config: configuration file "/root/spam_rules/20_body_tests.cf" requires 
version 3.004004 of SpamAssassin, but this is code version 3.004006. 
Maybe you need to use the -C switch, or remove the old config files? 
Skipping this file at 
/root/spamassassin-3.4/masses/../lib/Mail/SpamAssassin/Conf/Parser.pm 
line 414.
config: configuration file "/root/spam_rules/20_compensate.cf" requires 
version 3.004004 of SpamAssassin, but this is code version 3.004006. 
Maybe you need to use the -C switch, or remove the old config files? 
Skipping this file at 
/root/spamassassin-3.4/masses/../lib/Mail/SpamAssassin/Conf/Parser.pm 
line 414.
config: configuration file "/root/spam_rules/20_dnsbl_tests.cf" requires 
version 3.004004 of SpamAssassin, but this is code version 3.004006. 
Maybe you need to use the -C switch, or remove the old config files? 
Skipping this file at 
/root/spamassassin-3.4/masses/../lib/Mail/SpamAssassin/Conf/Parser.pm 
line 414.



spamassassin -V reports: "SpamAssassin version 3.4.4"

I imagine I have to checkout an older 3.4.4 point version from SVN and 
use the mass-check command from that. It's been ages since I've used 
SVN.


How can I get to the older version via SVN?


Getting "config: registryboundaries: no tlds defined, need to run sa-update" message when running mass-check

2021-04-25 Thread Steve Dondley

I'm running this command:

./mass-check -n --rules='^LOCAL_AWK_INTRO' -o ham:dir:/spam/Maildir/.INBOX*  -c=/etc/spamassassin/ | grep '.  1'



Everything appears to work as expected but I'm getting this 
warning/error when I do:


"config: registryboundaries: no tlds defined, need to run sa-update"

Running sa-update doesn't fix the problem and a search didn't uncover 
anything useful.


Re: Two different machines running same version of SA giving different scores for scores that are commented out

2021-04-25 Thread Steve Dondley

On 2021-04-25 10:19 AM, RW wrote:

On Sun, 25 Apr 2021 00:40:59 -0400
Steve Dondley wrote:




On both machines, /usr/share/spamassassin/72_active.cf has this rule
which is commented out:



This is the legacy rule directory from  before sa-update existed.

Have you not got another directory populated by sa-update?


Yeah, I got it working after Reindl gave me a clue. Thanks.


Re: Two different machines running same version of SA giving different scores for scores that are commented out

2021-04-25 Thread Steve Dondley

On 2021-04-25 05:57 AM, Reindl Harald wrote:

On 25.04.21 at 07:09, Steve Dondley wrote:

That rule has this line in the 72_active.cf file:


Look in 72_scores.cf and compare the modification dates on that file.

Their scores as of today (saturday):

72_scores.cf:score FSL_BULK_SIG              0.001 0.001 0.001 0.001
72_scores.cf:score PP_MIME_FAKE_ASCII_TEXT   0.999 0.837 0.999 0.837


The date is Jan 30, 2020. I'm running SA 3.4.4 (the version supplied
by backports on my debian machine).

It's time to learn about basics like sa-update and where the stuff is
located.


OK, heh. I had totally forgotten about SA updates and what they do. 
After figuring out sa-update and getting it working properly on both 
machines, the scores are the same now. Thanks.





Re: Two different machines running same version of SA giving different scores for scores that are commented out

2021-04-24 Thread Steve Dondley

On 2021-04-25 01:00 AM, John Hardin wrote:

On Sun, 25 Apr 2021, Steve Dondley wrote:

I'm running the same version of SA on the same email on two different
machines and getting different scores for some rules in the report:


Machine A gives: 0.0 FSL_BULK_SIG   Bulk signature with no Unsubscribe
Machine B gives: 1.0 FSL_BULK_SIG   Bulk signature with no Unsubscribe


On both machines, /usr/share/spamassassin/72_active.cf has this rule
which is commented out:


...

Machine A: 0.3 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
Machine B: 1.0 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII


That rule has this line in the 72_active.cf file:


Look in 72_scores.cf and compare the modification dates on that file.

Their scores as of today (saturday):

72_scores.cf:score FSL_BULK_SIG              0.001 0.001 0.001 0.001
72_scores.cf:score PP_MIME_FAKE_ASCII_TEXT   0.999 0.837 0.999 0.837


The date is Jan 30, 2020. I'm running SA 3.4.4 (the version supplied by 
backports on my debian machine).


Two different machines running same version of SA giving different scores for scores that are commented out

2021-04-24 Thread Steve Dondley
I'm running the same version of SA on the same email on two different
machines and getting different scores for some rules in the report:


Machine A gives: 0.0 FSL_BULK_SIG   Bulk signature with no Unsubscribe
Machine B gives: 1.0 FSL_BULK_SIG   Bulk signature with no Unsubscribe


On both machines, /usr/share/spamassassin/72_active.cf has this rule
which is commented out:


#score FSL_BULK_SIG  3.000   # limit

Similarly, for another rule that's commented out, I'm getting:

Machine A: 0.3 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
Machine B: 1.0 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII


That rule has this line in the 72_active.cf file:

#score PP_MIME_FAKE_ASCII_TEXT  1.0


It appears Machine A is somehow caching the old scores for rules that 
have been commented out. Restarting spamassassin daemon doesn't help. 
The command I'm running to generate the report is:


spamc -R < /spam/Maildir/.Spam/cur/1619286920.M132164P23787.email.dondley.com\,S\=5093\,W\=5214\:2\,S


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread Steve Dondley





And if you want to test your rules against a corpus rather than
testing against a few one-off spamples, then look into setting up a
local masscheck instance. You don't need to upload the results to SA,
but it will give you a good overview of how a rule behaves against
multiple messages.


I'm not sure what you mean by "Local masscheck instance". But I plan to 
do the following:


1) set up SA in a docker container which has a volume containing my
spam/ham folders
2) run a script that syncs ham/spam with the live server
3) set up a script that will compare scores before a rule is implemented
with scores after it is implemented
4) have the script output a report that tells me the results and
whether a spam/ham email is "flipped"
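
The comparison step in the plan above could be sketched roughly like this in Python. It is only an illustration: find_flips and the sample score maps are hypothetical, it assumes the scores have already been collected per message before and after the rule change, and it treats a message as spam when its score reaches the 5.0 "required" value seen in the reports in this thread.

```python
from typing import Dict, List, Tuple

THRESHOLD = 5.0  # "required" value shown in the reports in this thread


def find_flips(
    before: Dict[str, float],
    after: Dict[str, float],
    threshold: float = THRESHOLD,
) -> List[Tuple[str, float, float]]:
    """Return (message, old_score, new_score) for every message whose
    spam/ham verdict changes between the two scoring runs."""
    flips = []
    # Only compare messages that were scored in both runs.
    for name in sorted(before.keys() & after.keys()):
        was_spam = before[name] >= threshold
        is_spam = after[name] >= threshold
        if was_spam != is_spam:
            flips.append((name, before[name], after[name]))
    return flips


# Hypothetical scores keyed by message file name.
old_scores = {"ham-001": 1.2, "ham-002": 4.1, "spam-001": 4.5, "spam-002": 7.0}
new_scores = {"ham-001": 1.2, "ham-002": 5.3, "spam-001": 6.0, "spam-002": 7.0}

for name, old, new in find_flips(old_scores, new_scores):
    print(f"{name}: {old} -> {new}")
```

In this made-up data, ham-002 flipping to spam would be the false positive to investigate, while spam-001 flipping is the intended effect of the new rule.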


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread Steve Dondley

On 2021-04-23 05:41 PM, Martin Gregorie wrote:

On Fri, 2021-04-23 at 16:28 -0400, Steve Dondley wrote:

I'm experimenting with writing a library of my own SA rules and
scores.


I do this on a separate computer, which has Spamassassin installed but
not linked into anything else. It also has a copy of all the live SA
configuration files. Alongside this I have a directory filled with
examples of spam to function as testing input.

Along with this I have a bash script or two which are used to do things like:

1) start SA in debug mode to check the testing config for errors. 
   No messages are processed - its just looking for configuration
   errors.

2) run SA against a spam sample and only display the list of spam hits

3) run SA against a spam sample and display the entire output message
   using less so it can be scrolled through

4) run SA against the complete spam collection and only display
   references to messages which are not scored as spam

5) replace the live SA configuration with the current testing
  configuration, i.e. make the most recent set of changes live.

In practice (1) through (3) are easy to combine into a single script
with an option to select the required action while (4) and (5) are best
kept separate.

It helps a lot to name the items in the spam collection to relate
each set of similar spam to the local rule that's intended to trap this
spam type.
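
As an illustration of step (4) above, here is a minimal Python sketch. It is not the bash script described in this message: it assumes each sample has already been run through SA so that it carries an X-Spam-Status header in the format quoted elsewhere in this archive, and parse_spam_status and missed_spam are hypothetical helper names.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches headers such as:
#   X-Spam-Status: No, score=0.9 required=5.0 tests=BAYES_20,...
STATUS_RE = re.compile(
    r"^X-Spam-Status:\s+(Yes|No),\s+score=(-?\d+(?:\.\d+)?)"
    r"\s+required=(-?\d+(?:\.\d+)?)",
    re.M,
)


def parse_spam_status(message: str) -> Optional[Tuple[bool, float, float]]:
    """Return (is_spam, score, required), or None if the header is absent."""
    m = STATUS_RE.search(message)
    if m is None:
        return None
    return m.group(1) == "Yes", float(m.group(2)), float(m.group(3))


def missed_spam(messages: Dict[str, str]) -> List[str]:
    """Names of scanned spam samples that were not scored as spam."""
    missed = []
    for name, text in sorted(messages.items()):
        status = parse_spam_status(text)
        if status is None or not status[0]:
            missed.append(name)
    return missed


# Hypothetical scanned samples, named after the rule meant to catch them.
samples = {
    "fake_order-01": "X-Spam-Status: Yes, score=6.2 required=5.0 tests=X\n\nbody",
    "fake_order-02": "X-Spam-Status: No, score=0.9 required=5.0 tests=Y\n\nbody",
}
print(missed_spam(samples))
```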


I'd like to be sure that the rules I write don't turn ham into spam
and vice versa.


It won't if you test the rules against related spam and give some
thought to the score you apply to each rule.


I imagine a utility like this must exist so figured I'd ask here
before re-inventing the wheel and writing my own (probably buggy)
script.


The sort of scripts I use are fairly short and simple.


The script would need to check against all email files in .INBOX.* and
.Spam directory in a user's IMAP directory.


No. Treat this like any other code development project: use a rule
development SA installation like I describe so you never develop rules
using the live mail stream. This way your rules will be better written
and tested and you'll cause fewer false positives in your live mail
stream.

Martin


Sounds like the best plan. Thanks for the advice.


Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-23 Thread Steve Dondley
I'm experimenting with writing a library of my own SA rules and scores. 
I'd like to be sure that the rules I write don't turn ham into spam and 
vice versa. I figured the best way to do this would be to run SA against 
an existing collection of ham and spam to make sure emails are still 
scored accurately with the new rules.


I imagine a utility like this must exist so figured I'd ask here before
re-inventing the wheel and writing my own (probably buggy) script.


The script would need to check against all email files in .INBOX.* and 
.Spam directory in a user's IMAP directory.


Thanks again, everyone.


Re: Why single periods in regex in spamassassin rules?

2021-04-23 Thread Steve Dondley

On 2021-04-23 01:37 PM, Henrik K wrote:

On Fri, Apr 23, 2021 at 01:03:33PM -0400, Steve Dondley wrote:

I'm looking at KAM.cf. There is this rule:

body __KAM_WEB2  /INDIA based IT|indian.based.website|certified.it.company/i

I'm wondering if there is a good reason why a single period is used instead
of something like \s+ which would catch multiple spaces whereas a single
period doesn't.


It would make no difference, because in the body consecutive spaces are
normalized into single spaces.

https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced
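
A small Python sketch of the point above. The normalize_body function is only a rough stand-in written for this example (it is an assumption, not SpamAssassin's actual code), and the sample text is made up. It shows that once whitespace runs are collapsed, a single period and \s+ hit the same text.

```python
import re


def normalize_body(text: str) -> str:
    """Collapse every run of whitespace (including newlines) into one space,
    approximating the normalization described above."""
    return re.sub(r"\s+", " ", text).strip()


# One alternative from the KAM rule, written both ways.
DOT_RULE = re.compile(r"indian.based.website", re.I)
WS_RULE = re.compile(r"indian\s+based\s+website", re.I)

raw = "We are an Indian   based\n website company"
body = normalize_body(raw)

# Against the raw text the single period is not enough...
print(DOT_RULE.search(raw))
# ...but against the normalized body both forms match the same span.
print(DOT_RULE.search(body).group(0))
print(WS_RULE.search(body).group(0))
```

The period also matches non-space separators such as "-" or "_", which may be another reason to prefer it, though that is a guess rather than something stated in the thread.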


Makes sense. And thanks for the link. I was looking for some kind of
guidance on writing rules. Google didn't help much.


Re: how to disable spamcheck for Outgoing mail

2021-04-23 Thread Steve Dondley
On 2021-04-23 01:02 PM, mau...@gmx.ch wrote:

> Hello
>
> How is it possible to disable the spam check for mail sent from the
> "private to public" network?
>
> I really thought that enabling the trusted network would let it pass.
>
> trusted_networks 192.168.28.
>
> thanks

Are you using postfix? If so, you can do something like this: 

submission inet  n - y - - smtpd
 -o content_filter=spamassassin

Why single periods in regex in spamassassin rules?

2021-04-23 Thread Steve Dondley

I'm looking at KAM.cf. There is this rule:

body __KAM_WEB2  /INDIA based IT|indian.based.website|certified.it.company/i


I'm wondering if there is a good reason why a single period is used
instead of something like \s+ which would catch multiple spaces whereas
a single period doesn't.




Re: SA seems powerless against marketing emails for SEO/web development

2021-04-23 Thread Steve Dondley




I could add another point between BAYES_999 and BAYES_99 scores but
that seems reactionary. Is there a better way? Should I throw in
another point for certain keywords in marketing emails like these?


add score to tags that score positive 0.0
until it gives 5.0 and above


I like this idea. Seems reasonable. Thanks.


Re: SA seems powerless against marketing emails for SEO/web development

2021-04-22 Thread Steve Dondley

On 2021-04-22 02:31 PM, Matus UHLAR - fantomas wrote:

On 22.04.21 14:21, Steve Dondley wrote:

 pts rule name                  description
---- -------------------------- --------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE         RBL: Sender listed at https://www.dnswl.org/,
                                no trust
                                [209.85.210.44 listed in list.dnswl.org]
-1.0 BAYES_00                   BODY: Bayes spam probability is 0 to 1%
                                [score: 0.]
-0.0 SPF_PASS                   SPF: sender matches SPF record
 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends
                                in digit
                                [margaretkelly866[at]gmail.com]
 0.0 FREEMAIL_FROM              Sender email is commonly abused enduser mail
                                provider
                                [margaretkelly866[at]gmail.com]
 0.0 SPF_HELO_NONE              SPF: HELO does not publish an SPF Record
-0.0 RCVD_IN_MSPIKE_H3          RBL: Good reputation (+3)
                                [209.85.210.44 listed in wl.mailspike.net]
 0.0 HTML_MESSAGE               BODY: HTML included in message
-0.1 DKIM_VALID_EF              Message has a valid DKIM or DK signature from
                                envelope-from domain
 0.1 DKIM_SIGNED                Message has a DKIM or DK signature, not
                                necessarily valid
-0.1 DKIM_VALID_AU              Message has a valid DKIM or DK signature from
                                author's domain
-0.1 DKIM_VALID                 Message has at least one valid DKIM or DK
                                signature
-0.0 RCVD_IN_MSPIKE_WL          Mailspike good senders

This email is a bit of an outlier, as most of these emails get flagged
with bayes_99 and bayes_999, but this one actually gets bayes_00.


My bayes filter has been trained with about 2000 examples of spam and 
ham.


now, train as needed - this one as spam.


OK, so I fixed my configuration issue. So now the bayes filtering is 
working when I flag an email as spam in my mail client:


Content analysis details:   (4.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.0 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]


But as you can see, the email is still not hitting the 5.0 threshold.

I could add another point between BAYES_999 and BAYES_99 scores but that
seems reactionary. Is there a better way? Should I throw in another
point for certain keywords in marketing emails like these?
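
The arithmetic behind that question, as a tiny Python sketch. The two scores and the threshold are copied from the report above; the variable names are just for illustration.

```python
REQUIRED = 5.0  # threshold from the report above

# Rule hits and their scores as shown in the report above.
hits = {"BAYES_999": 1.0, "BAYES_99": 3.5}

total = sum(hits.values())
shortfall = REQUIRED - total

print(f"total={total:.1f} shortfall={shortfall:.1f}")
```

So even with both Bayes rules firing, the message still needs another half point from some other rule to be tagged.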


SA seems powerless against marketing emails for SEO/web development

2021-04-22 Thread Steve Dondley
For whatever reason, solicitations from marketers for various web
development services are easily slipping through my defenses. I figured
bayes filtering would eventually do the job, but after reporting them
for many days now, I'm still getting three to half a dozen a day. Here's
one example: https://paste.debian.net/1194735/


The report for this email:

Content analysis details:   (-1.0 points, 5.0 required)

 pts rule name                  description
---- -------------------------- --------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE         RBL: Sender listed at https://www.dnswl.org/,
                                no trust
                                [209.85.210.44 listed in list.dnswl.org]
-1.0 BAYES_00                   BODY: Bayes spam probability is 0 to 1%
                                [score: 0.]
-0.0 SPF_PASS                   SPF: sender matches SPF record
 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends
                                in digit
                                [margaretkelly866[at]gmail.com]
 0.0 FREEMAIL_FROM              Sender email is commonly abused enduser mail
                                provider
                                [margaretkelly866[at]gmail.com]
 0.0 SPF_HELO_NONE              SPF: HELO does not publish an SPF Record
-0.0 RCVD_IN_MSPIKE_H3          RBL: Good reputation (+3)
                                [209.85.210.44 listed in wl.mailspike.net]
 0.0 HTML_MESSAGE               BODY: HTML included in message
-0.1 DKIM_VALID_EF              Message has a valid DKIM or DK signature from
                                envelope-from domain
 0.1 DKIM_SIGNED                Message has a DKIM or DK signature, not
                                necessarily valid
-0.1 DKIM_VALID_AU              Message has a valid DKIM or DK signature from
                                author's domain
-0.1 DKIM_VALID                 Message has at least one valid DKIM or DK
                                signature
-0.0 RCVD_IN_MSPIKE_WL          Mailspike good senders

This email is a bit of an outlier, as most of these emails get flagged
with bayes_99 and bayes_999, but this one actually gets bayes_00.



My bayes filter has been trained with about 2000 examples of spam and 
ham.


Not sure what to do at this point. I'm thinking about scoring up emails
if they mention stuff like "SEO", "web design" etc. but I'm not sure if
this is the best approach. Feels like a thumb-in-the-dike approach.


Re: DCC license

2021-04-22 Thread Steve Dondley




The DCC FAQ at https://www.dcc-servers.net/dcc/FAQ.html#license
describes the definitive ways to get any questions answered regarding
DCC licensing. Any answers you could get here would be conjecture and
anecdote.


I found a form on their website for licensing questions. Waiting to hear 
back.


DCC license

2021-04-22 Thread Steve Dondley

Sorry if this is a bit off-topic.

I'm looking into installing DCC (Distributed Checksum Clearinghouse)
software.


The page at https://www.dcc-servers.net/dcc/INSTALL.html says:

"The free license is intended to cover individuals and organizations 
including Internet service providers using DCC to filter their own mail. 
Organizations selling anti-spam appliances or managed mail services are 
not eligible for the free license."


However, when I look at the actual LICENSE file that ships with the 
software, it says:


 * Permission to use, copy, modify, and distribute this software without
 * changes for any purpose with or without fee is hereby granted, provided
 * that the above copyright notice and this permission notice appear in all
 * copies and any distributed versions or copies are either unchanged
 * or not called anything similar to "DCC" or "Distributed Checksum
 * Clearinghouse".
 *
 * __
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL
 * WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC
 * BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
 * OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
 * WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
 * ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 *


I don't see anything in there about disallowing usage of the software by 
"managed mail services."


Re: pyzor

2021-04-21 Thread Steve Dondley

On 2021-04-21 11:00 AM, Eric Broch wrote:

Does anyone have a solution to this:

spamd[]: pyzor: check failed: internal error, python traceback
seen in response

I have this in my local.cf

#pyzor
use_pyzor 1
pyzor_path /usr/bin/pyzor


I don't have this in my config at all. Maybe you are following outdated 
advice?


Make sure you have the pyzor plugin line uncommented:

loadplugin Mail::SpamAssassin::Plugin::Pyzor

Also, ensure you have installed the pyzor package on your OS.


Spoofed amazon order email

2021-04-16 Thread Steve Dondley
First, thanks to everyone on the list who has given me a hand over the
past couple of weeks as I get my "sea legs" with spamassassin. It's
working well for me now but I obviously still have more to learn.


For one, I'm still uncertain on the best way to fine tune SA to beat 
back some tricky spam. Like this one that comes from a gmail account but 
spoofs a fake, expensive order on amazon to try to phish the user.


Return-Path: 
Delivered-To: s...@dondley.com
Received: from email.dondley.com
by email.dondley.com with LMTP
id Ev9rGkyheWBeegAAB604Gw
(envelope-from )
for ; Fri, 16 Apr 2021 10:38:04 -0400
Received: by email.dondley.com (Postfix, from userid 115)
id 5EFD521516; Fri, 16 Apr 2021 10:38:04 -0400 (EDT)
Authentication-Results: email.dondley.com;
    dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Fi/GiyLT";
    dkim-atps=neutral
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on email.dondley.com
X-Spam-Level:
X-Spam-Status: No, score=0.9 required=5.0 tests=BAYES_20,DKIM_SIGNED,
    DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,GB_FROM_NAME_FREEMAIL,
    HTML_MESSAGE,MIME_HTML_MOSTLY,NAME_EMAIL_DIFF,RCVD_IN_DNSWL_NONE,
    RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS
    shortcircuit=no autolearn=no autolearn_force=no version=3.4.2
X-Spam-Language: en
Received-SPF: Pass (mailfrom) identity=mailfrom;
    client-ip=209.85.216.54; helo=mail-pj1-f54.google.com;
    envelope-from=gk5751...@gmail.com; receiver=
Received: from mail-pj1-f54.google.com (mail-pj1-f54.google.com [209.85.216.54])
    by email.dondley.com (Postfix) with ESMTPS id 9DFB9210C1
    for ; Fri, 16 Apr 2021 10:37:53 -0400 (EDT)
Received: by mail-pj1-f54.google.com with SMTP id kb13-20020a17090ae7cdb02901503d67f0beso3185770pjb.0
    for ; Fri, 16 Apr 2021 07:37:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=20161025;
    h=message-id:date:from:mime-version:subject:to;
    bh=tbWgclEtavQLHj3b2u0ycLuH4u7X12CkOv+d/W8zWrs=;
    b=Fi/GiyLThBU+Sf1M8Thsh4lWYqGeC2mX1d6uL+5grFufl8EA68jtMePxe1TsIetKPj
     oCRdmdkjvxAGFA0Uny2lttK9Xhpmoa38zO0rLmFLN+tzKTHYuKKoiQx6ugByfCpk6A82
     QDyDgRp7HpEkA34ztYXqR9Q0MH8eTPPaK7iNTbdq2Sb78PYR+XNX9UVDnWarVSmlQm6N
     EwrQKnzaaT4WKuUrmXS8tkGJMLLfWxLQAu0oCxbKwDkjW7yLMVYGl1Zhk7tNjoi2Hk2r
     xywZ0v6AyAbSTawCrUN052ps4xjKR/o0CLHrkk+FLbu9wENYbhrDNb/HMRu20aTzEgHn
     AvZA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20161025;
    h=x-gm-message-state:message-id:date:from:mime-version:subject:to;
    bh=tbWgclEtavQLHj3b2u0ycLuH4u7X12CkOv+d/W8zWrs=;
    b=D4cfDeHF3n8JokVklJNHvyFD04InVRxq/DLHtB+xrMenRQZDQPHMqH5KdJBAgs4hAD
     hc1YTl90K8wFUUAicyyzwhAzBTJqqCtmOZJczjjoXj9WXxEBqiJvgB5m2H+UvTejEX/0
     AA/Exf6uvfuGP5hsrp7o4i22DBc/FlZDVArJt7wN+u+zjO1+rRFgrfbW6fdWzgYkb6Y2
     jV/JTQywhNxSY6XaOSd4AA1i9ZC8LOaqkOLabUy1WI7uEWDOvzaO4MZuBzHi23vmdHlA
     weh507+u6rXpN6BarAXZEZxnC+yev86JRqtQjJZL5qTpbjhb2s/1g6wSeRNF1Ri7qIXs
     zbfA==
X-Gm-Message-State: 
AOAM5322u+9pAxfsMRqYaM8FgbXE+0nBCEZeqd286+mfRDrabuuIhCVe

CLSzPPcNsg+v2Px14I1WF9r5vuoVLtg=
X-Google-Smtp-Source: 
ABdhPJw1ixhEhS6bCqFtjizgrTxFo6mCL1fEQPBSzQxIDGkIqIwR7np7Mgjy6ap0Lx6VHje5LfeKwQ==
X-Received: by 2002:a17:90a:5407:: with SMTP id 
z7mr10416174pjh.228.1618583872037;

Fri, 16 Apr 2021 07:37:52 -0700 (PDT)
Received: from 
1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa 
([104.143.92.92])
by smtp.gmail.com with ESMTPSA id 
t15sm5203451pgh.33.2021.04.16.07.37.49

for 
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
Fri, 16 Apr 2021 07:37:51 -0700 (PDT)
Message-ID: <6079a13f.1c69fb81.a9651.e...@mx.google.com>
Date: Fri, 16 Apr 2021 07:37:51 -0700 (PDT)
From: "or...@amazon.com" 
X-Google-Original-From: "or...@amazon.com" 
Content-Type: multipart/alternative; 
boundary="===2707982310301423984=="

MIME-Version: 1.0
Subject: IVK-1250703-9254770 | Apple Watch Series 6 Order Now Confirmed
To: s...@dondley.com

--===2707982310301423984==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Hello there, S!

This is a test template...

--===2707982310301423984==
Content-Type: text/html; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit




href="https://go.pardot.com/unsubscribe/u/272832/9445773a5f7e92b64a4b106d30d12be4ec08e6d19850125ed1a094fe7f00100f/734801457"; 
target="_blank">List-Unsubscribe


cellspacing="0" cellpadding="0" align="center">













Your Order  | Your 
Account | Amazon.com
ORDER NUMBER
# 
IVK-1250703-9254770













Dear 
S
Thank you for shopping 
with us. You have ordered the Apple Watch Series 6 Space Gray 44 mm GPS + Cellular
In-case you require any 
change in order or like to ca
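
One signal in the sample above is that the From display name is itself an amazon.com address, while SPF, DKIM and the envelope sender all point at gmail.com. The report shows NAME_EMAIL_DIFF and GB_FROM_NAME_FREEMAIL firing, which look related. Here is a minimal sketch of that check outside SA, just to illustrate the idea. The function name and the sample headers are hypothetical (the real addresses are obfuscated in the archive), and this is not how SA implements those rules.

```python
import re
from email.utils import parseaddr

# Matches an email-like string and captures its domain part.
ADDR_IN_NAME = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")


def display_name_spoofs_domain(from_header: str) -> bool:
    """Return True if the display name contains an address whose domain
    differs from the domain of the real From address."""
    name, addr = parseaddr(from_header)
    match = ADDR_IN_NAME.search(name)
    if not match or "@" not in addr:
        return False
    shown_domain = match.group(1).lower()
    real_domain = addr.rsplit("@", 1)[1].lower()
    return shown_domain != real_domain


# Hypothetical headers shaped like the sample above.
spoofed = '"orders@amazon.example" <someone@gmail.example>'
honest = '"Jane Doe" <jane@gmail.example>'
print(display_name_spoofs_domain(spoofed))
print(display_name_spoofs_domain(honest))
```

On its own this would also hit legitimate mail sent on someone's behalf, so it probably only makes sense combined with other signals, for example in a meta rule.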

Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-12 Thread Steve Dondley
On 2021-04-12 03:11 AM, Matthias Leisi wrote:

> -2.0 RCVD_IN_DNSWL_HI   RBL: Sender listed at
> https://www.dnswl.org/,
> high trust
> [203.160.71.180 listed in list.dnswl.org [1]] I looked up this, and the other 
> one, and didn't find them in dnswl.   As
> others said, if you are using public DNS, stop doing that immediately.
> And, run the dnswl queries with dig or host yourself on your own machine.

  Answering to this mail, I could have used any of the others. 

At dnswl.org [2], we have the fair use policy of 100'000 queries per 24
hours. Those that are consistently way above this threshold (either on
an individual IP, within a block of IPs or spread over multiple blocks)
may get blocked. 

„Blocked" is not straightforward in DNS - if you simply return REFUSE
status code, resolvers may retry on other nameservers, thus effectively
multiplying the (useless) traffic. To avoid this, we have a number of
strategies: 

* „pass" - for those we don't want to block
* „parentblock" - we do not return the actual NS records for the
list.dnswl.org [1] zone from the parent zone; all A records in
*.list.dnswl.org [1] return 127.0.0.255 and a corresponding TXT record -
that's the default strategy for most part of those that query above the
fair use threshold.
* „refuse" - see above, rarely used
* „empty" - we return NXDOMAIN. Not currently used.
* „ignore" - we don't return anything. Not currently used.
* „returnhi" - for those that try to evade „parentblock" (eg by
directly querying list.dnswl.org [1] nameservers), or who do not take
action after long times of „parentblock" (and which also did not change
behaviour on „refuse"), we return „hi" in order to make them go away
eventually.
* We may choose to escalate from single IPs to eg v4-/24 or
v6-/48-or-larger for active evaders (eg frequently changing nameserver
IPs), and we may also use „returnhi". Interestingly, we have a surprisingly
high number of „returnhi" cases which have been querying us for _years_
without a change in behaviour. From time to time we change them to one
of the other strategies. It would be interesting to dig in what they are
actually thinking... 

It's likely that the OP is using a nameserver where we have „returnhi". 

Obviously the advice given in this thread (use a local caching resolver
that does not forward queries) is correct and will make that problem magically
go away :) 

-- Matthias 

Ah, thank you for the explanation. 

Following the advice on this list, I set up a locally running
DNS server. Since then, I have not seen the problem of _HI scores
in my spam emails. 
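
For anyone who wants to check a dnswl answer by hand (with dig or host, as suggested above), here is a small sketch of how the query name is built and how an answer might be read. The 127.0.0.255 "blocked" answer comes from the explanation above. The rest of the encoding (127.0.x.y, with the last octet as a trust level from 0 to 3) is my understanding of the dnswl.org documentation and should be treated as an assumption. The function names are made up for illustration.

```python
import ipaddress

# Assumed mapping of the last octet to a trust level (see the lead-in).
TRUST_LEVELS = {0: "none", 1: "low", 2: "med", 3: "hi"}


def dnswl_query_name(ip: str, zone: str = "list.dnswl.org") -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def decode_dnswl_answer(answer: str) -> str:
    """Classify an A-record answer from the list."""
    a, b, category, score = (int(part) for part in answer.split("."))
    if (a, b) != (127, 0):
        return "unexpected"
    if (category, score) == (0, 255):
        return "blocked"  # over the fair-use threshold ("parentblock")
    return TRUST_LEVELS.get(score, "unexpected")


print(dnswl_query_name("203.160.71.180"))
print(decode_dnswl_answer("127.0.0.255"))
```

Note that a "returnhi" answer looks like an ordinary high-trust listing, so nothing like this can detect it from the answer alone. Comparing against the dnswl.org web lookup, or querying through your own resolver, is what exposes it.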

Links:
--
[1] http://list.dnswl.org
[2] http://dnswl.org

Re: Using spamassassin to thwart sharepoint phishing attacks

2021-04-12 Thread Steve Dondley




However, in 50_scores.cf, this line is commented out:

#score RCVD_IN_SORBS_SPAM 0 0.5 0 0.5

Maybe that's the problem?


no, there are other SORBS lists used:

score RCVD_IN_SORBS_DUL 0 0.001 0 0.001 # n=0 n=2
score RCVD_IN_SORBS_HTTP 0 2.499 0 0.001 # n=0 n=2
score RCVD_IN_SORBS_MISC 0 # n=0 n=1 n=2 n=3
score RCVD_IN_SORBS_SMTP 0 # n=0 n=1 n=2 n=3
score RCVD_IN_SORBS_SOCKS 0 2.443 0 1.927 # n=0 n=2
#score RCVD_IN_SORBS_SPAM 0 0.5 0 0.5
score RCVD_IN_SORBS_WEB  0 1.5 0 1.5
score RCVD_IN_SORBS_ZOMBIE 0 # n=0 n=1 n=2 n=3


have you set up own caching, non-forwarding DNS server?


Yes. And my SA scores have improved about 100% since I did this.


Re: Using spamassassin to thwart sharepoint phishing attacks

2021-04-11 Thread Steve Dondley




sorbs dnsbl missing, have you denied sorbs.net results ?, or is
spamassassin not testing sorbs.net anymore ?


Best I can tell, my SA config should be testing for sorbs. I've got this 
line in /etc/spamassassin/v3220.pre:


loadplugin Mail::SpamAssassin::Plugin::DNSEval

And in /usr/share/spamassassin/20_dnsbl_test.cf, I've got:

ifplugin Mail::SpamAssassin::Plugin::DNSEval

I see a bunch of SORBS rules in there.

However, in 50_scores.cf, this line is commented out:

#score RCVD_IN_SORBS_SPAM 0 0.5 0 0.5

Maybe that's the problem?




Re: Using spamassassin to thwart sharepoint phishing attacks

2021-04-11 Thread Steve Dondley




Also, I've heard of sorbs over the years but I'm not sure exactly what
it is. Is this the same block list run by Cisco?


OK, I was getting SORBS confused with SenderBase Reputation Score 
(SBRS). That's the one run by Cisco, I believe.


I actually have an account on the SORBS website that I set up long ago.


Re: Using spamassassin to thwart sharepoint phishing attacks

2021-04-11 Thread Steve Dondley





sorbs dnsbl missing, have you denied sorbs.net results ?, or is
spamassassin not testing sorbs.net anymore ?


How would I check if it's turned on? I tried grepping in 
/etc/spamassassin on "sorb" (case insensitive) and found nothing. So I 
guess it's not in my default config.


I see many mentions of "SORBS" in /usr/share/spamassassin, however. I'm 
guessing I may not have a needed SA plugin enabled. I'll try to figure 
out how to do it.


Also, I've heard of sorbs over the years but I'm not sure exactly what 
it is. Is this the same block list run by Cisco?


Re: Is pyzor recommended by folks on this list?

2021-04-11 Thread Steve Dondley




Second, I'm not sure if my tests will work on my spam samples which
have the spam encapsulated with the "report_safe" setting set to a
value of "1".


I wouldn't expect it to work at all. "report_safe" encapsulation
creates a new email which isn't a spam.


From what I read on pyzor's home page and how it works, pyzor strips off 
all headers. So I would assume it doesn't matter if it's encapsulated. I 
could be, and quite likely am, totally wrong about this, of course.


Re: Using spamassassin to thwart sharepoint phishing attacks

2021-04-11 Thread Steve Dondley

On 2021-04-11 04:19 PM, Benny Pedersen wrote:

On 2021-04-11 22:09, Steve Dondley wrote:


Content analysis details:   (4.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.5 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [52.100.189.222 listed in wl.mailspike.net]
-0.0 RCVD_IN_DNSWL_NONE     RBL: Sender listed at https://www.dnswl.org/,
                            no trust
                            [52.100.189.222 listed in list.dnswl.org]
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.5 SUBJ_ALL_CAPS          Subject is all capitals
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK
                            signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
-0.1 DKIM_VALID_EF          Message has a valid DKIM or DK signature from
                            envelope-from domain
 0.0 UPPERCASE_50_75        message body is 50-75% uppercase


i see its as a local problem

http://multirbl.valli.org/lookup/52.100.189.222.html

do you use KAM.cf channel ?


OK, I added KAM.cf to my config. It has now pushed it over 5.0, barely:

Content analysis details:   (5.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE     RBL: Sender listed at https://www.dnswl.org/,
                            no trust
                            [52.100.189.222 listed in list.dnswl.org]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.5 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [52.100.189.222 listed in wl.mailspike.net]
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
 0.5 SUBJ_ALL_CAPS          Subject is all capitals
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.0 HTML_MESSAGE           BODY: HTML included in message
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK
                            signature
-0.1 DKIM_VALID_EF          Message has a valid DKIM or DK signature from
                            envelope-from domain
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid
 0.0 UPPERCASE_50_75        message body is 50-75% uppercase
 0.2 KAM_MANYTO             Email has more than one To Header or more than 25
                            recipients
 0.5 KAM_NUMSUBJECT         Subject ends in numbers excluding current years
 0.0 KAM_SHORT              Use of a URL Shortener for very short URL


Using spamassassin to thwart sharepoint phishing attacks

2021-04-11 Thread Steve Dondley
I've received about a dozen phishing attack emails from Microsoft's 
sharepoint service within the last couple of weeks. Only one of them was 
identified by SA as spam. After running the emails through sa-learn, 
they still only score a 4 to 4.5. But I could see that it would be easy 
for these emails to get classified as false positives and/or false 
negatives.


Has anyone developed a good way to identify these sharepoint phishing 
attacks without any false positives?


I'm leaning towards figuring out how I might inject some kind of 
prominent warning into the message to remind people not to click links 
they don't trust. That's not an ideal solution, but perhaps it is the 
best way to help protect users. I'm interested to hear what other 
options might be available.


Here is how SA scored one of the emails:

4.4/5.0
Spam detection software, running on the system "email.dondley.com",
has NOT identified this incoming email as spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Doris Feaster shared a file with you STRIP BANG THE 
ONLINE
   REAL & MOST POPULAR 100% TRUSTED NETWORK STRIPBANG GIVING FREE ELITE 
MEMBERSHIP

   AND 5000CR=$750 WINNER 2021 YOUR WINNING CODE - ( STBNG5000CR )

Content analysis details:   (4.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.5 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [52.100.189.222 listed in wl.mailspike.net]
-0.0 RCVD_IN_DNSWL_NONE     RBL: Sender listed at https://www.dnswl.org/,
                            no trust
                            [52.100.189.222 listed in list.dnswl.org]
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.5 SUBJ_ALL_CAPS          Subject is all capitals
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK
                            signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
-0.1 DKIM_VALID_EF          Message has a valid DKIM or DK signature from
                            envelope-from domain
 0.0 UPPERCASE_50_75        message body is 50-75% uppercase
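
To make the arithmetic in the report explicit: the total is just the sum of the per-rule points, compared against the required score. Below is a rough sketch that pulls the points out of a report like the one above and adds them up. The regex and function name are my own. The displayed points are rounded to one decimal, so on other messages the sum can differ slightly from the total SA prints.

```python
import re
from decimal import Decimal

# A data row starts with the points, then an upper-case rule name.
ROW = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\b")

REPORT = """\
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.5 BAYES_999
-0.0 RCVD_IN_MSPIKE_H2
-0.0 RCVD_IN_DNSWL_NONE
-0.0 SPF_HELO_PASS
-0.0 SPF_PASS
 0.5 SUBJ_ALL_CAPS
 0.0 HTML_MESSAGE
 0.1 MIME_HTML_ONLY
-0.1 DKIM_VALID
 0.1 DKIM_SIGNED
-0.1 DKIM_VALID_AU
-0.1 DKIM_VALID_EF
 0.0 UPPERCASE_50_75
"""


def rule_scores(report: str) -> dict:
    """Map rule name to points; continuation lines are skipped."""
    scores = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            scores[match.group(2)] = Decimal(match.group(1))
    return scores


scores = rule_scores(REPORT)
total = sum(scores.values(), Decimal("0"))
required = Decimal("5.0")
print(total, "spam" if total >= required else "not spam")
```

Seen this way, the two Bayes rules contribute 4.0 of the 4.4 points, so nearly everything else nets out to zero. That is why a few extra rules (or a higher BAYES_99 score) are enough to tip it over.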


Re: Is pyzor recommended by folks on this list?

2021-04-11 Thread Steve Dondley

On 2021-04-11 03:09 PM, Bill Cole wrote:

On 11 Apr 2021, at 13:21, Steve Dondley wrote:


value of "1". By the way, anyone know of a CLI utility for extracting
the original spam email from these files?


spamassassin -d < wrappedspam.eml


Ah, ok. I was familiar with the -d option but did not know it could be 
used to redirect to output like this:


spamassassin -d < filtred_email > orig_email

I tried it and it did what I needed. Thanks.


Re: Is pyzor recommended by folks on this list?

2021-04-11 Thread Steve Dondley




value of "1". By the way, anyone know of a CLI utility for extracting
the original spam email from these files?


Here's a very crude perl script that does the trick:

#!/usr/bin/perl

use strict;
use warnings;

my $email;
while (<>) {
  $email .= $_;
}

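# Grab the MIME boundary from the wrapper's Content-Type header.
my ($boundary) = $email =~ /boundary="(.*?)"/;
die "No MIME boundary found\n" unless defined $boundary;

# The original message is the second MIME part: everything between the
# second boundary line and the closing boundary line (this still includes
# the part's own MIME headers). \Q...\E keeps characters such as "." in
# the boundary from being treated as regex metacharacters.
my ($orig_content) = $email =~
  /^--\Q$boundary\E.*?^--\Q$boundary\E(.*)^--\Q$boundary\E--/ms;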


print $orig_content;

You would use it like this:

./spam_extractor.pl < email_file_with_encapsulated_spam
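
A less fragile variant of the same idea is to let a MIME parser find the attachment instead of a regex. The sketch below uses Python's standard email package and assumes the wrapper carries the original as a message/rfc822 part, which is what report_safe 1 produces. The function name and the sample message are made up. "spamassassin -d" remains the supported way to do this.

```python
from email import message_from_string, policy
from typing import Optional


def extract_original(wrapped: str) -> Optional[str]:
    """Return the first encapsulated message/rfc822 part as text,
    or None if the message has no such part."""
    msg = message_from_string(wrapped, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element
            # list holding the encapsulated message object.
            return part.get_payload(0).as_string()
    return None


# Made-up wrapper, shaped like an SA report_safe 1 message.
WRAPPED = (
    "Subject: *SPAM* wrapper\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Spam detection software report\n"
    "--XYZ\n"
    "Content-Type: message/rfc822\n"
    "\n"
    "From: sender@example.com\n"
    "Subject: Cheap watches\n"
    "\n"
    "Buy now\n"
    "--XYZ--\n"
)

print(extract_original(WRAPPED))
```

Unlike the regex version, this drops the attachment's own MIME part headers and returns only the original message.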



Re: Is pyzor recommended by folks on this list?

2021-04-11 Thread Steve Dondley

On 2021-04-11 09:34 AM, Benny Pedersen wrote:

On 2021-04-11 15:13, Steve Dondley wrote:


What do you think?


pyzor is useful if running pyzord locally; the design of pyzor was, imho,
meant to be a local pyzord with the pyzor client querying it locally, but
pyzord could get results from other pyzord server farms,


Interesting. I wonder if it might be worth it to set up my own pyzor 
server for my own network of mail servers. That's probably going to be 
easier than sharing spam/ham samples around between users.


Is pyzor recommended by folks on this list?

2021-04-11 Thread Steve Dondley
I just installed pyzor and did a random spot check of about 10 spam 
emails to try to evaluate it using this command:


pyzor check < some_spam

Only one message gave me a hit on pyzor.

But I take my results with a grain of salt because I may not have pyzor 
configured optimally.


For one, I'm using the public pyzor server. Maybe there are other more 
useful servers?


Second, I'm not sure if my tests will work on my spam samples which have 
the spam encapsulated with the "report_safe" setting set to a value of 
"1". By the way, anyone know of a CLI utility for extracting the 
original spam email from these files?


So before I explore pyzor any further, I'm wondering if the default 
rules built into SA are good enough or if pyzor improves the accuracy of 
SA enough to be worth the extra cycles to install it and keep it 
functional.


What do you think?









Re: Spamassassin reporting IP address is whitelisted by DNSWL.org but DNSWL.org reports it is not

2021-04-10 Thread Steve Dondley

On 2021-04-10 03:20 PM, Bill Cole wrote:

On 10 Apr 2021, at 14:53, Steve Dondley wrote:

I'm very, very sorry to beat a dead horse, but I'm deeply confused by 
the "RCVD_IN_DNSWL_HI" rule which appears to be reporting incorrectly 
on my system.


STOP USING ANY PUBLIC DNS RESOLVERS WITH ANY MAIL SERVERS!


For the record, my nameserver setting in /etc/resolv.conf was some local 
IP address which presumably used an Amazon Web Service (AWS) DNS server.


After changing the IP address to 127.0.0.1 in that file, it changed 
itself back to the original IP address after some short period of time. 
To fix this, follow the appropriate instructions here: 
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-static-dns-ubuntu-debian/
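
Since the file silently reverted on me, here is a small sketch for checking which nameservers resolv.conf currently lists and whether any of them is not a loopback address. The function names and the sample 172.31.0.2 address are made up. A non-loopback entry is not necessarily wrong (it could be your own resolver elsewhere on the LAN), so treat the output as a prompt to look, not as a verdict.

```python
import ipaddress


def nameservers(resolv_conf_text: str) -> list:
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":  # blank line or comment
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def non_loopback(servers: list) -> list:
    """Return the servers that are not loopback addresses."""
    return [s for s in servers if not ipaddress.ip_address(s).is_loopback]


# Made-up file contents for illustration.
SAMPLE = """\
# generated by the DHCP client
nameserver 127.0.0.1
nameserver 172.31.0.2
search example.internal
"""

print(non_loopback(nameservers(SAMPLE)))
```

To check the live file, pass it the contents of /etc/resolv.conf instead of SAMPLE.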


Spamassassin reporting IP address is whitelisted by DNSWL.org but DNSWL.org reports it is not

2021-04-10 Thread Steve Dondley
I'm very, very sorry to beat a dead horse, but I'm deeply confused by 
the "RCVD_IN_DNSWL_HI" rule which appears to be reporting incorrectly on 
my system.


I ran this command:

sudo -u s -- spamassassin -t -d < some_email

It gives me this report:

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.2 URIBL_ABUSE_SURBL      Contains an URL listed in the ABUSE SURBL
                            blocklist
                            [URIs: bizgrouplinknews.com]
 1.7 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: bizgrouplinknews.com]
 2.5 URIBL_DBL_SPAM         Contains a spam URL listed in the Spamhaus DBL
                            blocklist
                            [URIs: bizgrouplinknews.com]
 0.0 RCVD_IN_MSPIKE_L5      RBL: Very bad reputation (-5)
                            [50.30.46.135 listed in bl.mailspike.net]
-2.0 RCVD_IN_DNSWL_HI       RBL: Sender listed at https://www.dnswl.org/,
                            high trust
                            [50.30.46.135 listed in list.dnswl.org]
 0.5 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                            [Blocked - see ]
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record
 2.6 DEAR_FRIEND            BODY: Dear Friend? That's not very dear!
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 HTTPS_HTTP_MISMATCH    BODY: No description available.
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK
                            signature
 0.0 RCVD_IN_MSPIKE_BL      Mailspike blacklisted
 3.5 URI_PHP_REDIR          PHP redirect to different URL (link obfuscation)



So it's showing the IP address 50.30.46.135 is whitelisted as shown by 
the RCVD_IN_DNSWL_HI rule.


However, the dnswl.org domain shows that the 50.30.46.135 is *not* 
whitelisted: https://www.dnswl.org/s/?s=50.30.46.135


So what would account for my system reporting it as whitelisted when the 
dnswl.org domain does not report it as whitelisted?


Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-10 Thread Steve Dondley




You should fix URIBL_BLOCKED first.
You need a local, caching, non-forwarding DNS server for SpamAssassin.


Yeah, setting up a DNS server for SA is on my todo list. Thanks.

When you say local, it doesn't have to be on the same machine as 
spamassassin, does it? I assume I can have the DNS server on a local 
network and shared between many machines.


Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-10 Thread Steve Dondley




It would be helpful to post an entire actual set of headers --
unmodified -- along with the spamassassin -t report.  I can't figure
out (from what you posted) the IP address of the server that was in
DNSWL_HI that delivered mail to your internal/trusted network.


OK, here is the entire output of this command:

sudo -u s -- spamassassin -t -d < the_spam_email

Note: I've changed the score of RCVD_IN_DNSWL_HI hits to -2.0 from -5.0 
until I get my misconfiguration figured out. Thanks for your patience.





Received: from localhost by email.dondley.com
with SpamAssassin (version 3.4.2);
Sat, 10 Apr 2021 12:41:17 -0400
From: 
=?shift_jis?B?kmqCzI/bkqWKZ5HljHaJ5iBBaXAxMA==?=

To: 
Subject: *SPAM* 
=?shift_jis?B?g0mDk4NpgqqLgYLfgumXQojqlrOT8YLMgZqDZoNKg2CDk4GagvCBSTA5?=

Date: Sat, 10 Apr 2021 18:50:01 +0900
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 
email.dondley.com

X-Spam-Flag: YES
X-Spam-Level: ***
X-Spam-Status: Yes, score=23.2 required=5.0 tests=BASE64_LENGTH_79_INF,
BAYES_99,BAYES_999,DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,FREEMAIL_REPLYTO,
FREEMAIL_REPLYTO_END_DIGIT,FROM_MISSP_FREEMAIL,FROM_MISSP_REPLYTO,
LOCAL_SPAM_TLD,LOCAL_UNCOMMON_TLD,MISSING_MID,NML_ADSP_CUSTOM_MED,
RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_DNSWL_HI,RCVD_IN_MSPIKE_H2,RCVD_IN_PSBL,
RCVD_IN_RP_RNBL,RCVD_IN_VALIDITY_RPBL,RDNS_NONE,SPF_HELO_SOFTFAIL,
SPF_SOFTFAIL,SPOOFED_FREEMAIL,SPOOFED_FREEMAIL_NO_RDNS,
SPOOFED_FREEM_REPTO,TVD_SPACE_ENCODED shortcircuit=no autolearn=no
autolearn_force=no version=3.4.2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="--=_6071D52D.C7B255FE"

This is a multi-part message in MIME format.

=_6071D52D.C7B255FE
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Spam detection software, running on the system "email.dondley.com",
has identified this incoming email as possible spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  
@„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª{„ª
   @@@@@@@™‹ÆŠEÅ‚‚̃{ƒ‹ƒ‚ƒ“¬’·Œø‰Ê™ 
@@@@@@@@@@šƒyƒjƒX‘‘åƒTƒvƒŠš



Content analysis details:   (23.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-2.0 RCVD_IN_DNSWL_HI       RBL: Sender listed at https://www.dnswl.org/,
                            high trust
                            [203.160.71.180 listed in list.dnswl.org]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [203.160.71.180 listed in wl.mailspike.net]
 2.7 RCVD_IN_PSBL           RBL: Received via a relay in PSBL
                            [203.160.71.180 listed in psbl.surriel.com]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.5 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
 2.0 LOCAL_SPAM_TLD         Domain originates a lot of spam
 1.0 LOCAL_UNCOMMON_TLD     From address is not a common TLD
 1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                            [Blocked - see ]
 1.3 RCVD_IN_VALIDITY_RPBL  RBL: Relay in Validity RPBL,
                            https://senderscore.org/blocklistlookup/
                            [203.160.71.180 listed in bl.score.senderscore.com]
 0.0 FREEMAIL_FROM          Sender email is commonly abused enduser mail
                            provider (qy5cbma-yua06[at]yahoo.co.jp)
 0.2 FREEMAIL_REPLYTO_END_DIGIT Reply-To freemail username ends in
                            digit (qy5cbma-yua06[at]yahoo.co.jp)
 0.7 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
 0.0 DKIM_ADSP_CUSTOM_MED   No valid author signature, adsp_override is
                            CUSTOM_MED
 0.7 SPF_HELO_SOFTFAIL      SPF: HELO does not match SPF record (softfail)
 1.5 BASE64_LENGTH_79_INF   BODY: base64 encoded email part uses line
                            length greater than 79 characters
 0.5 MISSING_MID            Missing Message-Id: header
 0.0 RCVD_IN_RP_RNBL        RCVD_IN_RP_RNBL renamed to
                            RCVD_IN_VALIDITY_RPBL, please update local
                            rules
 0.8 RDNS_NONE              Delivered to internal network by a host with
                            no rDNS
 1.0 FREEMAIL_REPLYTO       Reply-To/From or Reply-To/body contain
                            different freemails
 0.9 NML_ADSP_CUSTOM_MED    ADSP custom_med hit, and not from a mailing
                            list
 0.0 FROM_MISSP_REPLYTO     From misspaced, has Reply-To
 2.5 TVD_SPACE_ENCODED 

Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-10 Thread Steve Dondley

On 2021-04-10 12:10 PM, Greg Troxel wrote:

Steve Dondley  writes:


Here are the headers from some egregious spam. It scored a whopping
20.8 points despite being flagged with "RCVD_IN_DNSWL_HI."

Return-Path: 
Delivered-To: s...@example.com
Received: from email.example.com
by email.example.com with LMTP
id AnV2NSCZbmCTcQAAB604Gw
(envelope-from )
for ; Thu, 08 Apr 2021 01:48:16 -0400


really?  Those are the headers?


Yes. Why do you ask? Is it unusual that this egregious example of spam 
is on DNSWL_HI?




So my advice again is:

  Run spamassassin -t on the message so you see the metadata about the
  rules like which IP hit and the per-rule score.


I've already done that on selective email messages.


  If you got spam from a sender in DNSWL_HI, report it to dnswl.org.
  Give them a week and see if they take the IP out, or what happens, 
and

  tell us how it went.


I plan on it but first:

1) I want to verify with this list I don't have something misconfigured 
before I report 300+ emails. From what I've read in the emails last 
week, this would be highly unusual.


2) If I do have that many false positives, I need to figure out how to 
bulk report that many of them.


Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-10 Thread Steve Dondley




I have been looking at this issue a little more. I just grepped my
spam folder. Out of 1000 emails I have flagged as spam, 321 have been
flagged with RCVD_IN_DNSWL_HI, a rule which adds -5 points to the email.
That's almost 1 out of 3 emails, which seems pretty insane.


Here are the headers from some egregious spam. It scored a whopping 20.8 
points despite being flagged with "RCVD_IN_DNSWL_HI."


Return-Path: 
Delivered-To: s...@example.com
Received: from email.example.com
by email.example.com with LMTP
id AnV2NSCZbmCTcQAAB604Gw
(envelope-from )
for ; Thu, 08 Apr 2021 01:48:16 -0400
Received: by email.example.com (Postfix, from userid 115)
id CDD3D210E1; Thu,  8 Apr 2021 01:48:16 -0400 (EDT)
Received: from localhost by email.example.com
with SpamAssassin (version 3.4.2);
Thu, 08 Apr 2021 01:48:16 -0400
From: 
=?shift_jis?B?i9aSZoLMl6BEVkSP7pXxIEFpcDA4jYY=?=

To: 
Subject: *SPAM* 
=?shift_jis?B?lrOPQ5Czi8mU6Zesj2+DVoOKgVuDWYFFg4KDVYNDg06UaonzgUWXTJa8QVaPl5dEgXmXoIOCg22JroF6MDc=?=

Date: Thu, 08 Apr 2021 14:48:09 +0900
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 
email.example.com

X-Spam-Flag: YES
X-Spam-Level: 
X-Spam-Status: Yes, score=20.8 required=5.0 tests=BASE64_LENGTH_79_INF,
DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,FREEMAIL_REPLYTO,
FREEMAIL_REPLYTO_END_DIGIT,FROM_MISSP_FREEMAIL,FROM_MISSP_REPLYTO,
MISSING_MID,NML_ADSP_CUSTOM_MED,RCVD_IN_BL_SPAMCOP_NET,
RCVD_IN_DNSWL_HI,RCVD_IN_PSBL,RCVD_IN_RP_RNBL,RCVD_IN_SBL_CSS,
RCVD_IN_VALIDITY_RPBL,RCVD_IN_XBL,RDNS_NONE,SPF_HELO_SOFTFAIL,
SPF_SOFTFAIL,SPOOFED_FREEMAIL,SPOOFED_FREEMAIL_NO_RDNS,
SPOOFED_FREEM_REPTO,TVD_SPACE_ENCODED,URIBL_ABUSE_SURBL,URIBL_BLOCKED
shortcircuit=no autolearn=unavailable autolearn_force=no version=3.4.2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="--=_606E9920.15B94EAE"
Message-Id: <20210408054816.cdd3d21...@email.example.com>


Re: DNSWL overriding bayes_99 and bayes_999 rules

2021-04-10 Thread Steve Dondley

On 2021-04-06 11:48 AM, Steve Dondley wrote:

I have emails that have been flagged as spam in the past but that are
still getting through, presumably because the servers are on some
DNSWL.

Example:

X-Spam-Status: No, score=0.9 required=5.0 tests=BAYES_99,BAYES_999,
DATE_IN_PAST_03_06,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,

HTML_IMAGE_RATIO_02,HTML_MESSAGE,RCVD_IN_DNSWL_HI,RCVD_IN_MSPIKE_H2,

SPF_HELO_NONE,SPF_SOFTFAIL shortcircuit=no autolearn=no
autolearn_force=no version=3.4.2

What's the recommended way to handle these? Do I turn on shortcircuit?
Do I bump up the score for BAYES_99, BAYES_999? Or might there be a
way to ignore DNSWL scores if they have a high bayes score?


I have been looking at this issue a little more. I just grepped my spam 
folder. Out of 1000 emails I have flagged as spam, 321 have been flagged 
with RCVD_IN_DNSWL_HI, a rule which adds -5 points to the email. That's 
almost 1 out of 3 emails, which seems pretty insane.


Is anyone else seeing spam getting flagged with RCVD_IN_DNSWL_HI resulting 
in so many false positives?
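
For anyone who wants to reproduce the tally on their own spam folder, here is a rough sketch of the same count done by parsing the X-Spam-Status header rather than grepping. This avoids matching the rule name in a message body and copes with the header being folded across lines. The function names and sample messages are made up, and it assumes the header has the usual "tests=... shortcircuit=..." layout shown in the example above.

```python
import re
from email import message_from_string

# Capture everything after "tests=" up to the next " key=" field.
TESTS = re.compile(r"tests=(.*?)(?:\s+\w+=|$)", re.DOTALL)


def rules_hit(raw_message: str) -> set:
    """Return the set of rule names listed in X-Spam-Status."""
    status = message_from_string(raw_message).get("X-Spam-Status", "")
    match = TESTS.search(status)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def share_with_rule(messages, rule: str) -> float:
    """Fraction of messages whose X-Spam-Status lists the given rule."""
    messages = list(messages)
    if not messages:
        return 0.0
    hits = sum(1 for raw in messages if rule in rules_hit(raw))
    return hits / len(messages)


# Two made-up messages; the first is shaped like the example above.
spam_a = (
    "X-Spam-Status: No, score=0.9 required=5.0 tests=BAYES_99,BAYES_999,\n"
    "\tDKIM_SIGNED,RCVD_IN_DNSWL_HI,SPF_SOFTFAIL shortcircuit=no autolearn=no\n"
    "\tautolearn_force=no version=3.4.2\n"
    "Subject: a\n\nbody\n"
)
spam_b = (
    "X-Spam-Status: Yes, score=7.1 required=5.0 tests=BAYES_99,URIBL_BLACK\n"
    "\tshortcircuit=no autolearn=no version=3.4.2\n"
    "Subject: b\n\nRCVD_IN_DNSWL_HI mentioned only in the body\n"
)

print(share_with_rule([spam_a, spam_b], "RCVD_IN_DNSWL_HI"))
```

With my numbers, 321 hits out of 1000 messages works out to 32.1%.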


Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley

It can only do so if report_safe is set to 0. With non-zero
report_safe settings, the original mail is encapsulated as an
attachment inside a wrapper message also including the report. That
wrapper message containing the SA report is "safe" because it is fully
local, the text/plain part won't look like spam to any spam filter,
and the original, encapsulated as a message/rfc822 attachment, should
be skipped by any filter. If you want to test the *original* message,
you have to extract the message/rfc822 part into its own file and test
that.


OK, did some more googling on this. Let me spell this out to help clear 
things up for those who may be as confused as I was:


1) sa-learn *will* "unwrap" the original encapsulated spam emails when 
they are encapsulated by SA: 
https://cwiki.apache.org/confluence/display/SPAMASSASSIN/LearningMarkedUpMessages
2) However, the spamassassin command (or spamc/spamd) does not do this 
for you. You must use the -d option to remove any spam markup.


What this means is that if report_safe is set to "1" (the default) in 
your SA config file, you must pull the original spam email out with the 
-d option if you wish to run it through spamassassin/spamc again. You do 
*not* have to worry about doing this with the sa-learn command.


If I got this wrong, let me know. Thanks.



Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley

On 2021-04-06 04:19 PM, Steve Dondley wrote:

It seems to have done so. Thank you.

Some MUAs have a "Reply to List" function that uses the List-Post
header (and sometimes heuristics when that header is missing) to send
replies only to a list itself.


I've recently switched to Roundcube from gmail. I didn't see that
option but I think I've figured out I just need to hit "reply". Thanks
for pointing out you were getting dupes.



It can only do so if report_safe is set to 0. With non-zero
report_safe settings, the original mail is encapsulated as an
attachment inside a wrapper message also including the report. That
wrapper message containing the SA report is "safe" because it is fully
local, the text/plain part won't look like spam to any spam filter,
and the original, encapsulated as a message/rfc822 attachment, should
be skipped by any filter. If you want to test the *original* message,
you have to extract the message/rfc822 part into its own file and test
that.


OK, so that's the problem, I guess. That config option is commented
out in my local.cf file:

# report_safe 1


I should read the documentation before asking questions. So '1' is the 
default which encapsulates the original spam as an attachment.


Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley




Some MUAs have a "Reply to List" function that uses the List-Post
header (and sometimes heuristics when that header is missing) to send
replies only to a list itself.


Ah! I see that option now under the little down arrow next to "Reply 
all". My day is made. Thanks!


Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley




It seems to have done so. Thank you.

Some MUAs have a "Reply to List" function that uses the List-Post
header (and sometimes heuristics when that header is missing) to send
replies only to a list itself.


I've recently switched to Roundcube from gmail. I didn't see that option 
but I think I've figured out I just need to hit "reply". Thanks for 
pointing out you were getting dupes.




It can only do so if report_safe is set to 0. With non-zero
report_safe settings, the original mail is encapsulated as an
attachment inside a wrapper message also including the report. That
wrapper message containing the SA report is "safe" because it is fully
local, the text/plain part won't look like spam to any spam filter,
and the original, encapsulated as a message/rfc822 attachment, should
be skipped by any filter. If you want to test the *original* message,
you have to extract the message/rfc822 part into its own file and test
that.


OK, so that's the problem, I guess. That config option is commented out 
in my local.cf file:


# report_safe 1

So what do you recommend setting this to? '1'? Any downsides to that? I'm 
just a little leery of changing a default setting. But I'll do whatever 
the pros suggest.


The documentation says a value of '2' means "use text/plain instead", but I 
don't know what that is referring to.





Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley

On 2021-04-06 02:55 PM, Steve Dondley wrote:

On 2021-04-06 02:32 PM, Bill Cole wrote:

PLEASE NOTE:

I read the mailing list obsessively and DO NOT NEED (or want) the
extra copies sent when you send both to me and to the list.


Sorry, I still haven't figured out how to properly respond. When I hit
"reply all" it cc's the list and sends to you. When I hit just "reply"
it only sends to you. I've manually deleted you from the "To" box and
sending it directly to the list here. Hopefully that fixes things up.


Since the scores being added during delivery are much richer,
detecting enough info to do SPF and DKIM analysis, I am 99.9% certain
that the format of 'some_email' is mangled, probably missing critical
headers or using CR linebreaks instead of proper LFs.




I just noticed the date in the email header was from about a week ago.


Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley

On 2021-04-06 02:32 PM, Bill Cole wrote:

PLEASE NOTE:

I read the mailing list obsessively and DO NOT NEED (or want) the
extra copies sent when you send both to me and to the list.


Sorry, I still haven't figured out how to properly respond. When I hit 
"reply all" it cc's the list and sends to you. When I hit just "reply" 
it only sends to you. I've manually deleted you from the "To" box and 
sending it directly to the list here. Hopefully that fixes things up.



Since the scores being added during delivery are much richer,
detecting enough info to do SPF and DKIM analysis, I am 99.9% certain
that the format of 'some_email' is mangled, probably missing critical
headers or using CR linebreaks instead of proper LFs.


Hmm, this is on a linux box, so I'm not sure how it could be screwing up 
the line breaks. Is it possible that when spamd injects the scores 
before the body of the email, it is screwing things up?


Here is the email as it sits in my inbox now, which is after it gets 
processed by spamd. I was under the impression that an email that had 
already been processed by SA could be processed again and it would 
ignore any modifications made by earlier passes through SA.


Return-Path:
Delivered-To: s...@exmaple.com
Received: from email.exmaple.com
    by email.exmaple.com with LMTP
    id kAhSKc1dY2BCKgAAB604Gw
    (envelope-from )
    for ; Tue, 30 Mar 2021 13:20:13 -0400
Received: by email.exmaple.com (Postfix, from userid 115)
    id A64BE200C8; Tue, 30 Mar 2021 13:20:13 -0400 (EDT)
Received: from localhost by email.exmaple.com
    with SpamAssassin (version 3.4.2);
    Tue, 30 Mar 2021 13:20:13 -0400
From: "Home Warranty - AHS"
To:
Subject: *SPAM* It's getting warmer, are you covered?
Date: Tue, 30 Mar 2021 05:18:34 -0700
Message-Id:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on email.exmaple.com
X-Spam-Flag: YES
X-Spam-Level: *
X-Spam-Status: Yes, score=5.2 required=5.0 tests=BAYES_99,BAYES_999,
    DATE_IN_PAST_03_06,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,
    HTML_IMAGE_RATIO_02,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H2,
    SPF_HELO_NONE,SPF_SOFTFAIL shortcircuit=no autolearn=no
    autolearn_force=no version=3.4.2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="--=_60635DCD.A0F5D194"

This is a multi-part message in MIME format.

=_60635DCD.A0F5D194
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Spam detection software, running on the system "email.exmaple.com",
has identified this incoming email as possible spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Your AHS Home Warranty covers the repair or replacement of
   many system and appliance breakdowns, but not necessarily the entire system
   or appliance. Please refer to your contract for details. American Home Shield
   150 Peabody Pl., Memphis, TN 38103. Unsubscribe | Privacy Policy © 2021
   American Home Shield Corporation. All rights reserved.

Content analysis details:   (5.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.2 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.7 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
-0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at https://www.dnswl.org/,
                            low trust
                            [69.252.207.38 listed in list.dnswl.org]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [69.252.207.38 listed in wl.mailspike.net]
 1.6 DATE_IN_PAST_03_06     Date: is 3 to 6 hours before Received: date
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record
 0.0 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image
                            area
 0.0 HTML_MESSAGE           BODY: HTML included in message
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid

The original message was not completely plain text, and may be unsafe to
open with some email clients; in particular, it may contain a virus,
or confirm that your address can receive spam.  If you wish to view
it, it may be safer to save it to a file and open it with an editor.


=_60635DCD.A0F5D194
Content-Type: message/rfc822; x-spam-type=original
Content-D

Re: Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley




Can you provide a working example message AND the operative user prefs?


OK, I was being very stupid. It finally dawned on me that the SA scores 
that appeared above the message body and below the headers when spamc 
was run without the -R option were SA scores embedded in the message by 
the postfix software and were not getting generated by spamc.


But that doesn't change the fact that the SpamAssassin score generated 
during Postfix delivery is different from what I'm getting directly on 
the command line. Here is what is in my Postfix master.cf file:


spamassassin unix - n   n   -   -   pipe
 user=debian-spamd argv=/usr/bin/spamc -u ${user} -e 
/usr/sbin/sendmail -oi -f ${sender} ${recipient}





spamassassin --prefs-file user_prefs_file -D all < some_email

Does the score and hits match one of your spamc tests?


No. The headers have a different score and the tests are different. It's 
scored only 2.6 with BAYES_50, while what was embedded in the email by 
Postfix had BAYES_99 and BAYES_999 and scored 5.2. The Postfix score also 
shows RCVD_IN_DNSWL_LOW, while running from the command line does not 
show any such test hit.


And I cannot reproduce the SA scores embedded in the email by postfix 
even if I log in as user "s" and run this command:


spamassassin --prefs-file=/home/s/.spamassassin/user_prefs  -t < 
some_email


So I'm not sure what's going on.
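
One way to see exactly where two runs diverge is to compare the sets of rule names rather than the totals. The sketch below parses two X-Spam-Status values and prints the rules unique to each. The first value is the delivery-time header quoted in this thread. The second is a hypothetical header reconstructed from the 2.6-point report, since the command-line run printed a report rather than a header. The function name parse_spam_status is an illustration, not part of SpamAssassin.

```python
import re


def parse_spam_status(value):
    """Split an X-Spam-Status header value into (score, set_of_tests)."""
    flat = " ".join(value.split())  # undo header folding
    score = float(re.search(r"score=(-?\d+(?:\.\d+)?)", flat).group(1))
    # The tests list ends where the next "key=value" field begins.
    tests = re.search(r"tests=(.*?)(?=\s+\w+=|$)", flat).group(1)
    return score, {t.strip() for t in tests.split(",") if t.strip()}


delivered = """Yes, score=5.2 required=5.0 tests=BAYES_99,BAYES_999,
 DATE_IN_PAST_03_06,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,
 HTML_IMAGE_RATIO_02,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H2,
 SPF_HELO_NONE,SPF_SOFTFAIL shortcircuit=no autolearn=no
 autolearn_force=no version=3.4.2"""

# Hypothetical header, rebuilt from the rule names in the 2.6-point report.
command_line = """No, score=2.6 required=5.0 tests=BAYES_50,
 DATE_IN_PAST_03_06,HEADER_FROM_DIFFERENT_DOMAINS,HTML_IMAGE_RATIO_02,
 HTML_MESSAGE,NO_RELAYS autolearn=no version=3.4.2"""

score_a, tests_a = parse_spam_status(delivered)
score_b, tests_b = parse_spam_status(command_line)
only_delivery = sorted(tests_a - tests_b)
only_command_line = sorted(tests_b - tests_a)
print("only at delivery:    ", only_delivery)
print("only on command line:", only_command_line)
```

Rules that appear on only one side (here the DKIM, SPF and DNSWL checks versus NO_RELAYS) point at what differs between the two inputs, not just at how much the totals differ.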


Getting different SA scores when using -R argument with spamc

2021-04-06 Thread Steve Dondley

When I run spamc without -R option like this:

spamc -u some_user  < some_email

I get the following output:





This is a multi-part message in MIME format.




Content analysis details:   (5.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.2 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.]
 0.7 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
-0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at https://www.dnswl.org/,
                            low trust
                            [69.252.207.38 listed in list.dnswl.org]
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [69.252.207.38 listed in wl.mailspike.net]
 1.6 DATE_IN_PAST_03_06     Date: is 3 to 6 hours before Received: date
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record
 0.0 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image
                            area
 0.0 HTML_MESSAGE           BODY: HTML included in message
-0.1 DKIM_VALID_AU          Message has a valid DKIM or DK signature from
                            author's domain
-0.1 DKIM_VALID             Message has at least one valid DKIM or DK signature
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not necessarily
                            valid




===



However, when I run this command on the same email with the -R option 
to get the SA scores only like this:


spamc -R -u some_user  < some_email


I get this output:


===

2.6/5.0
Spam detection software, running on the system "email.dondley.com",
has NOT identified this incoming email as spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Spam detection software, running on the system "email.dondley.com",
   has identified this incoming email as possible spam. The original message
   has been attached to this so you can view it or label simi [...]

Content analysis details:   (2.6 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.8 BAYES_50               BODY: Bayes spam probability is 40 to 60%
                            [score: 0.5000]
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
 0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level
                            mail domains are different
 1.6 DATE_IN_PAST_03_06     Date: is 3 to 6 hours before Received: date
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 HTML_IMAGE_RATIO_02    BODY: HTML has a low ratio of text to image
                            area





Notice the scores are totally different. According to the man page, -R does this:

Just output the SpamAssassin report text to stdout, for all messages.  
See -r for details of the output format used.


So why are the scores different with and without the -R option?
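
For scripted comparisons, the first line of the -R output ("2.6/5.0" above) can be read as score/threshold. A small sketch, assuming the output always contains one line of that form; the function names are made up for illustration.

```python
import re

# A line consisting only of "<score>/<threshold>", e.g. "2.6/5.0".
_SCORE_LINE = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE
)


def parse_report_score(report_text):
    """Return (score, threshold) from spamc -R style report text."""
    match = _SCORE_LINE.search(report_text)
    if match is None:
        raise ValueError("no score/threshold line found")
    return float(match.group(1)), float(match.group(2))


def is_spam(report_text):
    """True when the score reaches the required threshold."""
    score, threshold = parse_report_score(report_text)
    return score >= threshold
```

Applied to the output above, parse_report_score gives 2.6 and 5.0, so is_spam is false for this run even though the delivered copy was tagged at 5.2.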


DNSWL overriding bayes_99 and bayes_999 rules

2021-04-06 Thread Steve Dondley
I have emails that have been flagged as spam in the past but that are 
still getting through, presumably because the servers are on some DNSWL.


Example:

X-Spam-Status: No, score=0.9 required=5.0 tests=BAYES_99,BAYES_999,
DATE_IN_PAST_03_06,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,
HTML_IMAGE_RATIO_02,HTML_MESSAGE,RCVD_IN_DNSWL_HI,RCVD_IN_MSPIKE_H2,
SPF_HELO_NONE,SPF_SOFTFAIL shortcircuit=no autolearn=no
autolearn_force=no version=3.4.2

What's the recommended way to handle these? Do I turn on shortcircuit? 
Do I bump up the score for BAYES_99, BAYES_999? Or might there be a way 
to ignore DNSWL scores if they have a high bayes score?
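
To put a number on how much the whitelist hit is worth here, the per-rule scores from the 5.2-point report of a near-identical message (same rules, but RCVD_IN_DNSWL_LOW instead of RCVD_IN_DNSWL_HI) can be summed and compared with the 0.9 total above. This is only arithmetic on figures quoted on this page, under the assumption that every other rule scored the same in both messages.

```python
# Per-rule scores as listed in the 5.2-point report.
report_scores = {
    "BAYES_999": 0.2,
    "BAYES_99": 3.5,
    "SPF_SOFTFAIL": 0.7,
    "RCVD_IN_DNSWL_LOW": -0.7,
    "RCVD_IN_MSPIKE_H2": -0.0,
    "DATE_IN_PAST_03_06": 1.6,
    "SPF_HELO_NONE": 0.0,
    "HTML_IMAGE_RATIO_02": 0.0,
    "HTML_MESSAGE": 0.0,
    "DKIM_VALID_AU": -0.1,
    "DKIM_VALID": -0.1,
    "DKIM_SIGNED": 0.1,
}

total = round(sum(report_scores.values()), 1)

# Assumption: the 0.9-point message differs only in the DNSWL rule.
without_dnswl = total - report_scores["RCVD_IN_DNSWL_LOW"]
implied_dnswl_hi = round(0.9 - without_dnswl, 1)

print("total with DNSWL_LOW:", total)
print("total without any DNSWL rule:", round(without_dnswl, 1))
print("implied RCVD_IN_DNSWL_HI score:", implied_dnswl_hi)
```

Under that assumption the whitelist rule alone outweighs both Bayes rules combined (3.7 points), which is why the message lands under the 5.0 threshold.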


What makes this email spam and how do I train myself to find markers for spam so I can train spamassassin properly?

2021-03-28 Thread Steve Dondley

The email below slipped through my spam filter.

It has malicious content attached which purports to be a voicemail from 
comcast (I've snipped the attachment from the example) but it is 
actually a phishing attack. The attachment contains a link that goes to 
a web page at an obscure domain that prompts you to log into your 
comcast account.


As you can see by the headers, this email was well-trusted by SA with a 
score of -2.7.


I don't think I can rely much on bayes filtering for these kinds of 
emails since the body has so little text (or do I make a bad assumption 
here?). And to my untrained eye, the only thing that looks suspicious is 
line 40 which says: "smtprelay.hostedemail.com".


So what's the giveaway that this is spam and what rule can I add to get 
SA to recognize it as such? And what is the best way for me to learn how 
to analyze the headers so I can recognize spam myself? Any good 
tutorials for this?




  1 Return-Path: 
  2 Delivered-To: catch...@example.org
  3 Received: from email.example.org
  4 by email.example.org with LMTP
  5 id EkqVDIVdYGCceQAAW5pcLQ
  6 (envelope-from 
)

  7 for ; Sun, 28 Mar 2021 06:42:13 -0400
  8 Received: by email.example.org (Postfix, from userid 115)
  9 id 2489422533; Sun, 28 Mar 2021 06:42:13 -0400 (EDT)
 10 Authentication-Results: email.example.org;
 11 dkim=pass (2048-bit key; secure) header.d=comcast.net 
header.i=@comcast.net header.b="PSvQlJTc";

 12 dkim-atps=neutral
 13 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 
email.example.org

 14 X-Spam-Level:
 15 X-Spam-Status: No, score=-2.7 required=4.0 
tests=BAYES_50,DKIM_SIGNED,
 16 
DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,INVALID_MSGID,

 17 MSGID_FROM_MTA_HEADER,OBFU_TEXT_ATTACH,RCVD_IN_DNSWL_HI,
 18 RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS 
autolearn=unavailable

 19 autolearn_force=no version=3.4.2
 20 Received-SPF: Pass (mailfrom) identity=mailfrom; 
client-ip=96.114.154.164; helo=resqmta-po-05v.sys.comcast.net; 
envelope-from=x-flnltycomcastvoicemail_ref.no01...@comcast.net; 
receiver=
 21 Received: from resqmta-po-05v.sys.comcast.net 
(resqmta-po-05v.sys.comcast.net [96.114.154.164])
 22 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits))

 23 (No client certificate requested)
 24 by email.example.org (Postfix) with ESMTPS id F22E6215BD
 25 for ; Sun, 28 Mar 2021 06:42:11 -0400 
(EDT)

 26 Received: from resimta-po-42v.sys.comcast.net ([96.114.154.212])
 27 by resqmta-po-05v.sys.comcast.net with ESMTP
 28 id QSrxlUJdvoWleQSrxlMdfB; Sun, 28 Mar 2021 10:42:09 +
 29 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
 30 s=20190202a; t=1616928129;
 31 bh=vkwV5ud3feChWZLQsYrnwAqC5q/gOtq5c2+sZwvKGUI=;
 32 
h=Received:Received:Message-ID:Received:Received:From:Subject:To:

 33  Content-Type:MIME-Version:Date;
 34 
b=PSvQlJTcBWsdJnqw5X2ghcFhFC/KDs9orh5uzVOpepDAf2rxUTc3bG03diY25hkLB
 35  
fKraMiHrMsG0UjujPtZPBZ10Wvs+b/pCliySBbDhG4hPak0kJwkoe8INCCabIiNkCc
 36  
8LcCU2x8x5mK0WrbPxGQatIXplKMnAjK7Tr/v27aGvxFxfBjkeDL7DrG6AHNvjtv+P
 37  
N8/WmgYIX2MldH9NM5DFb1OIsENAGdRT2SQnBW+t67wJ9JvIl6D8ZpAXLK0Ra8rrZw
 38  
GbL3gsz49PAoDxAJTuMpWnvmef6J7o/xwV98mMj9s0Dyk3Y+IF2xtoz6CVzDjK/nHy

 39  7YHOQjMWIrXJQ==
 40 Received: from smtprelay.hostedemail.com ([216.40.44.63])
 41 by resimta-po-42v.sys.comcast.net with ESMTP
 42 id QSrwlZX7FX3qEQSrwlyoxt; Sun, 28 Mar 2021 10:42:08 +
 43 X-Xfinity-VAAS: 
gggruggvucftvghtrhhoucdtuddrgeduledrudehiedgfeduucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuvehomhgtrghsthdqtfgvshhinecuuegrihhlohhuthemuceftddunecuogfntfdquehouhhnugdqtfefvdculdehmdenucfjughrpefhuffvtgggffesmhdttdertddttdenucfhrhhomhepfdgiqdfhlhfplhfvjggtohhmtggrshhtvhhoihgtvghmrghilhgprhgvfhdrnhhotddujfffufestghomhgtrghsthdrnhgvthdfuceoigdqhfhlpfhlvfgjtghomhgtrghsthhvohhitggvmhgrihhlpghrvghfrdhnohdtudfjfffusegtohhmtggrshhtrdhnvgh 
   
tqeenucggtffrrghtthgvrhhnpeduvddtkeduleehvdejkeeludfhhffghefhgeegjeefgeejveeiuedtgfeitdelieenucfkphepvdduiedrgedtrdeggedrieefpdeivddrudekvddrleelrdelgeenucevlhhushhtvghrufhiiigvpeefnecurfgrrhgrmhephhgvlhhopehsmhhtphhrvghlrgihrdhhohhsthgvuggvmhgrihhlrdgtohhmpdhinhgvthepvdduiedrgedtrdeggedrieefpdhmrghilhhfrhhomhepgidqfhhlnhhlthihtghomhgtrghsthhvohhitggvmhgrihhlpghrvghfrdhnohdtudhhughssegtohhmtggrshhtrdhnvghtpdhrtghpthhtohepihgsvgifgeehheestghomhgtrg 
   hsthdrnhgvthdprhgtphhtthhopehofhhfihgtvgesihgsvgifgeehhedrohhrgh

 44 X-Xfinity-VMeta: sc=5.00;st=legit
 45 X-Xfinity-Message-Heuristics: IPv6:N;TLS=1;SPF=4;DMARC=F
 46 Message-ID: 
qsrwlzx7fx3qeqsrwlyoxt.1616928128.bcb9cc98f861a2c7a8b119d18ed7fa74.missin...@comcast.net
 47 Received: from omf14.hostedemail.com (clb03-v110.bra.tucows.net 
[216.40.38.60])
 48 by smtprelay03.hostedemail.com (Postfix) with ESMTP id 
03D8F837F24D
 49
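
Received headers are read from the top (the newest hop, added by the receiving server) down to the bottom (the oldest hop). A rough sketch for listing that chain with Python's standard email package follows; the function name received_hops and the sample message are illustrative, with host names copied from the headers above.

```python
import email
import re


def received_hops(raw_message):
    """Return (from_host, by_host) pairs, newest hop first."""
    msg = email.message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        flat = " ".join(value.split())  # undo header folding
        frm = re.match(r"from\s+(\S+)", flat)
        by = re.search(r"\bby\s+(\S+)", flat)
        hops.append((frm.group(1) if frm else None,
                     by.group(1) if by else None))
    return hops


sample = (
    "Received: from resqmta-po-05v.sys.comcast.net ([96.114.154.164])\n"
    "\tby email.example.org (Postfix) with ESMTPS id F22E6215BD\n"
    "Received: from resimta-po-42v.sys.comcast.net ([96.114.154.212])\n"
    "\tby resqmta-po-05v.sys.comcast.net with ESMTP\n"
    "Received: from smtprelay.hostedemail.com ([216.40.44.63])\n"
    "\tby resimta-po-42v.sys.comcast.net with ESMTP\n"
    "Subject: example\n"
    "\n"
    "body\n"
)

for from_host, by_host in received_hops(sample):
    print(f"{from_host} -> {by_host}")
```

Listing the hops this way makes it easier to spot where a message enters a provider's network from outside, such as the smtprelay.hostedemail.com hop on line 40.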

Why no points for SPF_NONE?

2021-03-21 Thread Steve Dondley
I'm learning a bit about spamassassin rules and taking a peek at how my 
inbound mail is scored. I noticed that SPF_NONE scores zero points by 
default. I'm wondering if there is a good reason for not giving it a 
score and whether I should set that to something much higher like 1.0.


I'm curious to know what more experienced people have this set to. 
Thanks.


Re: Workflow for adding new ham/spam to existing site-wide database?

2021-03-16 Thread Steve Dondley
You covered a lot of ground here. Thanks. If you have some spare 
cycles, I have follow-up questions to get an understanding of how you 
process your email:



21 seconds at that includes fetch the samples via imap from two
folders, fire them against a bayes-only spamassasin instance,


What is a "bayes-only" instance? I don't follow. What other kinds of 
instances are there?



ignore BAYES_00/BAYES_99 messages, move the rest to both training
folders, anonymize them, strip useless headers, fire sa-learn against


OK, so it looks like you are suggesting that emails get kind of 
pre-screened to determine if they are obvious spam or not.


And by anonymize, what do you mean? Remove the headers that contain 
email addresses? What other headers are useless? What exactly is the 
goal of anonymizing and removing the headers? I think I have a vague 
idea why but can't quite crystallize it in my head.



both folders, fire bogofilter training against both folders and verify
that the new sample files score with BAYES_99/BAYES_00 now


bogofilter training?

So the goal is to get all the new emails to score either 99 (spam) or 00 
(ham).


So once I verify they score 00 or 99, do I then throw them on the larger 
collection of ham/spam with all headers restored? And what do I do if 
they still don't score 00 or 99?





Workflow for adding new ham/spam to existing site-wide database?

2021-03-16 Thread Steve Dondley
I have been accumulating spam/ham samples and sorting them out into 
different directories on my server. As new spam/ham comes in, I throw it 
into the existing pile and then run "sa-learn --spam|--ham" on the whole 
pile.


It dawned on me that this will get very slow as I eventually collect 
tens of thousands of emails. So I'm wondering if it's better to:


1) Place all new, incoming spam/ham into empty directories
2) Run sa-learn only on these directories with small samples
3) Once done, move these new emails to an archive of spam/ham samples
4) Repeat

Is this typically how it's done?
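
The four steps above can be sketched as a small script. This is only an illustration of the proposed workflow, not an established recipe: the directory layout and the function names are assumptions, and it assumes file names in the incoming directory do not collide with names already in the archive. The sa-learn call is isolated in run_sa_learn so it can be swapped out for a dry run.

```python
import shutil
import subprocess
from pathlib import Path


def run_sa_learn(kind, directory):
    """Run sa-learn on one directory; kind is "spam" or "ham"."""
    subprocess.run(["sa-learn", f"--{kind}", str(directory)], check=True)


def learn_and_archive(kind, incoming, archive, learn=run_sa_learn):
    """Learn from the files in `incoming`, then move them to `archive`.

    Returns the number of files processed.
    """
    incoming, archive = Path(incoming), Path(archive)
    files = [p for p in incoming.iterdir() if p.is_file()]
    if not files:
        return 0
    learn(kind, incoming)  # step 2: train only on the new batch
    archive.mkdir(parents=True, exist_ok=True)
    for path in files:  # step 3: move the batch into the archive
        shutil.move(str(path), str(archive / path.name))
    return len(files)
```

A call might look like learn_and_archive("spam", "/srv/mail/new-spam", "/srv/mail/spam-archive"), where both paths are placeholders. Because check=True raises on a failed sa-learn run, nothing is moved to the archive unless learning succeeded.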


Scoring for "look alike" characters in subject?

2021-03-15 Thread Steve Dondley
I'm noticing a fair amount of spam getting through using letters in the 
subject line that are outside the standard set of ASCII characters in an 
effort to bypass spam filters. For example, instead of a capital "R", 
there will be a letter that closely approximates a capital "R" but when 
you look closely at it, you'll see the bottom of the rounded part of the 
"R" never connects to the line running along the left side of the 
letter.


I don't want to use a rule that is too-restrictive (like maybe banning 
all non-standard ascii characters) but I also want to increase the 
likelihood of email using these tactics getting flagged as spam.


I'm new to SpamAssassin so I'm not sure if a rule like this already 
exists or how I might go about finding this rule or what I should weight 
it. I'm wondering if others on the list have rules to address this same 
issue and can share their rule. Thanks.
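
To get a feel for how often this trick shows up in a mail archive before writing a rule, the subjects can be checked outside SpamAssassin. The sketch below is not a SpamAssassin rule. It is a rough heuristic that uses the first word of each letter's Unicode name (LATIN, CYRILLIC, GREEK, FULLWIDTH and so on) as a script label and flags subjects that mix Latin letters with anything else. The function names are made up, and legitimately multilingual subjects will also be flagged.

```python
import unicodedata
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded Subject header value to text."""
    return str(make_header(decode_header(raw_subject)))


def letter_scripts(text):
    """Return rough script labels for the letters in `text`.

    The label is the first word of the Unicode character name, so
    accented Latin letters such as "é" still count as LATIN.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def looks_mixed(raw_subject):
    """True if the subject mixes Latin letters with another script."""
    scripts = letter_scripts(decode_subject(raw_subject))
    return "LATIN" in scripts and len(scripts) > 1
```

A subject written entirely in one script, accented or not, passes. One that swaps a single Latin letter for a Cyrillic or fullwidth look-alike is flagged.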


Re: Can a .spamassassin directory in a user's home directory override the site-wide configuration?

2021-03-15 Thread Steve Dondley
OK, thanks for the additional info. It looks like I was having a 
permissions issue and the bayes_* files were not both r/w for users 
despite having bayes_file_mode set to 0666. I think this is probably 
because the bayes_path files were originally created manually as root.


spamassassin reads site-wide config, then users' 
~/.spamassassin/user_prefs


spamd can do the same, if it runs under root without the '-x' flag (which
disables this behavior).

spamc connects to spamd passing the username to it, so you can override
current user by passing the "-u username" flag to it.


Can a .spamassassin directory in a user's home directory override the site-wide configuration?

2021-03-14 Thread Steve Dondley
I'm learning to understand how to properly set up a site-wide bayes 
database on my server. Thanks for everyone's help and patience so far.


I've discovered that the SA score assigned to a user's incoming email is 
different than the SA score run through the "spamc" or "spamassassin" 
command. For example, the SA headers for email "A" will show a score of 
only 1.4 (non-spam) in the user's inbox. It shows as non-spam despite 
the fact that I have run it through sa-learn as spam. When I run the 
same email through "spamc -R < " on the command line as 
the same user that received the original message, I see a score of 6.8 
and it is properly getting classified as spam.


I'm trying to determine what accounts for the different scores and fix 
this problem so the correct score is assigned to mail coming into the 
user's inbox.



After doing some investigating, I discovered the user still had a 
.spamassassin directory in their home directory. The directory has only 
a single "user_prefs" file. But I'm wondering if the existence of this 
directory might cause spamassassin filter to ignore site-wide bayes 
database. If that's not the problem, what might account for the 
different scores and how might I fix the issue?


Re: How do I determine if user's email is being checked against the site-wide database?

2021-03-13 Thread Steve Dondley




Are there any BAYES hits on their messages, ham or spam? BAYES_{not
50} would be a positive confirmation. I'm not sure offhand if BAYES_50
hits when bayes is enabled but insufficiently trained...


In one email, I'm seeing this:

3.0 BAYES_95   BODY: Bayes spam probability is 95 to 99%

So I guess it's working. It looks like it got scored +3 points for 
having a greater than 95% probability of being spam according to the 
Bayes algorithm.
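
The same check can be done mechanically on the X-Spam-Status header. The sketch below lists the BAYES_* rules that fired and applies the suggestion above that anything other than BAYES_50 is positive confirmation. The function names are made up for illustration.

```python
import re


def bayes_hits(spam_status):
    """Return the sorted BAYES_* rule names in an X-Spam-Status value."""
    flat = " ".join(spam_status.split())  # undo header folding
    match = re.search(r"tests=(.*?)(?=\s+\w+=|$)", flat)
    tests = [t.strip() for t in match.group(1).split(",")] if match else []
    return sorted(t for t in tests if t.startswith("BAYES_"))


def bayes_confirmed(spam_status):
    """True if any Bayes rule other than BAYES_50 fired."""
    return any(hit != "BAYES_50" for hit in bayes_hits(spam_status))
```

An empty list from bayes_hits means no Bayes rule fired at all for that message, while a lone BAYES_50 leaves the question open, as noted above.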


How do I determine if user's email is being checked against the site-wide database?

2021-03-13 Thread Steve Dondley
I *think* I now have site-wide bayes filtering working for all 
users on a server. I've edited /etc/spamassassin/local.cf to include 
"bayes_path" and "bayes_file_mode" and I don't see any errors about 
permissions being wrong from debian-spamd in mail.log.


But rather than guessing, I'm wondering if there is a way I can 
objectively confirm that email for a particular user is getting checked 
against the site-wide bayes database. Thanks.


How do I efficiently share a database with all users?

2021-03-11 Thread Steve Dondley
I have a few different mail servers. I harvest mail from the servers and 
periodically sort them into ham/spam folders and then share the sorted 
mail back out to the servers and run sa-learn on each of the servers to 
coach spamassassin. After doing this for a few days, I notice that stuff 
that I know I have classified as spam is still getting into inboxes. So 
clearly I'm doing something wrong. I did a little reading and discovered 
that sa-learn only applies to the user sa-learn is run as. It seems 
wasteful to run sa-learn over the same emails for every user on the 
system.


How can I run sa-learn once on the system and then share the generated 
database with each user?


Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley

On 2021-03-09 08:28 AM, Greg Troxel wrote:

Steve Dondley  writes:


I've read through
https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which
states that "anything over about 5000 messages does not improve
accuracy significantly in our tests."


I would take that with a grain of salt.   Based on my experience running
SA for many years, I'd say that if you have new spam that isn't like
the spam you already have, learning on it will help.

Also, I take it as a comment about "there's no need to try hard to get
more than 5K messages".  It doesn't say, "if you train on more than 5000
bad things will happen".


So once I hit 5,000, what do I do? Do I run --forget on say the 500 oldest
emails, delete those from my ham/spam folders and then add in a batch
of 500 newer ham/spam emails and then run sa-learn on all the emails
in my spam/ham folders?


I've been running sa-learn daily over my ham folders and my spam folders
for years.  I refile spam and ham so that it will be learned.  I find
the bayes scoring is quite good except for novel spam.  My bayes_* files
are about 83M in total.

So I don't think you necessarily have a problem to solve.


OK, thanks for the advice. Appreciated.



Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley
I've read through 
https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which 
states that "anything over about 5000 messages does not improve accuracy 
significantly in our tests."


So once I hit 5,000, what do I do? Do I run --forget on say the 500 oldest 
emails, delete those from my ham/spam folders and then add in a batch of 
500 newer ham/spam emails and then run sa-learn on all the emails in my 
spam/ham folders?


Re: How to get mailbox to use Bayesian filter?

2005-02-05 Thread Steve Dondley
Thanks, Thomas.  Did I do the right thing by soft linking the files?
Thomas Arend wrote:
Am Freitag, 4. Februar 2005 22:57 schrieb Steve Dondley:
 

Hi,
I've been feeding spamassassin spam and ham.  But the Bayesian filter is
not having any effect on my e-mail account.  Now, I've been feeding the
bayesian filter logged in as one user "A" on my server and then soft
linking the 'bayes_seen', 'bayes_toks' and 'user_prefs' files for user
"B" over to user "A's".  But I'm still having no success.  What else do
I need to be doing?
My procmail looks like this:
:0fw
spamassassin -P
   

   ^ Here is the failure! Missing "|". And I found no -P in the manuals.
Try
:0fw:spamassassin.lock
| /usr/bin/spamassassin
Or use spamc / spamd (it's faster)
:0fw:spamassassin.lock
| /usr/bin/spamc
BTW: You should avoid checking long messages.
That makes:
:0fw:spamassassin.lock
* < 10
| /usr/bin/spamc
 

#:0
#* ^X-Spam-Status: Yes
#[/dev/null]
   

- -- 
icq:133073900
http://www.t-arend.de
 




How to get mailbox to use Bayesian filter?

2005-02-04 Thread Steve Dondley
Hi,
I've been feeding spamassassin spam and ham.  But the Bayesian filter is 
not having any effect on my e-mail account.  Now, I've been feeding the 
bayesian filter logged in as one user "A" on my server and then soft 
linking the 'bayes_seen', 'bayes_toks' and 'user_prefs' files for user 
"B" over to user "A's".  But I'm still having no success.  What else do 
I need to be doing?

My procmail looks like this:
:0fw
spamassassin -P
#:0
#* ^X-Spam-Status: Yes
#[/dev/null]


sa-learn reports it's learning from only 1 message

2004-12-13 Thread Steve Dondley
I'm using sa-learn for the first time.  I uploaded mail from my t-bird 
mail client on my Windows machine to my Linux box in ascii mode.  There 
were a little over 200 messages in each of the two mailboxes I uploaded.

When I ran sa-learn --ham and sa-learn --spam on the boxes, I got a 
report that spam assassin only learned from 1 message.  What have I done 
wrong?
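
One possible cause, offered as an assumption rather than a diagnosis: Thunderbird normally stores each folder as a single mbox file, and a tool that treats that file as one message would report exactly one message learned. sa-learn has an --mbox option for input in that format. A quick way to check how many messages an uploaded file really contains is Python's standard mailbox module; the function name below is made up for illustration.

```python
import mailbox


def count_mbox_messages(path):
    """Count the messages in an mbox-format file.

    Raises mailbox.NoSuchMailboxError if the file does not exist.
    """
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

If this reports roughly 200 for each uploaded file, the files are mbox folders and should be handed to sa-learn as such.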


Should tagged spam be fed back to server?

2004-12-12 Thread Steve Dondley
When training spamassassin, is it a good idea to feed spam already 
marked as spam back to SpamAssassin?  Will this help it or hinder it or 
do neither?


Resending mail Outlook still strips out headers

2004-12-11 Thread Steve Dondley
I'm trying to train SpamAssassin.  I've set up two mailboxes on my server.
One for spam and one for non-spam.  I'm trying to figure out how to deliver
mail there from my client (Outlook 2000).

There is some advice given on the SpamAssassin web site to not forward mail
but to resend it.  I do that.  But when I look at the raw mail on my server,
none of the original headers are there.  The e-mails all look like they
originally came from me.

What am I doing wrong (besides still using a piece of shit like Outlook)?