RE: Question on Rule

2020-01-27 Thread Charles Amstutz
> 
> > Hello,
> >
> > Can someone explain what this actually means and maybe provide an
> > example?
> >
> > Rule Name: FROM_MISSP_DYNIP
> > Rule Definition: misspaced + dynamic rDNS
> >
> > Getting a high score on this and having trouble finding an actual real
> > definition and example. I get the dynamic rDNS I believe, but not sure
> > about the misspaced meaning for sure.
> 
> It means that there is no space between the display name and the '<', e.g.
> 
>From: John Smith
> 
> Are you seeing anything very different?

Thanks; however, I do see a space between the name and the '<'.

This is what it looks like:

From: =?UTF-8?Q?Name?= 
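
As a rough illustration of the "misspaced" idea described in this thread (no
space between the display name and the '<'), here is a minimal Python sketch.
The function name is_misspaced and the example addresses are made up for
illustration, and this is not the actual SpamAssassin rule definition, which
may test something more specific.

```python
def is_misspaced(from_value: str) -> bool:
    """Return True if a display name runs straight into '<' with no space."""
    idx = from_value.find("<")
    if idx <= 0:
        # No angle-bracket address, or nothing before it: nothing to misspace.
        return False
    # The character right before '<' should be whitespace in a normal header.
    return not from_value[idx - 1].isspace()


print(is_misspaced("John Smith<jsmith@example.com>"))   # True
print(is_misspaced("John Smith <jsmith@example.com>"))  # False
```

Running the raw From: value of a message that hit the rule through a check
like this is a quick way to see whether the spacing is really what it appears
to be in a mail client, which may decode and tidy the header before display.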


RE: Question on Rule

2020-01-27 Thread Charles Amstutz
> On 27.01.20 at 17:22, Charles Amstutz wrote:
> > Can someone explain what this actually means and maybe provide an
> > example?
> >
> > Rule Name: FROM_MISSP_DYNIP
> >
> > Rule Definition: misspaced + dynamic rDNS
> >
> > Getting a high score on this and having trouble finding an actual real
> > definition and example. I get the dynamic rDNS I believe, but not sure
> > about the misspaced meaning for sure
> 
> A misspaced From header, which leaves open the question of why you don't
> provide any useful information like, well, the headers or, better, the raw
> mail on pastebin.

From your explanation, I think I found what might be causing the rule to 
trigger. 

I think it is the weird characters in the Subject, From and To headers?

This is redacted a bit, of course. 

Return-Path: 
Delivered-To: recipi...@email.com
Received: (qmail 4989 invoked by alias); 25 Jan 2020 15:13:45 -0600
Delivered-To: recipi...@email.com
Received: (qmail 4975 invoked from network); 25 Jan 2020 15:13:45 -0600
Received: from SMTP Server (HELO SMTP Server) (internal IP)
  by mailserver with ESMTP; 25 Jan 2020 15:13:45 -0600
Received: (qmail 81888 invoked from network); 25 Jan 2020 15:13:35 -0600
Received: from dynamic RDNS (HELO HP511DF8) (Dynamic IP)
  by smtp external DNS name with ESMTP; 25 Jan 2020 15:13:35 -0600
Received-SPF: softfail (SMTP Server: transitioning SPF record at domain does 
not designate dynamic IP as permitted sender)
From: =?UTF-8?Q?Sender_name?= 
To: =?UTF-8?Q?Recipient_name?= 

Subject: =?UTF-8?Q?Subject?=
Date: Sat, 25 Jan 2020 19:35:07 +
Message-ID: <1815052843-1579980907@>
Content-Type: multipart/mixed;
boundary="=_Part_Boundary_004b_6b102fb7.6b102fb7"
MIME-Version: 1.0


Question on Rule

2020-01-27 Thread Charles Amstutz
Hello,

Can someone explain what this actually means and maybe provide an example?

Rule Name: FROM_MISSP_DYNIP
Rule Definition: misspaced + dynamic rDNS

Getting a high score on this and having trouble finding an actual real 
definition and example. I get the dynamic rDNS I believe, but not sure about 
the misspaced meaning for sure.

Thanks


RE: MISSING_SUBJECT rule on email with subject

2019-06-24 Thread Charles Amstutz
> But as has already been pointed out it has the combination of 
> MISSING_FROM and HK_RANDOM_FROM, and the latter is based on a 
> From:addr test.

I saw this too; however, I thought I noticed a potentially bad regex (from 
another custom rule) breaking mine. I think this is the case because, when I 
removed that rule, MISSING_SUBJECT stopped hitting. 
However, I'm still testing.


low scoring spam

2017-07-14 Thread Charles Amstutz
Hello,

I keep having spam come through that hits on almost zero rules (or very few). 
I get that this is definitely possible, but it's annoying as it's obviously 
spam. I guess my question is: if what we have in place isn't hitting on much, 
then aside from training Bayes on it, what do we do? Even that isn't enough, it 
seems, as Bayes scores it as BAYES_50 and not BAYES_99.  Even BAYES_99 is 
typically not enough to catch it as spam if it doesn't trip anything else (as 
it is only 3.5 for BAYES_99 and many users are set to a threshold of 4 or 5).
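
The arithmetic behind that complaint can be sketched in a few lines of Python.
This is only an illustration: the 3.5 for BAYES_99 and the thresholds of 4 and
5 come from the message above, the 1.2 for RCVD_IN_LASHBACK is borrowed from
another thread on this list as an example of a second hit, and the function
name is_spam is made up. It assumes a message is tagged once its total reaches
the required score.

```python
def is_spam(hits: dict[str, float], required_score: float) -> bool:
    """A message is tagged when the sum of its rule scores reaches the threshold."""
    return sum(hits.values()) >= required_score


bayes_only = {"BAYES_99": 3.5}
print(is_spam(bayes_only, 4.0))  # False: 3.5 is below 4
print(is_spam(bayes_only, 5.0))  # False: 3.5 is below 5

# One additional modest hit is enough for a threshold of 4, but not 5.
with_rbl = {"BAYES_99": 3.5, "RCVD_IN_LASHBACK": 1.2}
print(is_spam(with_rbl, 4.0))  # True
print(is_spam(with_rbl, 5.0))  # False
```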


RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
I'm starting mine out at 0.5 until I see what happens.


    Infinite Systems
    Charles Amstutz | Systems Administrator
    charl...@infinitesys.com 402.477.2474
    134 S 13th Street, Suite 302 | Lincoln, NE 68508
 


-Original Message-
From: David Jones [mailto:djo...@ena.com] 
Sent: Thursday, July 13, 2017 11:13 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On 07/13/2017 10:56 AM, RW wrote:
> On Thu, 13 Jul 2017 09:33:04 -0400
> Alex wrote:
> 
>> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz 
>> <charl...@infinitesys.com> wrote:
>>> How do you use lashback? It says that it is free to use for 
>>> commercial and non commercial use. How do I set it up?
>>
>> Drop this into your local.cf or similar:
>>
>> header   RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
>> 'ubl.unsubscore.com')
> 
> I have it as lastexternal:
> 
> header RCVD_IN_UNSUBBL  eval:check_rbl('ubl-lastexternal', 
> 'ubl.unsubscore.com')
> 
> I've found there to be quite a lot of ISP pool addresses in it, so 
> deep checks are probably unsafe.
> 

I started mine with lastexternal and didn't find much added value over other 
major RBLs, since my MTA was already blocking mostly with IVM and Spamhaus RBLs 
that overlapped Lashback.  I also wanted to check outbound mail where the second 
or later hop was from an infected device most likely under botnet control.  It 
would have helped in the OP spam.


> I've also found it has quite a high FP rate of ~2%.
> 

I am working with them to fix these FPs (they include major mail providers like 
Comcast, Microsoft and Google, which are pointless to list) and potentially have 
it included in the default SA rules.  It's still a valuable RBL to help with an 
overall score even with a ~2% FP rate.  I wouldn't score it as high as you can 
with Spamhaus and IVM.  I also have it at 1.2.

-- 
David Jones



RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
Hello,

For the inexperienced, what is the difference between lashback and 
lastexternal?




-Original Message-
From: RW [mailto:rwmailli...@googlemail.com] 
Sent: Thursday, July 13, 2017 10:57 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On Thu, 13 Jul 2017 09:33:04 -0400
Alex wrote:

> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz 
> <charl...@infinitesys.com> wrote:
> > How do you use lashback? It says that it is free to use for 
> > commercial and non commercial use. How do I set it up?
> 
> Drop this into your local.cf or similar:
> 
> header   RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
> 'ubl.unsubscore.com')

I have it as lastexternal:

header RCVD_IN_UNSUBBL  eval:check_rbl('ubl-lastexternal', 'ubl.unsubscore.com')

I've found there to be quite a lot of ISP pool addresses in it, so deep checks 
are probably unsafe. 

I've also found it has quite a high FP rate of ~2%.
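
On the lashback versus lastexternal question: both rules query the same list
(ubl.unsubscore.com); what differs is which relay IPs get looked up. As far as
the SpamAssassin check_rbl documentation describes it, a set name ending in
-lastexternal restricts the lookup to the external host that handed the message
to your own network, while a plain set name checks the other untrusted relays
too (a "deep" check). The Python sketch below only illustrates that selection;
the Relay class, the function name and the example IPs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Relay:
    ip: str
    internal: bool  # True for hosts inside your own network


def ips_to_check(relays: list[Relay], last_external_only: bool) -> list[str]:
    """Pick the relay IPs an RBL rule would look up.

    `relays` is ordered like the Received headers: newest hop first.
    """
    external = [relay.ip for relay in relays if not relay.internal]
    # lastexternal: only the external host that connected to our network.
    return external[:1] if last_external_only else external


relays = [
    Relay("192.0.2.10", internal=True),      # our own mail server
    Relay("198.51.100.7", internal=False),   # host that delivered to us
    Relay("203.0.113.99", internal=False),   # earlier hop, e.g. an ISP pool IP
]
print(ips_to_check(relays, last_external_only=True))   # ['198.51.100.7']
print(ips_to_check(relays, last_external_only=False))  # ['198.51.100.7', '203.0.113.99']
```

The earlier hop in the second result is where the ISP pool addresses mentioned
above would cause false hits on legitimate mail.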










RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
Thanks,

I was looking at the default RBL lists

https://wiki.apache.org/spamassassin/DnsBlocklists

But I was looking for other things that are free for commercial use. I found 
this one, which looks like a possibility:

http://0spam.fusionzero.com/

but I don't know if anyone has had experience with it, or could make other 
recommendations. 


>Drop this into your local.cf or similar:

>header   RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
>describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
>tflags   RCVD_IN_LASHBACK net
>score    RCVD_IN_LASHBACK 1.2

> I've scored it at 1.2. You may wish to change that, perhaps lower for a 
> while, while you see how it works in your organization.



RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
Thanks




-Original Message-
From: Alex [mailto:mysqlstud...@gmail.com] 
Sent: Thursday, July 13, 2017 8:33 AM
To: Charles Amstutz <charl...@infinitesys.com>; SA Mailing list 
<users@spamassassin.apache.org>
Subject: Re: "bout u" campaign

On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz <charl...@infinitesys.com> 
wrote:
> How do you use lashback? It says that it is free to use for commercial and 
> non commercial use. How do I set it up?

Drop this into your local.cf or similar:

header   RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
tflags   RCVD_IN_LASHBACK net
score    RCVD_IN_LASHBACK 1.2

I've scored it at 1.2. You may wish to change that, perhaps lower for a while, 
while you see how it works in your organization.
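
For anyone wondering what a rule like this does on the wire: DNS blocklists are
conventionally queried by reversing the octets of the relay's IPv4 address and
prepending them to the list's zone; a successful answer means the address is
listed. The Python sketch below shows only that naming step plus a simple
lookup. The function names are made up, the IP is the one from the report
quoted elsewhere in this thread, and real use should follow the list operator's
usage terms.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to ask a blocklist about an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the blocklist answers for this address (needs network)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name: the address is not listed
    return True


print(dnsbl_query_name("204.29.186.60", "ubl.unsubscore.com"))
# 60.186.29.204.ubl.unsubscore.com
```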



RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
As a follow-up, it says how to do the DNS, just not how to list it in the .cf 
files. Maybe I can copy another blacklist's syntax?




-Original Message-
From: David Jones [mailto:djo...@ena.com] 
Sent: Thursday, July 13, 2017 8:17 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On 07/12/2017 09:50 PM, Alex wrote:
> Hi,
> 
>> pretty high mainly due to DCC and BAYES_99.
> 
> Are you paying for DCC? I think we're over their limit and they 
> blacklisted us long ago, lol.

I have my own DCC server joined into the DCC network.

https://www.dcc-servers.net/dcc/

> 
>> I guess I have well trained Bayes.
> 
> I think you just don't have many one-liner emails as a regular course 
> of business?

I am classifying about 10K ham and 8K spam each day which I also use in the 
masscheck processing (currently on hold).  Since I have started doing this 
about a month or so ago, my BAYES scores seem to be more accurate.  Maybe I 
wasn't training enough ham/spam before?  I don't know for sure yet.

> 
>>   1.2 RCVD_IN_LASHBACK   RBL: Received is listed in Lashback
>>  usb.unsubscore.com
>>  [204.29.186.60 listed in 
>> ubl.unsubscore.com]
> 
> I forgot about this. I have it in postscreen (+1) but now also added it in SA.
> 
>>   2.2 RCVD_IN_SORBS_SPAM RBL: SORBS: sender is a spam source
> 
> We do have some in SORBS, but only score it 0.5.  Do you really 
> recommend scoring it so high?
> 
Obviously I do because it's working well on my platform.  I have other WL 
rules that subtract points to offset this one.  If there are no other WL 
(e.g. list.dnswl.org) hits then this will stand out more.

Do some analysis of your emails that hit this rule and what the scores were.  
My threshold for blocking is 6.0 (default for MailScanner).  If your threshold 
is 5.0 and your ham that hits this rule is scoring below 3.3 (5.0 - 1.7, where 
1.7 is the increase from 0.5 to 2.2), then you would be fine setting this to 
score 2.2.
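
The headroom calculation above can be written out as a small Python sketch. The
threshold of 5.0 and the rule scores 0.5 and 2.2 are the figures from this
exchange; the function names and the sample ham scores are made up for
illustration.

```python
def ham_score_limit(threshold: float, old_score: float, new_score: float) -> float:
    """Highest current ham score that stays under the threshold after the change."""
    return round(threshold - (new_score - old_score), 2)


def safe_to_raise(ham_scores: list[float], threshold: float,
                  old_score: float, new_score: float) -> bool:
    """True if every ham message hitting the rule would still score below threshold."""
    limit = ham_score_limit(threshold, old_score, new_score)
    return all(score < limit for score in ham_scores)


print(ham_score_limit(5.0, 0.5, 2.2))                  # 3.3
print(safe_to_raise([1.1, 2.9, 3.0], 5.0, 0.5, 2.2))   # True
print(safe_to_raise([1.1, 3.6], 5.0, 0.5, 2.2))        # False
```

The ham scores fed in here should be the totals of real ham that hit the rule
at its current score, taken from your own logs.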

>>   0.0 OS_UNKNOWN Relay runs on unknown OS
> 
> That's an interesting one. Fingerprinting?
> 
Yeh.  I thought it might be a useful data point for making meta rules but it 
turns out to not be.  I will probably leave this out when I rebuild my filters 
in the next couple of months on CentOS 7.

>>   1.2 FREEMAIL_FROM  Sender email is commonly abused enduser mail
> 
> This is also scored *much* lower here - we have many freemail senders.
> The default score is 0.001, so you must have changed it.
> 
Yep.  Again my block threshold is 6.0 in MailScanner and I have less default 
trust for FREEMAIL senders.  I also have meta rules based on FREEMAIL and other 
hits that add to the score based on combinations I have seen over the years.

FREEMAIL senders are very difficult to accurately filter but I feel like my 
rules are pretty good.  I have to postwhite exclude most freemail providers 
since they are listed on some RBLs which makes no sense to me. 
  You can't block the big ones like Yahoo, Hotmail, Comcast, etc. just because 
they are so large and there are many legit senders in the middle of the 
spammers.

>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
> 
> For 90_100, I think we're only subtracting -0.2.
> 
For my mail flow, I have noticed that senders in the 90's are normally very 
trustworthy.

If you separate your rules into 2 main categories, then you can set up scores 
for each category to balance out the other category.

1. IP and domain reputation
2. Message content

Good IP reputation can offset questionable message content and vice versa.  I 
tend to go heavy on the reputation side at the MTA and in SA, which has served 
me well over the past several years.  Before that, I was constantly adjusting 
content rule scores and writing custom rules to react to the latest spam 
campaign, and I was always behind.

I have a huge list of whitelist_auth based on domain reputation which allows me 
to crank up some content scores and not let Bayes block good reputation senders 
based on content.
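
One way to picture the two-category balance is to total the hits per category
before adding them up. In the Python sketch below, the rule names and most
scores are taken from the report quoted in this thread (BAYES_99 at 3.5 comes
from a different thread on this list); assigning each rule to "reputation" or
"content" is an assumption made for illustration, as is the function name.

```python
# Hypothetical grouping of rules into the two categories described above.
CATEGORY = {
    "RCVD_IN_SENDERSCORE_90_100": "reputation",
    "RCVD_IN_SORBS_SPAM": "reputation",
    "RCVD_IN_LASHBACK": "reputation",
    "BAYES_99": "content",
}


def totals_by_category(hits: dict[str, float]) -> dict[str, float]:
    """Sum rule scores per category; rules missing from CATEGORY raise KeyError."""
    totals = {"reputation": 0.0, "content": 0.0}
    for rule, score in hits.items():
        totals[CATEGORY[rule]] += score
    return {name: round(value, 2) for name, value in totals.items()}


# A sender with good reputation whose content looks spammy to Bayes.
hits = {"RCVD_IN_SENDERSCORE_90_100": -2.2, "BAYES_99": 3.5}
totals = totals_by_category(hits)
print(totals)                          # {'reputation': -2.2, 'content': 3.5}
print(round(sum(totals.values()), 2))  # 1.3
```

Here the reputation credit pulls a strong content hit well below a threshold
of 5.0 or 6.0, which is the offsetting effect described above.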


>>   2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>> seen by the Internet (DCC, Pyzor, or Razor).
> 
> The problem I always had with pyzor/dcc was that it works on very 
> small blocks of text, no? Perhaps it works well for small messages, 
> but isn't it problematic for larger messages?
> 
I have no idea.  I just analyzed my mail scoring and noticed combinations like 
DCC and FREEMAIL are common in my spam.

>>   1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
>>  

RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
How do you use lashback? It says that it is free to use for commercial and non 
commercial use. How do I set it up?



RE: "bout u" campaign

2017-07-13 Thread Charles Amstutz
I find it challenging to constantly keep up with campaigns.  My guess is that 
the phone number is there to make it seem more legitimate. 
More recently, I try to look for general characteristics and target those, in 
order to future-proof rules. However, there are always legitimate emails being 
sent that would trigger a potential rule (depending on what you are matching on).


>> What is even the point of spam with a phone number?



RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz
Mostly I autolearn ham and train some spam, though I have found that one 
account needed ham training. 

Most of the user accounts in question are at least at 200/200, and most are 
well over a few thousand each (I believe). 

>> I need to read up bayes a bit, I was surprised to learn that after 
>> using sa-learn --spam, then bayes only tagged it at Bayes_50 instead 
>> of Bayes_99, Unless I did something incorrect.

>There is a minimum level of both spam *and ham* that Bayes must be trained 
>with before it will start providing scoreable analysis.

>How much have you trained it with?




RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz

>> I find many don't contribute (despite it being open source) for fear of 
>> spammers using these ideas against us, but the project suffers as a result.

I think others don't due to IP rights. I'm glad people do though.


RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz
I need to read up on Bayes a bit. I was surprised to learn that after using 
sa-learn --spam, Bayes only tagged it as BAYES_50 instead of BAYES_99, 
unless I did something incorrectly.

Note: I do not use Bayes files in user profiles; I store Bayes in a MySQL 
database.


RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz

Has anyone ever gotten something like machine learning (I get that this is kind 
of what Bayes is) or R working with SpamAssassin? I've seen books on this, 
though they may have been referring to Bayes; I'm not sure.


RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz
I set up spamdyke to block .top and many other TLDs where mostly spam came from. 
Unfortunately, I had to remove them, and now have to rely on content analysis 
with the use of *BLs. 

When I set up pattern matching in an effort to future-proof blocking, it 
catches legitimate email that uses characters to form tables (this happens 
occasionally). 

The only thing I could think of was to set individual scores lower, but meta 
scores high.  I appreciate the options for Postfix, but I do not run that on 
incoming mail servers.
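
The "low individual scores, high meta scores" idea can be sketched in Python as
follows. Everything named here is hypothetical: the rule names LOCAL_TLD_TOP
and LOCAL_ASCII_TABLE, the meta rule LOCAL_TOP_AND_TABLE and all the scores are
placeholders, not rules that ship with SpamAssassin.

```python
# Weak signals on their own: each scores low so legitimate mail is not hurt.
RULE_SCORES = {"LOCAL_TLD_TOP": 0.5, "LOCAL_ASCII_TABLE": 0.5}

# A meta rule fires only when all of its sub-rules hit, and scores high.
META_RULES = {"LOCAL_TOP_AND_TABLE": (("LOCAL_TLD_TOP", "LOCAL_ASCII_TABLE"), 3.0)}


def total_score(hits: set[str]) -> float:
    """Add the individual scores, plus any meta rule whose sub-rules all hit."""
    score = sum(RULE_SCORES.get(name, 0.0) for name in hits)
    for sub_rules, meta_score in META_RULES.values():
        if all(name in hits for name in sub_rules):
            score += meta_score
    return score


print(total_score({"LOCAL_ASCII_TABLE"}))                   # 0.5
print(total_score({"LOCAL_TLD_TOP", "LOCAL_ASCII_TABLE"}))  # 4.0
```

A legitimate message that only draws a table with characters picks up 0.5,
while the combination that is typical of the spam run scores 4.0.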




-Original Message-
From: David Jones [mailto:djo...@ena.com] 
Sent: Friday, July 7, 2017 11:15 AM
To: Charles Amstutz <charl...@infinitesys.com>; 'users@spamassassin.apache.org' 
<users@spamassassin.apache.org>
Subject: Re: Random word spams and wiki spams

On 07/07/2017 11:04 AM, Charles Amstutz wrote:
> Thank you everyone for the suggestions, I will look into it. One thing 
> I've noticed is that sometimes it takes a day for any *BL's to pick up 
> some of the spam, and by that time, the run could be done. Greylisting 
> isn't an option. It sometimes feels like always reactive vs pro-active 
> in filtering.  For example, I try to block the old runs of "Ford 
> Warranties", write a few rules, then never receive them again :)
> 
> This is a slight over exaggeration, but close.
> 

No. I completely understand.  A couple of years ago I was doing the same thing 
always reacting to new spam campaigns.  It took a lot of my time and I never 
felt like I was winning those one-day battles.

Now I have tuned my MTA (Postfix with postscreen) to reject the majority of 
junk before it ever reaches SA.  See the archives for these Postscreen weighted 
RBLs if you are running Postfix.  With about 24 RBLs including invaluement, I 
am able to be aggressive with many RBLs adding up to a block threshold of 8 in 
postscreen.
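
The weighted-RBL approach described here amounts to adding up a weight for each
list the connecting IP appears on and rejecting once the total reaches the
threshold. A minimal Python sketch follows; the threshold of 8 is the figure
given above, while the zone names and weights are placeholders and not the
actual configuration. It assumes the threshold is an inclusive lower bound.

```python
# Hypothetical weights: blocklists add points, a whitelist subtracts them.
WEIGHTS = {
    "bl-a.example": 4,
    "bl-b.example": 3,
    "bl-c.example": 2,
    "wl-a.example": -3,
}


def combined_score(listed_on: set[str]) -> int:
    """Sum the weights of every list the client IP is found on."""
    return sum(WEIGHTS.get(zone, 0) for zone in listed_on)


def should_reject(listed_on: set[str], threshold: int = 8) -> bool:
    """Reject at connect time once the combined score reaches the threshold."""
    return combined_score(listed_on) >= threshold


print(should_reject({"bl-a.example", "bl-b.example"}))                  # False (7)
print(should_reject({"bl-a.example", "bl-b.example", "bl-c.example"}))  # True (9)
```

No single list can reject mail on its own with weights like these, which is
what makes it reasonable to include lists with a noticeable false positive
rate.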

On the other side of this, you have to set up postwhite to whitelist major mail 
providers like comcast.net, aol, google, yahoo.com, etc. and let SA score them.

Now I rarely get any reports of spam getting through unless it's from a 
compromised account.  These will always be difficult to block for zero-hour 
spam campaigns from botnets.

Also, set up the KAM.cf rules and extra signatures for ClamAV from Sanesecurity. 
 These often help with new spam campaigns.  I can post which signature DBs I am 
using if that would be helpful.

--
Dave



RE: Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz
Thank you everyone for the suggestions, I will look into it. One thing I've 
noticed is that it sometimes takes a day for any *BLs to pick up some of the 
spam, and by that time the run could be done. Greylisting isn't an option. 
Filtering sometimes feels like it is always reactive rather than proactive.  
For example, I try to block the old runs of "Ford Warranties", write a few 
rules, then never receive them again :)

This is a slight exaggeration, but close.  


Random word spams and wiki spams

2017-07-07 Thread Charles Amstutz
Hello,

I am new to the group, but have experience with writing some rules and some 
meta rules.

Has anyone come up with a good way to detect spam that has random words in 
paragraph form (usually at the bottom of the message body), or that looks like 
it copies parts from various wikis or other news sources?

Thanks

Charles