Re: missing something in new SA config

2023-12-28 Thread AJ Weber




> What is in /etc/mail/spamassassin/.razor/razor-agent.conf?


debuglevel = 3
identity   = identity
ignorelist = 0
listfile_catalogue = servers.catalogue.lst
listfile_discovery = servers.discovery.lst
listfile_nomination    = servers.nomination.lst
logfile    = /var/log/spamassassin/razor-agent.log
logic_method   = 4
min_cf = ac
razordiscovery = discovery.razor.cloudmark.com
razorhome  = /etc/mail/spamassassin/.razor
rediscovery_wait   = 172800
report_headers = 1
turn_off_discovery = 0
use_engines    = 4,8
whitelist  = razor-whitelist


Re: missing something in new SA config

2023-12-27 Thread AJ Weber

Thanks for the reply.

SA v3.4.6

razor is installed:

optional module installed: Razor2::Client::Agent, version 2.84

razor plugin is enabled in v310.pre:

loadplugin Mail::SpamAssassin::Plugin::Razor2

I don't see any "logs" in the first page of the lint output.

Would you be so kind as to describe how my "razor_config" is incorrect?  
That might be helpful.


Thanks again.



missing something in new SA config

2023-12-27 Thread AJ Weber

Migrating a mailserver with SA and I see this in my log when testing:

spamd[30912]: razor2: razor2 check failed: No such file or directory 
razor2: Can't read: /var/lib/razor/ at 
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/Razor2.pm line 331.


My local.cf has the following:

use_razor2 1
razor_config /etc/mail/spamassassin/.razor/razor-agent.conf

In the config:

razorhome  = /etc/mail/spamassassin/.razor

So I can't for the life of me understand what is looking in 
/var/lib/razor and for what?


Must be something stupid I missed?

Thanks for any pointers.

-AJ




sane max value for message size in 2023?

2023-09-11 Thread AJ Weber
I realize this is very much an "it depends" question, but recently a
lot of messages have been bypassing spamc because they're a few KB over
the default 500KB limit (spamassassin 3.4.x).


Can I bump this to maybe 750KB, and if so, will spamc read that from one 
of my .pre files, or do I have to somehow add that to a scan command-line?


Thanks in advance,

AJ



Spamhaus DQS usage portal update frequency?

2022-11-09 Thread AJ Weber

Does anyone know how often the DQS usage tab is updated by Spamhaus?

I believe my SA was misconfigured, and didn't have anything showing for 
usage.  I think this is fixed now and sent test emails from their 
"Blocklist Tester Verification" tool.  All emails were correctly 
categorized as SPAM, and I see the relevant headers referring to the 
Spamhaus rules in them.


However, I still do not see any usage reflected in the portal at all.  
So I'm just trying to determine whether my config is correct now.


Thanks in advance,

AJ



Re: [Spamhaus notice] New plug-in is now available for use with Spamhaus Domain Blocklist with hostnames which goes into production on February 1st.

2022-01-11 Thread AJ Weber

Sorry for not having followed as closely as maybe I should have, but...

Is there a list of "legacy" Spamhaus cf/pm/plugin entries we would 
remove if we were to install the new DBL plug-in?  I don't see anything 
on the github page, but maybe it's documented elsewhere?


Thanks


On 1/11/2022 8:24 AM, Riccardo Alfieri wrote:
As promised, here is the new release of the Spamhaus plug-in. This 
will help you easily integrate Spamhaus’ Domain Blocklist (DBL) with 
hostnames into your email infrastructure when the revised blocklist 
goes into production from February 1st. You can update your 
configuration with this newly released plug-in before the blocklist 
goes into production.


The plugin is available here: 
https://github.com/spamhaus/spamassassin-dqs/tree/dbl-beta


Reminder: If you have changed your configuration to test the beta DBL, 
you will need to update your config to use the production DBL, which 
goes live on February 1st. If you are currently using the beta version 
of this plug-in, please do not switch to using this production version 
until February 1st. We will continue to make the beta zone available 
for two further weeks giving you additional time to make any required 
changes.




Re: Happy Thanksgiving and Announcing the Apache SpamAssassin Channel for the KAM Rule Set

2020-12-14 Thread AJ Weber




> if you are using RH based Linux distros, just put the attached configuration
> file under /etc/mail/spamassassin/channels.d/


Apologies for the naive question;  I'm running CentOS 7, SA 3.4.3.  I 
don't have that channels.d directory by default.  I've been running a 
more traditional cron update:


9 3 * * * /usr/local/bin/sa-update --gpgkey 6C6191E3 --channel updates.spamassassin.org && /etc/init.d/spamassassin restart


Can I simplify by putting a conf file for the default updates and the 
KAM updates config into that location, then just run "sa-update && 
spamassassin restart" in cron?


Thanks for any tips.



Re: Apache SpamAssassin and Spammers 1st Amendment Rights

2020-11-20 Thread AJ Weber

On 11/20/2020 9:28 AM, @lbutlr wrote:

A whole lot of people have decided their right to free speech means an 
obligation from others to listen to them. It's not just spammers, it's also 
racists, fascists, republicans, and god-botherers.
I think you should keep politics out of this.  If I want to hear 
opinions from the liberal-left, I'll be sure to circle back with you.  
That's not what this is about.


Re: score sender domains with 4+ chars in TLD?

2020-06-12 Thread AJ Weber

Cool.  Thanks.


On 6/12/2020 11:04 AM, Kris Deugau wrote:

AJ Weber wrote:
I want to try adding a score for a sender whose address uses a TLD 
with  > 3 chars.


I realize there are some legit ones, but I'm going to test it with a 
low score and see what it catches.


Is it just something like:
header   From =~   /\.\w{4,}$/


You'll probably want to use the :addr specifier to match only on the 
actual address:


header LONG_TLD    From:addr =~ /\.\w{4,}$/

Otherwise your rule won't match much mail at all unless the From: 
header consists of a completely bare email address.


-kgd
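
To see why the anchored pattern needs the bare address, the check can be reproduced outside SpamAssassin. This is only a Python sketch of the regex behaviour, not how SpamAssassin evaluates rules; the sample header and the has_long_tld helper are made up for illustration.

```python
import re
from email.utils import parseaddr

# Same pattern as in the rule: a dot followed by 4+ word characters at the end.
LONG_TLD = re.compile(r"\.\w{4,}$")

def has_long_tld(from_header: str) -> bool:
    """Test the pattern against the address part only, similar to From:addr."""
    _display_name, address = parseaddr(from_header)
    return bool(LONG_TLD.search(address))

# Hypothetical header value, for illustration only.
header = '"Some Sender" <someone@example.online>'

# Against the whole header, the trailing ">" defeats the "$" anchor.
print(bool(LONG_TLD.search(header)))
# Against the extracted address, the pattern can match.
print(has_long_tld(header))
```

With a display name and angle brackets present, only the second check can hit, which is the point made above about bare addresses.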


score sender domains with 4+ chars in TLD?

2020-06-12 Thread AJ Weber
I want to try adding a score for a sender whose address uses a TLD with 
> 3 chars.


I realize there are some legit ones, but I'm going to test it with a low 
score and see what it catches.


Is it just something like:
header   From =~   /\.\w{4,}$/


Thanks in advance.

- AJ



Re: another extortion email check

2020-05-02 Thread AJ Weber
Yes, noticed that as well and considered making it simple with that 
rule.  Probably best thing to do anyway.


Thank you both.

-AJ

On 5/1/2020 5:08 PM, John Hardin wrote:

On Fri, 1 May 2020, Loren Wilton wrote:


Please help, apparently this person "knows everything about me" :)


I got a rash of these a year or two ago, and have started getting a
few more recently. I haven't looked at the body of the recent ones,
so don't know if they are still using the standard text. However, the
identifying feature is that the subject is a single word, which is the
stolen password.


You should be able to catch these with a single custom rule along the 
lines of


header    STOLEN_PASSWORD    Subject    =~ /old_password/
score    STOLEN_PASSWORD    10


That's what I do.



another extortion email check

2020-05-01 Thread AJ Weber
I am seeing a number of extortion emails where the hacker has gotten my 
email address and an old password from "the dark web". (Probably one of 
many lists that are out there from one of the many mega-hacks that have 
occurred.)


Is there a way to check for a specific 1-2 words in the body being 
repeated > n times?  The emails seem to be camouflaging their body with 
random HTML and encoding chars.  But they like to repeat my old username 
and an old password a large number of times pretty clearly (I guess to 
get our attention).


If I can check for these terms (individually would be fine), I think I 
could setup some meta rules that would score the number of hits in 
ranges.  Once or twice would probably be no score.  3-5 times would be a 
reasonable score. >5 hits would be an almost automatic spam score.


Please help, apparently this person "knows everything about me" :)

-AJ
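
The counting idea in this message can be sketched in a few lines of Python. This is not a SpamAssassin rule: the term, the body text and the score values are placeholders chosen for illustration. In SpamAssassin itself, "tflags <rule> multiple" is, as far as I know, the documented way to let a body rule count repeated hits for use in meta rules.

```python
import re

def count_hits(body: str, term: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of a literal term."""
    return len(re.findall(re.escape(term), body, flags=re.IGNORECASE))

def score_for(hits: int) -> float:
    """Map a hit count onto the ranges described above (scores are placeholders)."""
    if hits <= 2:
        return 0.0   # once or twice: no score
    if hits <= 5:
        return 2.0   # 3-5 times: a moderate score
    return 5.0       # more than 5: close to an automatic spam score

# Hypothetical body text and term, for illustration only.
body = "hunter2 ... we know hunter2 ... yes, HUNTER2 is your password"
hits = count_hits(body, "hunter2")
print(hits, score_for(hits))
```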




help with simple test?

2020-01-15 Thread AJ Weber

I'm hoping this is a relatively simple test...

I'm seeing emails "From Me, To Me", typically extortion types. I'm not 
even seeing which of the SA tests are getting hit, because I have my own 
email in my Whitelist.


Is there a way I can check IF From = m...@staticinfo.com AND Return-Path 
!= FROM in a rule?


I guess no matter what, I would have to remove my own email address from 
the Whitelist?  Or can this be checked and override the 
whitelist-shortcircuit somehow?


Thanks.
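
The condition asked about here (From is my own address AND Return-Path differs from it) can be written down as a small Python sketch. It only illustrates the logic and is not a SpamAssassin rule; MY_ADDRESS and the sample message are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

MY_ADDRESS = "me@example.com"  # hypothetical; stands in for the protected address

def looks_self_spoofed(raw_message: str) -> bool:
    """True when From is my address but Return-Path names a different sender."""
    msg = message_from_string(raw_message)
    from_addr = parseaddr(msg.get("From", ""))[1].lower()
    return_path = parseaddr(msg.get("Return-Path", ""))[1].lower()
    # A missing or null Return-Path is deliberately not flagged in this sketch.
    return from_addr == MY_ADDRESS and return_path != "" and return_path != from_addr

sample = (
    "Return-Path: <bounce@mailer.example.net>\n"
    "From: Me <me@example.com>\n"
    "To: me@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(looks_self_spoofed(sample))
```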




Re: phishing by deceptive From address detection

2019-12-18 Thread AJ Weber




The following header is the FROM in the message envelope.

From: =?utf-8?Q?B=CC=B7B=CC=B7?=



I'm not sure what you mean by disguise, and what you expect should have
been done.


I suppose you're right.  I wonder if there's a rule I could develop that 
goes like, [if the descriptive From is entirely different to the name 
(not domain) part of the smtp address - give it some moderate score].


In this particular case, there is nothing close to "BB" in the smtp 
address, which could be an attempt to deceive the user and the spam 
filters.  Not always, I entirely agree, but maybe something I can "play 
with" for my setup.



> The 'B' characters have been overlaid with a clearly visible slash,
> which isn't very clever in a phishing email.

Interesting, Thunderbird does not show any visible slash.  Just "BB" -
though the font looks different.
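
For anyone curious what is actually in that display name, the encoded word from the message can be decoded with Python's standard library. This is just an inspection sketch; whether a mail client draws the overlay depends on its font and rendering.

```python
import unicodedata
from email.header import decode_header

raw = "=?utf-8?Q?B=CC=B7B=CC=B7?="  # display name from the message above

# decode_header returns (bytes, charset) pairs for encoded words.
decoded = "".join(
    part.decode(charset or "ascii") if isinstance(part, bytes) else part
    for part, charset in decode_header(raw)
)

# List every code point so the combining marks become visible.
for ch in decoded:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Dropping combining marks leaves the plain letters a reader would recognise.
plain = "".join(ch for ch in decoded if not unicodedata.combining(ch))
print(plain)
```

Each "B" is followed by a combining overlay character, so a renderer that ignores or poorly supports combining marks would show plain "BB".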




phishing by deceptive From address detection

2019-12-17 Thread AJ Weber
Just looking at a phishing email I received and at first glance I wasn't 
sure how SA (or more-specifically my SA install/configuration) didn't 
score this as spam.


Looks like I have a whitelist setup for alerts from comcast (probably a 
bad idea, but let's address that separately).


The following header is the FROM in the message envelope.

From: =?utf-8?Q?B=CC=B7B=CC=B7?= 

And the email is supposedly one telling me my credit card has been 
compromised, click here to restore access, yada, yada, yada. (I do not 
bank with BB at all.)


I am using KAM and many of the other rules recommended by those on
this list.  Besides the whitelist mistake, would this "disguised From"
be detected by some of the other rulesets?  I thought I read a post or
announcement that this type of disguise was detected pretty well?


Thanks for any help.

-AJ




Re: Spamhaus Technology contributions to SpamAssassin

2019-07-03 Thread AJ Weber

So the (probably obvious to perl folks) fix on RedHat/CentOS is:

yum install perl-List-MoreUtils

All is well after that!

(Posting that in hopes it helps someone else in the future.)

-AJ

On 7/3/2019 8:47 AM, AJ Weber wrote:

Trying to follow the instructions, I got the following error:

spamassassin --lint
Jul  3 08:29:08.089 [26120] warn: plugin: failed to parse plugin 
/etc/mail/spamassassin/SH.pm: Can't locate List/MoreUtils.pm in @INC 
(@INC contains: lib /usr/share/perl5/vendor_perl 
/usr/local/lib64/perl5 /usr/local/share/perl5 
/usr/lib64/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at 
/etc/mail/spamassassin/SH.pm line 32.
Jul  3 08:29:08.089 [26120] warn: BEGIN failed--compilation aborted at 
/etc/mail/spamassassin/SH.pm line 32.


Are there more pre-requisites that I'm not aware of?

Thanks,

AJ


On 7/3/2019 5:43 AM, Riccardo Alfieri wrote:

Hello everyone,

I'm sure that many of you are aware that our datasets are already in 
use with SpamAssassin's default config, but I wanted to reach out and 
let you know that we have developed a SpamAssassin plugin that helps 
you get more out of our DNSBLs.


The plugin works with our Data Query Service (DQS). The DQS provides 
you with additional feeds: Zero Reputation Domain & AuthBL, and it 
also receives updates in 'realtime.' This last point is key, because, 
as you can see in the latest Virus Bulletin report 
(https://www.virusbulletin.com/testing/results/latest/vbspam-email-security), 
DQS catches 42% more spam than our RSYNC service or public mirrors.


Last but not least, the usage terms for the DQS are the same as for 
our public mirrors, meaning that if you already use our public 
mirrors, you can register for a personal DQS key free of charge.


You can find all the needed files here: 
https://github.com/spamhaus/spamassassin-dqs


Have fun with our data, and if there are difficulties in installing 
the plugin, or if you have suggestions, you can drop us a line at 
datafeed-supp...@spamteq.com or post here. I'll try to keep the list 
monitored to deliver as much help as I can.




Re: Spamhaus Technology contributions to SpamAssassin

2019-07-03 Thread AJ Weber

Trying to follow the instructions, I got the following error:

spamassassin --lint
Jul  3 08:29:08.089 [26120] warn: plugin: failed to parse plugin 
/etc/mail/spamassassin/SH.pm: Can't locate List/MoreUtils.pm in @INC 
(@INC contains: lib /usr/share/perl5/vendor_perl /usr/local/lib64/perl5 
/usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/lib64/perl5 
/usr/share/perl5) at /etc/mail/spamassassin/SH.pm line 32.
Jul  3 08:29:08.089 [26120] warn: BEGIN failed--compilation aborted at 
/etc/mail/spamassassin/SH.pm line 32.


Are there more pre-requisites that I'm not aware of?

Thanks,

AJ


On 7/3/2019 5:43 AM, Riccardo Alfieri wrote:

Hello everyone,

I'm sure that many of you are aware that our datasets are already in 
use with SpamAssassin's default config, but I wanted to reach out and 
let you know that we have developed a SpamAssassin plugin that helps 
you get more out of our DNSBLs.


The plugin works with our Data Query Service (DQS). The DQS provides 
you with additional feeds: Zero Reputation Domain & AuthBL, and it 
also receives updates in 'realtime.' This last point is key, because, 
as you can see in the latest Virus Bulletin report 
(https://www.virusbulletin.com/testing/results/latest/vbspam-email-security), 
DQS catches 42% more spam than our RSYNC service or public mirrors.


Last but not least, the usage terms for the DQS are the same as for 
our public mirrors, meaning that if you already use our public 
mirrors, you can register for a personal DQS key free of charge.


You can find all the needed files here: 
https://github.com/spamhaus/spamassassin-dqs


Have fun with our data, and if there are difficulties in installing 
the plugin, or if you have suggestions, you can drop us a line at 
datafeed-supp...@spamteq.com or post here. I'll try to keep the list 
monitored to deliver as much help as I can.




Re: FSL_BULK_SIG tweak?

2018-03-12 Thread AJ Weber



That's it.  exists:List-Unsubscribe means does the email have a
List-Unsubscribe header.




Thank you.


FSL_BULK_SIG tweak?

2018-03-12 Thread AJ Weber

I started down the rabbit hole looking to see how this rule works...

Besides checking if one of the bulk mail rules hit (like DCC), it uses 
"72_active.cf:header   __FSL_HAS_LIST_UNSUB exists:List-Unsubscribe" (It 
negates that test.)


That seems logical, but how do I find the List-Unsubscribe logic/rule?  
I tried grep'ing the places I normally would expect, but didn't find any 
more information (KAM also uses this, I found).


The issue is that the email received IS a bulk email, but there is an 
"opt-out" link in the bottom of the html body.  I would think this 
qualifies as a way to Unsubscribe and thus this rule is wrong (in this 
case).


So I was hoping to review this "List-Unsubscribe" rule and possibly make 
an enhancement suggestion.  (If this rule is purely looking at headers 
and not at whether there is a valid unsubscribe link in the body, that's 
a different "problem".  But I don't know.)


Thanks if you can point me in the right direction!

-AJ



Re: From name containing a spoofed email address

2018-01-19 Thread AJ Weber

False Positive


On 1/19/2018 2:55 PM, Jeffs Chips wrote:
I am trying to follow this interesting thread - can someone tell me 
what "FP" means?


__
 "Perhaps sleep did not evolve. Perhaps it was the thing from which 
wakefulness emerged.” -- Matthew Walker, Sleep Scientist


On Jan 19, 2018 12:02 AM, "Pedro David Marco" wrote:




>!~ matches are dangerous because they match by default if you
>don't anticipate all the legitimate formats. The above will FP on a
>simple email address. It could be rewritten as a __FROM_DOMAINS_MATCH
>and used in a meta rule.

fool me, you are right, RW, thanks...

>It's also not a complete solution as it doesn't handle third-level
>domains correctly e.g. in
>
>"supp...@paypal.co.uk "
>
>
>"co" will match "co". This is why it's probably best to do it in perl
>where the tlds from 20_aux_tlds.cf can be used.

you are right as well...  but this problem is hard to solve because
subdomains can be almost unlimited
and even worse... domains can be different but valid, outlook.com
and hotmail.com for example.









Re: check utf-8 subjects/from?

2017-12-14 Thread AJ Weber

On 12/13/2017 6:58 PM, Reindl Harald wrote:

> There seems to be a large disparity between your (10%) result and my
> (2%) result.  Can you explain how that could be?

surely, from the moment you have not only English messages it looks
completely different and don't forget that the corpus where i run the
quick grep is only a very low subset of real mailflow for training as
ham when needed



I'm not sure I understand what you are saying now.

Are you saying you ran a flawed/inaccurate test but sent the results 
anyway in order to make a point that no one asked you about?


Or are you saying that every mail environment is (necessarily) 
different, and whatever your opinion and results in your local 
environment are, they may not be applicable to another environment in 
another country, so you probably should not make your assumptions and 
opinions sound like facts?


In my OPINION, the aforementioned rule that I will test is likely NOT a 
good candidate for many environments - but I never promoted it as such 
in the first place.


Apologies to all whose inboxes were cluttered with this tangent.


Re: check utf-8 subjects/from?

2017-12-13 Thread AJ Weber

On 12/13/2017 5:18 PM, Reindl Harald wrote:


> my statements are based on a decade of experience with a lot of users
> from all over the world, on your personal server you can even reject
> anything not whitelisted, from the moment on when other people's
> mailflow is affected it's no longer that easy

It's true.  At first I noticed a pattern and decided to look into how I
could write a rule, probably starting with a low score, to test its
effectiveness.


However, I ran your test to determine how many emails it would actually 
affect.  In a folder of just over 5100 emails, there would be < 2% 
false-positives.  That's actually better than I expected!  If you 
offered me a rule that only anticipated 2% false positives to try, I 
would say it was worth it for sure!





> this would be a rule with a majority of false positives
> you really should also look at your HAM

I didn't see the basis for your "majority" of false positives.  Did you
run your test against a spam folder as well?  What were the results there?


> cat *.eml | grep UTF-8 | grep -i subject | wc -l
> 2150
>
> that tells me that roughly 10% of all ham mails would hit

There seems to be a large disparity between your (10%) result and my
(2%) result.  Can you explain how that could be?


Thank you again!
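
One possible contributor to differing percentages is what gets counted. The quoted pipeline counts lines that contain both "UTF-8" and "subject" anywhere in the files, not messages whose Subject header is encoded. A per-message count could look like the Python sketch below; the sample messages are invented, and this is not a claim about how either figure in this thread was produced.

```python
import re
from email import message_from_string

# An RFC 2047 encoded word that declares UTF-8, e.g. "=?UTF-8?B?...?=".
ENCODED_UTF8 = re.compile(r"=\?utf-8\?[bq]\?", re.IGNORECASE)

def subject_is_utf8_encoded(raw_message: str) -> bool:
    """Look only at the Subject header, not at every line of the file."""
    msg = message_from_string(raw_message)
    return bool(ENCODED_UTF8.search(str(msg.get("Subject", ""))))

def hit_rate(raw_messages: list[str]) -> float:
    """Fraction of messages (not lines) whose Subject is a UTF-8 encoded word."""
    if not raw_messages:
        return 0.0
    hits = sum(subject_is_utf8_encoded(m) for m in raw_messages)
    return hits / len(raw_messages)

# Invented samples: only the first has an encoded Subject, although all three
# contain the string "UTF-8" somewhere.
samples = [
    "Subject: =?UTF-8?B?SGVsbG8=?=\n\nplain body\n",
    "Subject: hello\n\nbody mentioning UTF-8 in a subject line\n",
    "Subject: hi\nContent-Type: text/plain; charset=UTF-8\n\nbody\n",
]
print(hit_rate(samples))
```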


Re: check utf-8 subjects/from?

2017-12-13 Thread AJ Weber
Would you be so kind as to tell me how you hacked into my mail server to 
determine the basis for your statements?




On 12/13/2017 4:52 PM, Reindl Harald wrote:



Am 13.12.2017 um 19:44 schrieb AJ Weber:
Is there an easy way to check if the Subject or From is UTF-8 -- or 
non-ASCII -- char set?


I see in some of my recent spam, either the Subject or the From 
(sometimes both) starts with "=?UTF-8?" (in these cases the rest is 
Base64 encoded, but I don't want to qualify on that).


If I check a header with a "header ... =~" regex rule, is it the raw 
text that I will check, or is it the decoded characters I will be 
checking against?


If it's the raw text, I can probably just look for that prefix to 
indicate the UTF-8 encoding.


I do get some legitimate emails with encoded chars and emojis, 
etc...but I think I'd like a rule to support it being SPAM in general


based on what?

this would be a rule with a majority of false positives
you really should also look at your HAM

cat *.eml | grep UTF-8 | grep -i subject | wc -l
2150

that tells me that roughly 10% of all ham mails would hit




check utf-8 subjects/from?

2017-12-13 Thread AJ Weber
Is there an easy way to check if the Subject or From is UTF-8 -- or 
non-ASCII -- char set?


I see in some of my recent spam, either the Subject or the From 
(sometimes both) starts with "=?UTF-8?" (in these cases the rest is 
Base64 encoded, but I don't want to qualify on that).


If I check a header with a "header ... =~" regex rule, is it the raw 
text that I will check, or is it the decoded characters I will be 
checking against?


If it's the raw text, I can probably just look for that prefix to 
indicate the UTF-8 encoding.


I do get some legitimate emails with encoded chars and emojis, etc...but 
I think I'd like a rule to support it being SPAM in general.


Thanks again,
AJ
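
The raw-versus-decoded question matters because the "=?UTF-8?" prefix only exists in the undecoded header text. As far as I know, SpamAssassin header rules see the MIME-decoded value by default, and the ":raw" modifier on the header name (for example Subject:raw) exposes the undecoded text. The Python sketch below only illustrates the difference between the two forms; the Subject value is a made-up example.

```python
import re
from email.header import decode_header, make_header

raw_subject = "=?UTF-8?B?SGVsbG8gd29ybGQ=?="  # hypothetical encoded Subject

# What a reader (or a test on the decoded header) would see.
decoded_subject = str(make_header(decode_header(raw_subject)))

PREFIX = re.compile(r"^=\?utf-8\?", re.IGNORECASE)

print(bool(PREFIX.search(raw_subject)))      # prefix present in the raw text
print(bool(PREFIX.search(decoded_subject)))  # prefix gone after decoding
print(decoded_subject)
```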


help with phishing email?

2017-12-08 Thread AJ Weber

I'm trying to decide the best way to detect something like this.

https://pastebin.com/hCX9MWNg

Looking at the raw headers and body it's pretty easy to tell this is a 
spoof, but when it shows-up in an inbox, it looks pretty good.


Something specific to Amazon (where this is purported to come from) 
would be to check if their domain is in the From and Reply-To and at 
least score that relatively high if it's not correct - but compared to 
what?  Maybe if From text contains amazon/i and from-address does not 
end with amazon.com (for me in the US at least)?


That feels forced.  Does anyone have any suggestions to help me out on 
this fine Friday?


Thanks,

AJ
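
The comparison floated above (display name mentions the brand, but the address is not at the brand's domain) can be sketched in Python as follows. The function name, the sample headers and the domain list are illustrative assumptions, and a real rule would need a maintained list of legitimate sending domains.

```python
from email.utils import parseaddr

def brand_mismatch(from_header: str, brand: str, domains: tuple[str, ...]) -> bool:
    """True when the display name mentions the brand but the address
    domain is neither one of the given domains nor a subdomain of one."""
    display_name, address = parseaddr(from_header)
    if brand.lower() not in display_name.lower():
        return False
    domain = address.rpartition("@")[2].lower()
    return not any(domain == d or domain.endswith("." + d) for d in domains)

# Hypothetical headers, for illustration only.
print(brand_mismatch("Amazon Support <account@secure-login.example.net>",
                     "amazon", ("amazon.com",)))
print(brand_mismatch("Amazon <no-reply@amazon.com>", "amazon", ("amazon.com",)))
```

Matching on the exact domain or a dot-prefixed suffix avoids treating something like amazon.com.example.net as legitimate.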




Re: NOTE: Warning to Abusers of Update Servers

2017-11-21 Thread AJ Weber


The major offenders are sa-update 3.3.x and generic curl clients based 
on the user agent in the logs running from every minute to every 15 
minutes and blindly pulling down the same rulesets over and over.


My "vote" counts for very, very little, but since these clients already 
have the latest rules (multiple times, apparently), why not just block them?


If they are actually monitoring their update scripts at all (seems 
doubtful), it should get their heads out of the sand (was going to use a 
similar metaphor but wanted to be nice).  They will probably look for a 
resolution and find these latest posts.


If they're not monitoring their updates on any regular basis, it doesn't 
seem like they care if they get them anyway.


Re: URIBL_BLOCKED - which one?

2017-10-13 Thread AJ Weber

On 10/13/2017 9:23 AM, Reindl Harald wrote:
next time make a notice in your first post that you don't have a
serious mailserver but "maybe because I have a DHCP address from a
major ISP and that's a problem"


OK, I can do that, but there isn't anything in the troubleshooting for 
DNSBL regarding how your IP address is assigned.  It just recommends 
that you use your own, caching DNS server.  If that is important, maybe 
it should be mentioned in the docs?



Am 13.10.2017 um 15:20 schrieb AJ Weber:

I put the following in my local.cf.  This does not work?

dns_available yes
# - REDIRECT DNS LOOKUPS TO LOCAL "unbound" service to avoid RBL bans
dns_server 127.0.0.1

> then your machine is *not* using 127.0.0.1 as the only DNS server

So does this "dns_server" directive in my local.cf file work as
expected?  If so, my SA *is* using 127.0.0.1 as the only DNS server.


Re: URIBL_BLOCKED - which one?

2017-10-13 Thread AJ Weber

I put the following in my local.cf.  This does not work?

dns_available yes
# - REDIRECT DNS LOOKUPS TO LOCAL "unbound" service to avoid RBL bans
dns_server 127.0.0.1



On 10/13/2017 8:48 AM, Reindl Harald wrote:



Am 13.10.2017 um 14:40 schrieb AJ Weber:
I guess this qualifies as a newbie question...I've been running SA 
for a while, but haven't really dug into some of the workings...


I occasionally see the URIBL_BLOCKED notice in some of my spam 
results. I read the related web page, and started using unbound as a 
local DNS, but I'm still seeing this


then your machine is *not* using 127.0.0.1 as the only DNS server




Re: URIBL_BLOCKED - which one?

2017-10-13 Thread AJ Weber

On 10/13/2017 8:57 AM, David Jones wrote:

On 10/13/2017 07:47 AM, Markus Clardy wrote:
URIBL_BLOCKED is in reference to multi.uribl.com.

--
  - Markus


To disable queries to multi.uribl.com, put this in your local.cf or 
equivalent in /etc/mail/spamassassin:


score URIBL_BLACK 0
score URIBL_GREY 0
score URIBL_RED 0

Based on my mail flow and other RBLs, I didn't miss this RBL when I 
disabled it years ago.  It may be valuable to some but Spamhaus and 
IVM do most of the heavy lifting on my mail filters.


@Markus, @David: Thank you both.  I started digging into the .cf files 
and did find that reference to multi.uribl.com.


Strange that they are denying my queries.  Maybe because I have a DHCP 
address from a major ISP and that's a problem?  I don't really 
understand how they determine who is querying their RBLs.  I thought 
running unbound locally would help mitigate that problem, but I guess not.


Thanks again.


URIBL_BLOCKED - which one?

2017-10-13 Thread AJ Weber
I guess this qualifies as a newbie question...I've been running SA for a 
while, but haven't really dug into some of the workings...


I occasionally see the URIBL_BLOCKED notice in some of my spam results.  
I read the related web page, and started using unbound as a local DNS, 
but I'm still seeing this.


Since I have a number of RBL's setup, is there a way to determine which 
of the RBLs blocked my query?  Maybe I have one configured that I need 
to "license" or subscribe-to in some way?


Thanks for the troubleshooting assistance.

-AJ



Re: improving detection to cloudmark-like levels?

2017-10-12 Thread AJ Weber

On 10/12/2017 11:33 AM, Ian Zimmerman wrote:

> I don't know how you got the supposition about pyzor.
>
> pyzor is completely independent of Cloudmark (unlike razor) and AFAIK
> pyzor scores are based on participating users' reports and nothing
> else.

Sorry.  It is razor2 that is (or was - according to the website)
supported by Cloudmark.


Re: improving detection to cloudmark-like levels?

2017-10-12 Thread AJ Weber

On 10/12/2017 10:07 AM, Kevin A. McGrail wrote:

On 10/12/2017 9:25 AM, AJ Weber wrote:
I'm open to new rules, plug-ins, etc. Spam volume is only getting 
worse, and these spammers are getting more creative. 


> Hi AJ,
>
> I have to say that 3.3.0 is pretty old.  I'd look to run a newer
> version, invest some time into researching a few RBLs and consider
> adding my KAM.cf file.

OK, I'll look into the update procs.  I don't see an updated package
available via yum (CentOS), but maybe I'm not looking in the right place.


I do use an RBL or two, I think "bl.mailspike.net", but I haven't 
figured out how to test that they're working correctly.


Thanks for the quick reply.


improving detection to cloudmark-like levels?

2017-10-12 Thread AJ Weber

OK, please, this is meant with all good intentions...

I have been running SA 3.3.0 on my server for years.  Using the standard 
rule updates channel and "sought.rules.yerp.org".  (I don't see those 
updated too often, maybe I need to check on that update process.)  Also 
enabled:  DCC, Pyzor and Razor2.


This does a very good job as currently configured.  However, I also have 
Cloudmark's "DesktopOne" client-product installed for years.  They are 
discontinuing that product on Dec 1.  I certainly would see the 
cloudmark-product catch _additional_ spam on a daily basis (very 
accurately).


So I'm sure they have some "secret sauce" and I'm not asking for that to 
be revealed, but since pyzor is supposedly using their database, I'm 
just trying to figure out if there's a way to get my SA filter to 
improve even further and close the gap?


So it's a very open-ended ask, but I thought maybe I could start a 
conversation and see if there are any ideas out there.  I'm open to new 
rules, plug-ins, etc.  Spam volume is only getting worse, and these 
spammers are getting more creative.


Thanks in advance,

AJ



Re: rule to test body length?

2012-01-08 Thread AJ Weber


> Please don't top-post.

Sorry.  Even though I subscribed, and sent the confirmation email, I still
don't get any of the messages in my email, so I'm posting via the Old
Nabble web form.  That doesn't allow me to automate indenting/quoting
previous messages, so I will manually put gt's in front of all the lines if
you want.

> Body tests are run per paragraph, so you would need one of them to
> have 100 chars.

Wow.  I would've thought I would have run across this info in all the
searching I've done about rules and custom rules.  Good to know, thanks.

> Also they are run on just the text that the reader
> would see, if that matters to you. If you are intending to give this a
> significant score, then it seems a bit reckless to me. Do you never
> receive terse emails?

I sometimes receive terse emails, but very rarely to the accounts I'm trying
to protect with SA.  Since no spam filter is 100%, this just seems to be a
rule that I could use, with an appropriate score.

> If you are new to SA I would suggest you start with making sure that
> Bayes is properly trained, and you have the infrastructure to
> keep it trained without much effort. Razor, DCC etc. are fairly minor
> components compared to BAYES.

I can train Bayes, but keeping it trained might be a bit of effort for the
install size I'm dealing with (small).  Since this is a combination of work-
and non-work mailboxes, the breadth of email types that the users would
consider ham is probably not going to make Bayes training very accurate, but
I would love to be wrong.

Thanks for the reply.
-- 
View this message in context: 
http://old.nabble.com/rule-to-test-%22body%22-length--tp33092865p33104550.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: rule to test body length?

2012-01-08 Thread AJ Weber



John Hardin wrote:
 
> The thread subject is Short body rules on 11/25/2011

Thanks for the pointer.  Using the Old Nabble website, there are ZERO
threads/emails archived for 11/25/11.  :(

When I get some time, I'll see where the other archives are for this list
and search there.  Thanks again.



rule to test body length?

2012-01-06 Thread AJ Weber

Is there a way to check if the body of an email is less than some threshold
(length of chars)?  I'm seeing some spam slip through because it's purposely
too short to hit a lot of rules, and too short for DCC and other networked
systems to get a fingerprint on.

For example:
Any body where len < 50, score...

Would it be something like /.{1,50}/ ???

Thanks,
-AJ



Re: rule to test body length?

2012-01-06 Thread AJ Weber

Didn't find it, but I'll keep looking.  While searching, I noticed you had
some updated chickenpox rules, but I didn't see them in your sandbox (at
least from the link I looked at).  I know this is a tangent, but could you
direct me to that rule-set?  I have the one from the SA wiki, but it doesn't
seem enough.

Thanks for the reply,
AJ


John Hardin wrote:
 
 On Fri, 6 Jan 2012, AJ Weber wrote:
 
 Is there a way to check if the body of an email is less than some threshold
 (length of chars)?
 
 Check the archives. This came up a month or two ago and I suggested a rule 
 set to detect a short body. Karsten then suggested a minor refinement.
 
 You can't do it in a single rule.
 
 -- 
   John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
   jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
 ---
Windows Genuine Advantage (WGA) means that now you use your
computer at the sufferance of Microsoft Corporation. They can
kill it remotely without your consent at any time for any reason;
it also shuts down in sympathy when the servers at Microsoft crash.
 ---
   11 days until Benjamin Franklin's 306th Birthday
 
 

-- 
View this message in context: 
http://old.nabble.com/rule-to-test-%22body%22-length--tp33092865p33093677.html



Re: rule to test body length?

2012-01-06 Thread AJ Weber

BTW: To expound upon my previous guess at matching short messages, what's
wrong with:
body MY_TOO_SHORT /^.{1,100}$/

(By which I mean to check for a message where the length is < 100 chars)
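
A likely problem with the anchored version: SpamAssassin runs "body" rules against the rendered body one paragraph at a time, with the Subject header prepended as the first paragraph. The anchors therefore bound a single paragraph, not the whole message, and the rule fires on any mail that has one short paragraph (or a short subject). The sketch below is only a rough Python model of that behaviour, not SpamAssassin's actual code; the paragraph splitting and the names paragraphs, rule_hits and total_length are assumptions made for illustration.

```python
import re

# Same pattern as MY_TOO_SHORT above.
TOO_SHORT = re.compile(r"^.{1,100}$")

def paragraphs(subject, body):
    """Rough model: subject first, then blank-line-separated paragraphs,
    with line breaks inside each paragraph collapsed to single spaces."""
    chunks = [subject] + re.split(r"\n\s*\n", body)
    return [" ".join(c.split()) for c in chunks if c.strip()]

def rule_hits(subject, body):
    """Model of a body rule: it fires if any single paragraph matches."""
    return any(TOO_SHORT.search(p) for p in paragraphs(subject, body))

def total_length(subject, body):
    """Total characters across all modelled paragraphs."""
    return sum(len(p) for p in paragraphs(subject, body))

# A long, ordinary message: ten long paragraphs and a short subject.
subject = "Meeting notes"
long_ham = "\n\n".join(["word " * 60] * 10)

print(total_length(subject, long_ham) > 100)  # the message is far from short
print(rule_hits(subject, long_ham))           # yet the rule fires on the subject
```

Under this model the rule hits nearly all mail, which fits the advice in the thread that a short body cannot be detected with a single rule.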



-- 
View this message in context: 
http://old.nabble.com/rule-to-test-%22body%22-length--tp33092865p33093814.html



Re: razor2 and cloudmark?

2012-01-05 Thread AJ Weber

Yes, I still have other rules enabled.  I have found the Cloudmark product to
be extremely accurate, and really my question is specific to whether razor
== cloudmark or to what extent they are related and how, so I can better
understand the results I'm seeing.

Thanks for the reply,
AJ


Martin Hepworth-2 wrote:
 
 Of course, razor2 checks only provide part of the score to SA. Have you
 checked the other rules fired on that email, and that nothing else is
 marking the score down?
 
 Martin
 
 
 -- 
 -- 
 Martin Hepworth
 Oxford, UK
 
 

-- 
View this message in context: 
http://old.nabble.com/razor2-and-cloudmark--tp33082922p33086148.html



Re: razor2 and cloudmark?

2012-01-05 Thread AJ Weber

OK, fair enough, and your theory seems very valid.  I wish they (Cloudmark)
made an SA plugin for us SOHO users who can't afford (and don't need) a full
Cloudmark Authority server/setup.  I'd pay a license fee if it were
reasonable and it performed anywhere near as accurately as their Windows
desktop product.

Guess I'll go add in DCC as well (Pyzor is already included).

Thanks for the response.

-AJ


Kevin A. McGrail wrote:
 
 On 1/5/2012 8:49 AM, AJ Weber wrote:
 Yes, I still have other rules enabled.  I have found the Cloudmark product
 to be extremely accurate, and really my question is specific to whether
 razor == cloudmark or to what extent they are related and how, so I can
 better understand the results I'm seeing.
 
 From my perspective, Cloudmark is an overall anti-spam solution, whereas
 Razor is a single score in the anti-spam framework.
 
 Cloudmark likely uses their /Cloudmark Collaborative Security Network/ 
 which is what Razor queries as well.  However, for Cloudmark, the CCSN 
 query is just one part of their framework for testing messages if I had 
 to make a slightly educated guess.
 
 The first step is getting SA's framework built.  Then you start looking 
 at tweaking and adding things that improve the framework.  Cloudmark has 
 likely done this for you so comparing Cloudmark to Razor is apples to 
 oranges.  A framework can't be compared to one test.
 
 There are lots of things that can use the framework of SA from 
 content-based heuristic tests to pathway analysis via DNSBLs to even 
 things like OCR checks on images.
 
 Hope this helps.
 
 Regards,
 KAM
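
To make the "one test versus a framework" point concrete, here is a minimal Python sketch of how a score-based framework reaches a verdict: each rule that hits contributes its score, and the message is tagged only when the sum reaches the threshold. The 5.0 threshold is SpamAssassin's default required score; the rule names are borrowed from real SA rules, but the score values and the classify helper are hypothetical, chosen only for illustration.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default threshold

# Hypothetical scores, not the values shipped with any SA release.
EXAMPLE_SCORES = {
    "RAZOR2_CHECK": 1.7,
    "URIBL_BLACK": 1.7,
    "BAYES_99": 3.5,
}

def classify(hits, scores, required=REQUIRED_SCORE):
    """Sum the scores of the rules that hit and compare to the threshold."""
    total = sum(scores[name] for name in hits)
    return total, total >= required

# A Razor hit on its own stays below the threshold...
print(classify(["RAZOR2_CHECK"], EXAMPLE_SCORES)[1])
# ...while the same hit combined with other evidence crosses it.
print(classify(["RAZOR2_CHECK", "URIBL_BLACK", "BAYES_99"], EXAMPLE_SCORES)[1])
```

With values like these, a message Razor recognises can still be delivered if nothing else fires, which is one possible reading of the behaviour described at the start of the thread.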
 
 
 
 

-- 
View this message in context: 
http://old.nabble.com/razor2-and-cloudmark--tp33082922p33086300.html



razor2 and cloudmark?

2012-01-04 Thread AJ Weber

I am testing the Razor2 plugin and am surprised that some obvious spam is
getting through.  The reason I'm most surprised is that the SA install
(3.3.1) seems to be checking the message with Razor2 and passing it. 
However, I have Cloudmark Desktop One running on my PC, and when the
message gets to my PC, that client flags it as spam immediately.

Can someone tell me the relationship between Razor2 and Cloudmark?  It
appears to be somewhat supported by the same people.  So is the Razor data
purposely not refreshed as often in order to sell Cloudmark Authority
licenses?

I was sort of thinking it all used the same reference-db, but maybe not. 
(If they DO, then I have some follow-on questions about how to debug why the
Razor2 plugin is not flagging a message that Cloudmark One is flagging only
milliseconds later.)

Thanks for any info on the above and any troubleshooting techniques that you
can share.

-AJ

-- 
View this message in context: 
http://old.nabble.com/razor2-and-cloudmark--tp33082922p33082922.html