Re: Hive Mind: postfix prescreen and SA ruleqa

2019-04-16 Thread RW
On Tue, 16 Apr 2019 15:16:59 +0300
Jari Fredriksson wrote:

> John Hardin wrote on 15.4.2019 1:33:
> > On Sun, 14 Apr 2019, Jari Fredriksson wrote:
> >   
> >> Now, I am part of RuleQA. Should I accept everything and pass it
> >> to SpamAssassin and to my corpus or not?  
> > 
> > I would suggest yes, you should accept everything that reaches your
> > spamtrap addresses and include it in your corpora. Don't worry about
> > that, worry about whether or not the messages get correctly
> > classified.  
> 
> Thanks. I might test dropping the postscreen scanning next
> weekend.

Before you do that, I would suggest you read what Henrik K wrote:

  "There are already major spamtrappers etc contributing to ruleqa, I
  think most of the "easy dialup" spam is seen there too.  Just try to
  look for the hard to catch spam not ending up in ham corpus."

IMO the corpus should contain a bit of everything, but ideally it should
be dominated by the spam that would reach SA on a server following
best practice. A corpus generated from unfiltered spamtraps is very
heavily biased in the wrong direction.

Also, if someone is processing mail from an MTA they don't control, it
doesn't mean that there's no upstream MTA filtering. My experience is
that lighter (low false-positive) spam filtering is something you have
to pay extra for these days.



Re: Hive Mind: postfix prescreen and SA ruleqa

2019-04-16 Thread Jari Fredriksson

John Hardin wrote on 15.4.2019 1:33:

> On Sun, 14 Apr 2019, Jari Fredriksson wrote:
>
>> Now, I am part of RuleQA. Should I accept everything and pass it to
>> SpamAssassin and to my corpus or not?
>
> I would suggest yes, you should accept everything that reaches your
> spamtrap addresses and include it in your corpora. Don't worry about
> that, worry about whether or not the messages get correctly
> classified.


Thanks. I might test dropping the postscreen scanning next weekend.
The spam volume has been bearable lately, and sorting the ham and spam
into dedicated folders is not hard labor. I am subscribed to tons of
domestic and foreign ham sources just because of this hobby.


--
ja...@iki.fi


Re: Hive Mind: postfix prescreen and SA ruleqa

2019-04-16 Thread Jari Fredriksson

David Jones wrote on 14.4.2019 23:31:

> Once you get this type of platform set up, it can be used for other
> spam fighting techniques on the primary mail filters like:
>
> - train your shared redis Bayes DB with the ham and spam folder

I have a similar system, including Redis, SpamCop, Pyzor and Razor2
reporting, but the following is not (yet :D) implemented.

> - fail2ban - run a custom script on the secondary server to block IPs
>   on the primary filters
> - swatch - run custom scripts when certain rules are hit:
>   - add entries to your own private RBL to catch zero hour spam
>   - auto release certain messages from quarantine as an attachment
>   - etc...
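
A minimal sketch of the private RBL item in the list above, assuming the
usual DNSBL naming convention (IPv4 octets reversed and prepended to the
list's zone). The zone name rbl.example.com and both function names are
hypothetical and not taken from the thread; how the names are loaded into
a DNS server is left out.

```python
import ipaddress


def dnsbl_query_name(ip, zone="rbl.example.com"):
    """Build the DNS name a DNSBL client would look up for an IPv4 address.

    The zone is a placeholder; substitute the zone of your own private list.
    """
    # IPv4Address raises ValueError on malformed input, so bad log lines
    # never turn into bogus list entries.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_names(ips, zone="rbl.example.com"):
    """Return de-duplicated query names for a batch of offending IPs."""
    seen = set()
    names = []
    for ip in ips:
        name = dnsbl_query_name(ip, zone)
        if name not in seen:  # keep first occurrence, preserve input order
            seen.add(name)
            names.append(name)
    return names
```

For example, dnsbl_query_name("192.0.2.10") gives
"10.2.0.192.rbl.example.com", which is the name a mail filter would query
to see whether that address is listed.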


Thanks for your post!

--
ja...@iki.fi


Re: Hive Mind: postfix prescreen and SA ruleqa

2019-04-16 Thread Jari Fredriksson

Bill Cole wrote on 14.4.2019 21:13:

> On 14 Apr 2019, at 4:03, Jari Fredriksson wrote:
>
>> How can I best support SpamAssassin besides having a mass check
>> automation and mirrors for the sa-update?
>
> Those are both large contributions. Thank you for that support.
>
> The obvious repository of things we need fixed is the Bugzilla. There
> are a lot of open bugs, most of which require some Perl prowess and
> substantial time to fix, because we've done pretty well on attacking
> simple bugs as they are reported. There are gaps and flaws in the
> documentation, both on the Wiki and internal to the code, where many
> authors have used only standard commenting instead of POD, effectively
> hiding documentation. Rule development is also potentially quite
> helpful, if you are good at it and have the patience to deal with the
> testing process (e.g. John Hardin).


Thank you for your post. I am a professional software developer,
currently working as a Senior Java Developer in a large multinational.
I know "tons of" programming languages, but sadly Perl and regular
expressions are not among them. I can READ Perl, but I cannot get far
enough through the Perl book I have to actually be productive in a Perl
software project. The language just feels too backwards to me to bother
learning.


So I will stick to my day job and ponder other options for this.

--
ja...@iki.fi