Re: mass check tips and tricks - need advice

2013-02-17 Thread Marc Perkel
OK - I'm getting mass checking set up and working. I'm still in the 
testing phase.


Right now the selection of spam and ham is automated, not manual. Is that
a problem?


I'm only including email streams that I'm sure of. The spam comes from
sources that are on multiple blacklists, includes URIBL-listed links,
commits other sins that only spammers commit, and gets SA scores over 15.
The whitelist is from 100% trusted sources. Eventually I hope to include
some hand-sorted messages from the middle, but for now these are extreme
ham and spam.


Looks like it takes me 70 minutes to process 46k messages. I'll probably 
process 100k messages nightly and they will all be fresh.
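
As a rough check on the nightly window, the measured run can be scaled to 100k messages. This is only a sketch and assumes throughput scales linearly with message count, which may not hold once network tests and DNS lookups are involved; the function name is made up for illustration.

```python
def minutes_for(messages, sample_messages=46_000, sample_minutes=70):
    """Scale the measured run (46k messages in 70 minutes) linearly."""
    rate_per_minute = sample_messages / sample_minutes
    return messages / rate_per_minute


rate = 46_000 / 70
print(f"{rate:.0f} messages/minute")                            # 657 messages/minute
print(f"{minutes_for(100_000):.0f} minutes for 100k messages")  # 152 minutes
```

So a 100k run would take roughly two and a half hours at the current rate.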


Right now I'm going through the ham and spam to verify that it's accurate
and doesn't contain anything that shouldn't be there. I'm not reading
every message, but so far I haven't found any errors.


Looking for advice at this point about anything I should be doing that 
I'm not, or any useful feedback.



--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400



Re: DKIM scoring with spamassassin

2013-02-17 Thread Patrick Ben Koetter
Quanah,

* Quanah Gibson-Mount :
> --On Friday, February 15, 2013 5:01 PM -0800 John Hardin
>  wrote:
> 
> >On Fri, 15 Feb 2013, Quanah Gibson-Mount wrote:
> >
> >>Does anyone tweak the DKIM scores given by SA?  There are plenty of
> >>scenarios  where DKIM has failed, yet SA does not give the email a
> >>particularly high  spam mark.  3 example test cases below.  I guess I
> >>was expecting SA would  score DKIM failures more aggressively if there
> >>are problems with the signing:
> >
> >DKIM and SPF are anti-forgery tools, not anti-spam tools.
> >
> >If you take a DKIM-signed email that is whitelisted because of
> >whitelist_auth and make a change that invalidates the signature, does it
> >still get whitelisted? If not, then SA is doing all that it can
> >reasonably be expected to do with the invalid signature.
> >
> >DKIM or SPF pass or fail *by itself* is not useful as a spam sign. Taken
> >together with other factors (such as DKIM invalid + claims to be from
> >Wells Fargo) it's useful.
> 
> Ok, thanks.  If any of our users ask, this is a good summary. :)

If you want your spam filters to benefit from DKIM, you need to build
reputation. You need to track whether or not a domain uses DKIM and what the
average spam score of that sender domain is.

The OpenDKIM reputation project has introduced a local reputation database and
uses SpamAssassin to get the spam score. You might want to look into the
project if you want to use DKIM (as one of many methods) to filter spam.
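
To make the idea of a local reputation table concrete, here is a minimal sketch of per-domain bookkeeping: how often a sender domain's mail carries a valid DKIM signature, and its average spam score. It is not the OpenDKIM reputation project's schema or code; the class, function, and field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DomainStats:
    messages: int = 0        # messages seen from this sender domain
    dkim_signed: int = 0     # how many carried a valid DKIM signature
    score_total: float = 0.0 # sum of spam scores, for the running average

    @property
    def average_score(self):
        return self.score_total / self.messages if self.messages else None

    @property
    def signed_ratio(self):
        return self.dkim_signed / self.messages if self.messages else None


def record(table, domain, dkim_valid, spam_score):
    """Update the in-memory table with one scored message."""
    stats = table.setdefault(domain, DomainStats())
    stats.messages += 1
    stats.dkim_signed += 1 if dkim_valid else 0
    stats.score_total += spam_score
    return stats


table = {}
record(table, "example.com", True, 0.5)
record(table, "example.com", True, 1.5)
record(table, "example.com", False, 4.0)
print(table["example.com"].average_score)  # 2.0
```

A real deployment would persist the table and age out old data; the point is only that the signature result becomes useful once it is tied to a domain's history.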

p@rick

-- 
[*] sys4 AG
 
http://sys4.de, +49 (89) 30 90 46 64
Franziskanerstraße 15, 81669 München
 
Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263
Vorstand: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
Aufsichtsratsvorsitzender: Joerg Heidrich
 


Reporting whole-mailbox with spamc

2013-02-17 Thread Dan Mahoney
Hey there all,

I recently switched from using alpine exclusively to using imap on my iDevices.
I've converted alpine to read my mailboxes via imap instead of the local file
system.

In alpine, I was able to pipe a message to spamc -d servername -C report, which 
would feed bayes, as well as reporting to pyzor, razor, and the like.  While I 
can still do this, I'd like to be able to report/learn from any device I'm on.

Since I'm doing imap more, I've decided to go the route of having a "learn 
spam" and "learn ham" folder, but after checking the wiki, I don't see a good 
way of going about what I need.

1) While I've found that spamassassin and sa-learn can take a mailbox as an 
argument, I haven't found a good way to do this with spamc.  Also, I'd like 
the mere presence of a message in the folder to be the sign of whether or not 
it's been processed.

2) While I could take a tool server-side and "mv" the mailbox and then split 
it, I don't know how imap would react to this.

I *think* the right answer is to connect to the mailbox with server-side tools 
that actually implement the correct locks (so as to be imap-compatible, and so 
that they don't process an incomplete message), and delete messages as they're 
piped to spamc -C report.  The thing is, I haven't found any tools that do 
this, and while it's probably a trivial amount of work to implement, I'd rather 
not reinvent the wheel.

Noting as well that my plan is to make this a system-wide thing once this works 
for me, via cron (once every half-hour or so), has anyone else come up with a 
good answer to this problem?
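
For illustration, the approach described above could be sketched with Python's imaplib: talk to the folder over imap itself (so the server handles locking and only complete messages are visible), pipe each message to spamc, and delete it once spamc has accepted it. This is a sketch under assumptions, not a tested tool: the host, credentials, and the folder names "learn-spam" and "learn-ham" are hypothetical, treating exit status 0 from spamc as success is an assumption to check against the spamc man page, and spamd probably needs to be started with --allow-tell for reporting and learning to work.

```python
import imaplib
import subprocess


def report_folder(conn, folder, spamc_args, run=subprocess.run):
    """Pipe every message in `folder` to spamc; delete those spamc accepted.

    Returns the number of messages that were reported and deleted.
    """
    status, _ = conn.select(folder)
    if status != "OK":
        raise RuntimeError(f"cannot select {folder!r}")
    status, data = conn.uid("SEARCH", None, "ALL")
    reported = 0
    for uid in data[0].split():
        # BODY.PEEK[] fetches the full message without setting \Seen.
        status, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        if status != "OK" or not parts or not isinstance(parts[0], tuple):
            continue
        raw = parts[0][1]
        result = run(["spamc", *spamc_args], input=raw)
        # Assumption: exit status 0 means spamc handled the message.
        if result.returncode == 0:
            conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
            reported += 1
    conn.expunge()
    return reported


def main():
    # Hypothetical host, login, and folder names; adjust for the real setup.
    conn = imaplib.IMAP4_SSL("imap.example.com")
    conn.login("user", "password")
    try:
        report_folder(conn, "learn-spam", ["-d", "servername", "-C", "report"])
        report_folder(conn, "learn-ham", ["-d", "servername", "-L", "ham"])
    finally:
        conn.logout()
```

Calling main() from a small cron wrapper every half-hour would match the schedule mentioned above. A message stays in the folder until spamc has accepted it, so its presence there marks it as not yet processed.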

-Dan