Kevin wrote:
> David wrote:
>   
>> Relying only on the website linked, I see 2 things that ASSP doesn't have:
>>
>> 1) "Images embedded in emails are scanned for spam, even if they are 
>> contained in PDFs."
>> It could either use something like the MSRBL clamav defs to detect image 
>> spam, or it appears to use OCR on images and PDFs, and then passes the 
>> text on to the filters, which, depending on its implementation and 
>> effectiveness, could be a pretty good feature
>>     
>
> ASSP, being a proxy, could probably never handle this.
> It would be best handled in a full MTA with a queue.
>
>   
The point of ASSP is to do all the spam filtering on the front end and 
pass only clean messages to the MTA. Getting the MTA to do additional 
analysis defeats the purpose and doubles the work involved, especially 
since the MTA has no way of passing the text from the queued emails back 
to ASSP for analysis. If OCR were ever to be done, it would be done on 
ASSP's end. Whether or not it is feasible to do it in the timeframe that 
ASSP needs to process a message, and whether or not anyone is willing to 
write the code, is another question.
>> 2) Logfiles processable through Sawmill. I would LOVE to see an 
>> improvement in ASSP's logging feature. The raw log is nice, and the "Info 
>> & Stats" page is good, but they are lacking. I'd love to see breakdowns 
>> of the effectiveness of different filters, such as which ones are the 
>> most effective and which ones generate the most false positives (by 
>> looking at errors/notspam). I'd like to see how much email my clients 
>> send and receive so I can see if anyone is abusing the service.
>>     
>
> I believe Sawmill had a filter for ASSP at one point.
> http://www.sawmill.net/formats/anti_spam_smtpproxy.html (google!)
>
> Also the logging in ASSP is fine, what you want is log analysis.
>
>   
Yes, please, let's play with semantics. In that case, what I am 
interested in is log analysis and fine-tuning the effectiveness of the 
filters.

As far as log analysis goes, it kind of falls apart because 1) there are 
so many logging options that any log analyzer needs the logging to be 
done in a very specific manner before it can parse it, and if anything 
changes then the parsing goes to hell; and 2) it all breaks anyway when 
Fritz decides to change how the logging is done and how the mail headers 
are added. Something as small as a space or spelling change will 
completely befuddle an analyzer trying to parse through plaintext. I've 
seen several iterations of the "X-ASSP-Spam:" header itself; I'm not 
sure what else fluctuates.
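
To illustrate how little it takes, here is a minimal Python sketch. The 
header variants and both regular expressions are made up for the example; 
they are not taken from ASSP's actual output or from any real analyzer.

```python
import re

# Hypothetical pattern written against exactly one spelling of the header.
STRICT = re.compile(r"^X-ASSP-Spam: (\w+)$")

# A looser pattern: ignores case and tolerates extra whitespace.
# It copes with these variants but still breaks on a renamed header.
TOLERANT = re.compile(r"^x-assp-spam:\s*(\w+)\s*$", re.IGNORECASE)

# Made-up variants of the same header line.
lines = [
    "X-ASSP-Spam: YES",
    "X-ASSP-Spam:  YES",  # one extra space
    "X-Assp-Spam: YES",   # capitalization change
]


def parse(pattern, line):
    """Return the captured value, or None if the line does not match."""
    match = pattern.match(line)
    return match.group(1) if match else None


strict_results = [parse(STRICT, line) for line in lines]
tolerant_results = [parse(TOLERANT, line) for line in lines]

print(strict_results)
print(tolerant_results)
```

The strict pattern only recognizes the first line. Loosening the pattern 
helps with spacing and case, but nothing short of a stable format helps 
when the header or log line is actually reworded.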

Sawmill entirely choked on my logfiles and didn't know what to make of them.

A move to something more standardized, like XML logging, would be a 
positive step. XSL can simply be used to make the XML human-readable, 
and a more standard logging format would make data analysis that much 
easier, without having to parse plaintext so laboriously.
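
A rough sketch of what I mean, in Python using only the standard library. 
The element and attribute names (assplog, message, verdict, filter) and 
the sample entries are purely hypothetical; ASSP does not log this way 
today, and a real schema would need to be agreed on.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def log_entry(timestamp, client_ip, sender, recipient, verdict, filter_name):
    """Build one structured log record (hypothetical schema)."""
    entry = ET.Element("message", {"time": timestamp, "ip": client_ip})
    ET.SubElement(entry, "from").text = sender
    ET.SubElement(entry, "to").text = recipient
    ET.SubElement(entry, "verdict", {"filter": filter_name}).text = verdict
    return entry


# Writing side: made-up sample records appended to a root element.
log = ET.Element("assplog")
samples = [
    ("2007-05-01T10:00:00", "192.0.2.10", "a@example.com", "x@example.org", "spam", "ForgedHelo"),
    ("2007-05-01T10:00:05", "192.0.2.11", "b@example.com", "x@example.org", "spam", "RBL"),
    ("2007-05-01T10:00:09", "192.0.2.12", "c@example.com", "y@example.org", "ham", "none"),
    ("2007-05-01T10:00:12", "192.0.2.13", "d@example.com", "y@example.org", "spam", "ForgedHelo"),
]
for sample in samples:
    log.append(log_entry(*sample))

xml_text = ET.tostring(log, encoding="unicode")

# Analysis side: no regexes, just walk the tree and count by filter.
parsed = ET.fromstring(xml_text)
blocked_by = Counter(
    v.get("filter") for v in parsed.iter("verdict") if v.text == "spam"
)
print(blocked_by.most_common())
```

The point is that the analysis side never looks at spacing or wording, 
only at element and attribute names, so cosmetic changes to the log text 
would not break it.
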
>> Analyzing the errors folder could be a goldmine of useful information in 
>> terms of which filters are the most effective and which ones aren't. If 
>> I see that a certain filter is giving me 0 false positives (like say 
>> Spam Helo or Forged Helo), then I might want to increase the scoring on 
>> that filter. If I see that a certain filter is too aggressive and making 
>> too many false positives, then I'd want to lower the scoring on that 
>> one. If certain RBLs are making too many false positives, then I'd want 
>> to remove them from the mix. 
>>     
>
> The CCSpam and a few grep searches would probably give you the info you 
> want.
>
>   
While manually grepping through a few thousand mail messages does seem 
like a good time, an automated system really would be better.
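
Something along these lines is what I have in mind, sketched in Python. 
It assumes each file in the folder is a single saved message and that 
some header records which filter flagged it. The header name 
"X-Example-Filter" is a placeholder I made up, since the real headers 
have changed between versions; the demo writes its own throwaway 
messages rather than touching a real errors folder.

```python
import email
import tempfile
from collections import Counter
from pathlib import Path


def tally_filters(folder, header_name):
    """Count messages in `folder` by the value of `header_name`.

    `header_name` is an assumption: pass whatever header your ASSP
    version actually writes. Header lookup is case-insensitive.
    """
    counts = Counter()
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        with open(path, "rb") as handle:
            message = email.message_from_binary_file(handle)
        value = message.get(header_name)
        counts[value.strip() if value else "(no header)"] += 1
    return counts


# Demo with made-up messages standing in for an errors/notspam folder.
with tempfile.TemporaryDirectory() as demo_dir:
    for i, name in enumerate(["ForgedHelo", "RBL", "RBL", None]):
        header = f"X-Example-Filter: {name}\n" if name else ""
        Path(demo_dir, f"{i}.eml").write_text(
            f"Subject: test\n{header}\nbody\n"
        )
    demo_counts = tally_filters(demo_dir, "X-Example-Filter")

for filter_name, count in demo_counts.most_common():
    print(f"{filter_name}: {count}")
```

Run against the notspam folder, the highest counts would point at the 
filters producing the most false positives, which is exactly the list I 
would want before touching any scores.
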
>> That sort of thing would be very useful in 
>> fine-tuning ASSP's performance.
>>     
>
> How does adding/removing a DNSBL or changing the scoring of a test 
> "fine-tune" performance?
> You're talking about effectiveness. They are not the same.
>
>   
Again, semantics. I was talking about performance in the sense of how 
well a filter performs at stopping spam.

_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user
