Re: [SLUG] {Spam?} Anti spam software?

2008-06-24 Thread Jamie Wilkinson
2008/6/20 Sean Murphy [EMAIL PROTECTED]:
 All,

 I am after a good Anti spam software program for Linux which is shell based.  
 I am aware of SpamAssassin.  But I would like to know if there is anything 
 else which is better?

I had amazingly good results using only bogofilter, after about a
month of training it on spam and hooking up my mail reader to also
train it on non-spam (every message I replied to got fed to
bogofilter as non-spam training).  My anecdotal evidence, then, is
that Bayesian filters work very well when you train them on both
positive and negative inputs.  I never had any non-spam marked as
spam, and it rarely misclassified spam.
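
A minimal sketch of that two-sided training, assuming bogofilter is on the PATH and that its -s and -n options register a message read from stdin as spam and as non-spam respectively. The function names are made up for illustration, and how a mail reader would call them on reply is left open:

```python
import subprocess

def training_command(is_spam):
    """Build the bogofilter command that registers one message from stdin."""
    # -s registers the message as spam, -n registers it as non-spam.
    return ["bogofilter", "-s" if is_spam else "-n"]

def train(message_bytes, is_spam, run=subprocess.run):
    """Feed one raw message to bogofilter as a training example.

    `run` is injectable so the function can be exercised without bogofilter.
    """
    return run(training_command(is_spam), input=message_bytes, check=True)
```

A reply hook would then call train(raw_message, is_spam=False) on the message being answered, while the spam folder is fed through with is_spam=True.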
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] {Spam?} Anti spam software?

2008-06-19 Thread Mary Gardiner
On Thu, Jun 19, 2008, Sean Murphy wrote:
 All,
 
 I am after a good Anti spam software program for Linux which is shell
 based.  I am aware of SpamAssassin.  But I would like to know if
 there is anything else which is better?

Better could mean a few things in the context of spam filtering. Could
you clarify which of these features is most important to you:

 1. overall accuracy (false negatives and false positives)
 2. fewest number of false negatives (spam that gets through to your
inbox)
 3. fewest number of false positives (good mail that ends up marked as
spam)
 4. good accuracy when in the default configuration, no twiddling
required
 5. good for processing large volumes of mail without insane resources

I use SpamAssassin and find it does really well, but it falls down at #4
and #5. I have to train the Bayesian[1] classifiers on all my mail in order
to get good-to-me accuracy, so I am certainly not relying on the default
configuration. (I suspect I'd do just as well switching entirely to a
Bayesian system, but since SpamAssassin is now doing fine I have not
done so).

And it's a resource hog: it sometimes takes 8 seconds to scan a mail on
my OK-standard desktop system. So if you were receiving more than about
one email every 8 seconds you'd be looking at performance tuning and
additional less hoggy measures, or at alternatives. (Everything that
processes the full body of an email is somewhat resource intensive, but
I understand that SA is not great in this respect.)
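
As a rough worked example of that limit, assuming mail is scanned one message at a time and every scan takes the worst-case 8 seconds (the function name and the parallel_scanners parameter are illustrative only):

```python
SECONDS_PER_DAY = 24 * 60 * 60

def max_messages_per_day(seconds_per_scan, parallel_scanners=1):
    """Upper bound on messages scanned per day at a fixed cost per scan."""
    return parallel_scanners * SECONDS_PER_DAY // seconds_per_scan

print(max_messages_per_day(8))  # 10800
```

Real scan times vary per message, so this is only a ceiling for the slow case, not a measurement.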

-Mary

[1] That is, automatically trained, dynamic rules rather than fixed rules.
This has the downside that you need to show it current examples of good
and bad mail fairly regularly so it can continue to learn to
discriminate between them. Bayesian techniques are not exactly
cutting-edge machine learning, but they only have to discriminate
between two categories for this use.
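
To make the idea concrete, here is a toy two-category word classifier. It is a sketch only: it assumes equal prior odds for spam and good mail, looks only at which words are present, and is not how SpamAssassin or any other real filter is implemented. The class and method names are invented for this example.

```python
import math
import re
from collections import Counter

class TinyBayes:
    """Toy naive Bayes spam/ham classifier based on word presence."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct lowercase word counts once per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, label):
        """Learn from one example; label is "spam" or "ham"."""
        self.words[label].update(self.tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | words), assuming equal priors."""
        log_odds = 0.0
        for word in self.tokens(text):
            # Add-one smoothing keeps unseen words from giving zero.
            p_spam = (self.words["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.words["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))
```

With no training at all every message scores 0.5, which is why examples of both good and bad mail are needed before the scores mean anything.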


Re: [SLUG] {Spam?} Anti spam software?

2008-06-19 Thread david
On Fri, 2008-06-20 at 07:57 +1000, Mary Gardiner wrote:
 On Thu, Jun 19, 2008, Sean Murphy wrote:
  All,
  
  I am after a good Anti spam software program for Linux which is shell
  based.  I am aware of SpamAssassin.  But I would like to know if
  there is anything else which is better?
 
 Better could mean a few things in the context of spam filtering. Could
 you clarify which of these features is most important to you:
 
  1. overall accuracy (false negatives and false positives)
  2. fewest number of false negatives (spam that gets through to your
 inbox)
  3. fewest number of false positives (good mail that ends up marked as
 spam)
  4. good accuracy when in the default configuration, no twiddling
 required
  5. good for processing large volumes of mail without insane resources
 
 I use SpamAssassin and find it does really well, but it falls down at #4
 and #5. I have to train the Bayesian[1] classifiers on all my mail in order
 to get good-to-me accuracy, so I am certainly not relying on the default
 configuration. (I suspect I'd do just as well switching entirely to a
 Bayesian system, but since SpamAssassin is now doing fine I have not
 done so).
 
 And it's a resource hog: it sometimes takes 8 seconds to scan a mail on
 my OK-standard desktop system. So if you were receiving more than about
 one email every 8 seconds you'd be looking at performance tuning and
 additional less hoggy measures, or at alternatives. (Everything that
 processes the full body of an email is somewhat resource intensive, but
 I understand that SA is not great in this respect.)


I'm using Bogofilter (a Bayesian filter) to sort mail into good, bad
or unsure at a user level. I've got a shell script run from cron that
passes manually sorted unsure email through the filter hourly for
training purposes, and it works really well at the client level.

I don't know if there is a package to do that. I wrote the script
myself, which means that it is crude and simple :)

This doesn't stop spam... it just means you never have to read it.
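
A sketch of how such a setup could look; this is not the script described above. It assumes bogofilter's documented exit statuses when classifying stdin (0 for spam, 1 for non-spam, 2 for unsure) and its -s and -n training options. The folder names "spam", "ham" and "trained", and all function names, are hypothetical.

```python
import subprocess
from pathlib import Path

# bogofilter's exit status when classifying a message read from stdin.
VERDICTS = {0: "bad", 1: "good", 2: "unsure"}

def verdict_for(returncode):
    """Translate a bogofilter exit status into good, bad or unsure."""
    if returncode not in VERDICTS:
        raise RuntimeError(f"bogofilter failed with exit status {returncode}")
    return VERDICTS[returncode]

def classify(message_bytes):
    """Run bogofilter on one raw message and return its verdict."""
    result = subprocess.run(["bogofilter"], input=message_bytes)
    return verdict_for(result.returncode)

def training_jobs(sorted_root):
    """List (command, path) pairs for mail the user has sorted by hand."""
    jobs = []
    for folder, flag in (("spam", "-s"), ("ham", "-n")):
        for path in sorted((Path(sorted_root) / folder).glob("*")):
            if path.is_file():
                jobs.append((["bogofilter", flag], path))
    return jobs

def train_sorted_mail(sorted_root):
    """Register each hand-sorted message once, then move it aside."""
    done = Path(sorted_root) / "trained"
    done.mkdir(exist_ok=True)
    for command, path in training_jobs(sorted_root):
        subprocess.run(command, input=path.read_bytes(), check=True)
        # Moving the file stops the next run from registering it again.
        path.rename(done / path.name)
```

An hourly cron entry would then only need to call train_sorted_mail on the directory holding the hand-sorted mail.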

David.



Re: [SLUG] {Spam?} Anti spam software?

2008-06-19 Thread Alan L Tyree
On Fri, 20 Jun 2008 12:50:20 +1000
david [EMAIL PROTECTED] wrote:

 On Fri, 2008-06-20 at 07:57 +1000, Mary Gardiner wrote:
  On Thu, Jun 19, 2008, Sean Murphy wrote:
   All,
   
   I am after a good Anti spam software program for Linux which is
   shell based.  I am aware of SpamAssassin.  But I would like to
   know if there is anything else which is better?
  
  Better could mean a few things in the context of spam filtering.
  Could you clarify which of these features is most important to you:
  
   1. overall accuracy (false negatives and false positives)
   2. fewest number of false negatives (spam that gets through to your
  inbox)
   3. fewest number of false positives (good mail that ends up marked
  as spam)
   4. good accuracy when in the default configuration, no twiddling
  required
   5. good for processing large volumes of mail without insane
  resources
  
  I use SpamAssassin and find it does really well, but it falls down
  at #4 and #5. I have to train the Bayesian[1] classifiers on all my
  mail in order to get good-to-me accuracy, so I am certainly not
  relying on the default configuration. (I suspect I'd do just as
  well switching entirely to a Bayesian system, but since
  SpamAssassin is now doing fine I have not done so).
  
  And it's a resource hog: it sometimes takes 8 seconds to scan a
  mail on my OK-standard desktop system. So if you were receiving
  more than about one email every 8 seconds you'd be looking at
  performance tuning and additional less hoggy measures, or at
  alternatives. (Everything that processes the full body of an email
  is somewhat resource intensive, but I understand that SA is not
  great in this respect.)
 
 
 I'm using Bogofilter (a Bayesian filter) to sort mail into good, bad
 or unsure at a user level. I've got a shell script run from cron that
 passes manually sorted unsure email through the filter hourly for
 training purposes, and it works really well at the client level.
 
 I don't know if there is a package to do that. I wrote the script
 myself, which means that it is crude and simple :)
 
 This doesn't stop spam... it just means you never have to read it.

I use bogofilter in connection with Sylpheed as a simple POP client. It
is remarkably effective, and with the default settings has almost no
false positives.

It is interesting with a Bayesian filter to notice that sometimes the
spammers seem to try something new. A few spam messages get through,
but with Sylpheed you just mark them as spam as they come in and it
learns very quickly.

Alan
 
 David.
 
 


-- 
Alan L Tyree           http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206  Fax: +61 2 4782 7092
FWD: 615662