Re: Clustering spamassassin + autolearning

2008-11-26 Thread Peter Fastré
Thank you all for your (quick) answers!
@Kai: MailWatch has a training facility built in, but it only works on
messages in quarantine. If MailScanner passes a message (for example
because of BAYES_00, which sometimes happens), it is sent on to the
mailbox server, and it can no longer be trained as spam on the
MailWatch server.
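
For illustration, the mailbox-server script proposed earlier in this thread (walk the mail folders, find folders named 'Spam', feed each message to sa-learn, then delete it) might look roughly like the sketch below. It assumes a Maildir++ layout in which the folder appears on disk as `.Spam` with `cur` and `new` subdirectories, and that sa-learn on this host is configured to write to the same Bayes database the scanner uses. The function names and the `runner` parameter are made up for the example.

```python
import subprocess
from pathlib import Path


def find_spam_messages(mail_root, folder_name=".Spam"):
    """Return message files from every Maildir folder called folder_name."""
    messages = []
    # Materialise the folder list first so later deletions cannot
    # disturb the directory walk.
    for folder in list(Path(mail_root).rglob(folder_name)):
        for sub in ("cur", "new"):
            subdir = folder / sub
            if subdir.is_dir():
                messages.extend(sorted(p for p in subdir.iterdir() if p.is_file()))
    return messages


def learn_and_delete(mail_root, runner=subprocess.run):
    """Feed each message to `sa-learn --spam`, then delete it."""
    count = 0
    for msg in find_spam_messages(mail_root):
        # check=True raises if sa-learn fails, so the message is kept.
        runner(["sa-learn", "--spam", str(msg)], check=True)
        msg.unlink()
        count += 1
    return count
```

Run from cron as a user that may read the mail store, for example `learn_and_delete("/var/vmail")` (the path is a placeholder). Because users decide what lands in the folder, it may be worth reviewing the contents before deleting anything.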

Peter


On Tue, Nov 25, 2008 at 9:11 PM, Kai Schaetzl [EMAIL PROTECTED] wrote:

 Peter Fastré wrote on Tue, 25 Nov 2008 16:04:19 +0100:

  2. On my mailbox server I'd like to have a script which goes into the
  mailfolders, searches for a folder with the name 'Spam', feeds the message
  to sa-learn (which should be feeding it to the same bayes database of
  course), and then delete the message. Do you think this is a well-thought
  approach of having my users train the spam filters this way?

 Generally yes, but since you are already using MailScanner+MailWatch: that's
 already built in, and users can just train any messages from MailWatch. Why
 duplicate that?

 Kai

 --
 Kai Schätzl, Berlin, Germany
 Get your web at Conactive Internet Services: http://www.conactive.com






night of pleasure spam

2008-11-26 Thread Lists

Hi all,

The system here is getting heaps of variations of this night of pleasure 
spam. Some is getting stopped by spamassassin but still quite a bit 
getting through.

Here is an example of one that only scored low.
http://www.pastebin.ca/1267866

If anybody has time to run it through their system and tell me what it
hit on for them - or if someone knows a ruleset that I could implement 
to better stop these it would be much appreciated.


Thanks
Kate


Re: night of pleasure spam

2008-11-26 Thread John Hardin

On Thu, 27 Nov 2008, Lists wrote:


Here is an example of one that only scored low.
http://www.pastebin.ca/1267866


There was some discussion on the list of spaces.live.com URI spam a few 
weeks back, and some rules posted. Those might help.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Bother, said Pooh as he struggled with /etc/sendmail.cf, it never
  does quite what I want. I wish Christopher Robin was here.
   -- Peter da Silva in a.s.r
---
 29 days until Christmas


Re: night of pleasure spam

2008-11-26 Thread DAve

Lists wrote:

Hi all,

The system here is getting heaps of variations of this night of pleasure 
spam. Some is getting stopped by spamassassin but still quite a bit 
getting through.

Here is an example of one that only scored low.
http://www.pastebin.ca/1267866

If anybody has time to run it through their system and tell me what it
hit on for them - or if someone knows a ruleset that I could implement 
to better stop these it would be much appreciated.


Thanks
Kate




Content analysis details:   (5.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.9 KAM_THEBAT             Abused X-Mailer Header for The Bat! MUA
 2.0 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [200.219.72.83 listed in zen.spamhaus.org]
 1.0 RCVD_IN_BRBL           RBL: Received via relay listed in Barracuda RBL
                            [200.219.72.83 listed in b.barracudacentral.org]
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 0.4 URI_HEX                URI: URI hostname has long hexadecimal sequence
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.4989]
 0.1 RDNS_NONE              Delivered to trusted network by a host with
                            no rDNS


Looks like it would have been close for us as well. Bayes would get it 
after a few hits I think.


DAve

--
The whole internet thing is sucking the life out of me,
there ain't no pony in there.


Re: night of pleasure spam

2008-11-26 Thread Kate Kleinschafer

John Hardin wrote:

On Thu, 27 Nov 2008, Lists wrote:


Here is an example of one that only scored low.
http://www.pastebin.ca/1267866


There was some discussion on the list of spaces.live.com URI spam a 
few weeks back, and some rules posted. Those might help.



Thanks I will check that out.

Kate


Re: night of pleasure spam

2008-11-26 Thread Kate Kleinschafer



DAve wrote:

Lists wrote:

Hi all,

The system here is getting heaps of variations of this night of 
pleasure spam. Some is getting stopped by spamassassin but still 
quite a bit getting through.

Here is an example of one that only scored low.
http://www.pastebin.ca/1267866

If anybody has time to run it through their system and tell me what it
hit on for them - or if someone knows a ruleset that I could 
implement to better stop these it would be much appreciated.


Thanks
Kate




Content analysis details:   (5.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.9 KAM_THEBAT             Abused X-Mailer Header for The Bat! MUA
 2.0 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [200.219.72.83 listed in zen.spamhaus.org]
 1.0 RCVD_IN_BRBL           RBL: Received via relay listed in Barracuda RBL
                            [200.219.72.83 listed in b.barracudacentral.org]
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 0.4 URI_HEX                URI: URI hostname has long hexadecimal sequence
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.4989]
 0.1 RDNS_NONE              Delivered to trusted network by a host with
                            no rDNS


Looks like it would have been close for us as well. Bayes would get it 
after a few hits I think.


DAve


Interesting - will keep training against them then (for bayes)

Thanks for your results Dave.

Kate


Re: night of pleasure spam

2008-11-26 Thread Lists



John Hardin wrote:

On Thu, 27 Nov 2008, Lists wrote:


Here is an example of one that only scored low.
http://www.pastebin.ca/1267866


There was some discussion on the list of spaces.live.com URI spam a 
few weeks back, and some rules posted. Those might help.




Can you post a link to this thread - I have not used the spaces.live.com 
list before.


Thanks
Kate


Re: night of pleasure spam

2008-11-26 Thread Bill Randle
On Thu, 2008-11-27 at 09:37 +1300, Kate Kleinschafer wrote:
 John Hardin wrote:
  On Thu, 27 Nov 2008, Lists wrote:
 
  Here is an example of one that only scored low.
  http://www.pastebin.ca/1267866
 
  There was some discussion on the list of spaces.live.com URI spam a 
  few weeks back, and some rules posted. Those might help.
 
 Thanks I will check that out.

We got some this morning too, but they appear to be getting caught by
XBL and Botnet now. Here's how your message scored:

X-Spam-Report:
*  0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
*  [200.219.72.83 listed in zen.spamhaus.org]
*  3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
*  5.5 BOTNET Relay might be a spambot or virusbot
*  [botnet0.8,ip=200.219.72.83,nordns]
*  0.0 UNPARSEABLE_RELAY Informational: message has unparseable
*      relay lines
*  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
* -0.2 BAYES_40 BODY: Bayesian spam probability is 20 to 40%
*  [score: 0.3341]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.1 RDNS_NONE Delivered to trusted network by a host with no
*      rDNS
*  0.6 HELO_MISMATCH_COM HELO_MISMATCH_COM

-Bill




Re: night of pleasure spam

2008-11-26 Thread Martin Gregorie
On Wed, 2008-11-26 at 12:32 -0800, John Hardin wrote:

 There was some discussion on the list of spaces.live.com URI spam a few 
 weeks back, and some rules posted. Those might help.
 
I've also been seeing it.

I have a similar set of spaces.live.com rules, but although one or two of
the sub-rules have been tripping, this stuff has been sliding through
with quite a low score. So far I haven't spotted enough commonality in
the messages to consider altering my rules.

Martin
 



Re: night of pleasure spam

2008-11-26 Thread Lists

Bill Randle wrote:

On Thu, 2008-11-27 at 09:37 +1300, Kate Kleinschafer wrote:

John Hardin wrote:

On Thu, 27 Nov 2008, Lists wrote:

Here is an example of one that only scored low.
http://www.pastebin.ca/1267866

There was some discussion on the list of spaces.live.com URI spam a
few weeks back, and some rules posted. Those might help.

Thanks I will check that out.

We got some this morning too, but they appear to be getting caught by
XBL and Botnet now. Here's how your message scored:

X-Spam-Report:
*  0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
*  [200.219.72.83 listed in zen.spamhaus.org]
*  3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
*  5.5 BOTNET Relay might be a spambot or virusbot
*  [botnet0.8,ip=200.219.72.83,nordns]
*  0.0 UNPARSEABLE_RELAY Informational: message has unparseable
*      relay lines
*  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
* -0.2 BAYES_40 BODY: Bayesian spam probability is 20 to 40%
*  [score: 0.3341]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.1 RDNS_NONE Delivered to trusted network by a host with no
*      rDNS
*  0.6 HELO_MISMATCH_COM HELO_MISMATCH_COM

-Bill


  
I will look into the BOTNET as I don't believe we are using this at the 
moment. Do you get many fp's with this?


Kate


Re: night of pleasure spam

2008-11-26 Thread John Hardin

On Thu, 27 Nov 2008, Lists wrote:


John Hardin wrote:

 On Thu, 27 Nov 2008, Lists wrote:

  Here is an example of one that only scored low.
  http://www.pastebin.ca/1267866

 There was some discussion on the list of spaces.live.com URI spam a few
 weeks back, and some rules posted. Those might help.


Can you post a link to this thread - I have not used the spaces.live.com 
list before.


I meant there was some discussion on the SA list of spams containing 
spaces.live.com URIs. Search the SA archives.




Re: night of pleasure spam

2008-11-26 Thread Bill Randle
On Thu, 2008-11-27 at 09:51 +1300, Lists wrote:
 Bill Randle wrote:
  On Thu, 2008-11-27 at 09:37 +1300, Kate Kleinschafer wrote:

  John Hardin wrote:
  
  On Thu, 27 Nov 2008, Lists wrote:
 

  Here is an example of one that only scored low.
  http://www.pastebin.ca/1267866
  
  There was some discussion on the list of spaces.live.com URI spam a 
  few weeks back, and some rules posted. Those might help.
 

  Thanks I will check that out.
  
 
  We got some this morning too, but they appear to be getting caught by
  XBL and Botnet now. Here's how your message scored:
 
  X-Spam-Report:
  *  0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
  *  [200.219.72.83 listed in zen.spamhaus.org]
  *  3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
  *  5.5 BOTNET Relay might be a spambot or virusbot
  *  [botnet0.8,ip=200.219.72.83,nordns]
  *  0.0 UNPARSEABLE_RELAY Informational: message has unparseable
  *      relay lines
  *  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
  * -0.2 BAYES_40 BODY: Bayesian spam probability is 20 to 40%
  *  [score: 0.3341]
  *  0.0 HTML_MESSAGE BODY: HTML included in message
  *  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
  *  0.1 RDNS_NONE Delivered to trusted network by a host with no
  *      rDNS
  *  0.6 HELO_MISMATCH_COM HELO_MISMATCH_COM
 
  -Bill
 
 

 I will look into the BOTNET as I don't believe we are using this at the 
 moment. Do you get many fp's with this?

Not that I'm aware of. If you're concerned, you can lower the score. I
keep it fairly high as sometimes it's the only thing of any significance
that hits.

You can also search the list archives for BOTNET and pick up some of the
discussion about effectiveness and potential for false positives.
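
To make the "lower the score" trade-off concrete, here is a small worked example using the rule scores from the X-Spam-Report quoted above. It is only arithmetic on that one message, not a statement about false-positive rates. `total_score` and `is_spam` are names invented for the sketch, and the comparison assumes a message is tagged once its total reaches the required score.

```python
from decimal import Decimal

# Rule scores copied from the X-Spam-Report quoted in this message.
REPORT = {
    "RCVD_IN_PBL": Decimal("0.9"),
    "RCVD_IN_XBL": Decimal("3.0"),
    "BOTNET": Decimal("5.5"),
    "UNPARSEABLE_RELAY": Decimal("0.0"),
    "URI_HEX": Decimal("0.4"),
    "BAYES_40": Decimal("-0.2"),
    "HTML_MESSAGE": Decimal("0.0"),
    "RAZOR2_CHECK": Decimal("0.5"),
    "RDNS_NONE": Decimal("0.1"),
    "HELO_MISMATCH_COM": Decimal("0.6"),
}
REQUIRED = Decimal("5.0")


def total_score(hits, overrides=None):
    """Sum rule scores, replacing any rule named in overrides."""
    merged = dict(hits)
    merged.update(overrides or {})
    return sum(merged.values(), Decimal("0"))


def is_spam(hits, overrides=None, required=REQUIRED):
    """True when the total reaches the required score."""
    return total_score(hits, overrides) >= required


print(total_score(REPORT))                              # as reported
print(total_score(REPORT, {"BOTNET": Decimal("2.0")}))  # BOTNET lowered
print(total_score(REPORT, {"BOTNET": Decimal("0")}))    # BOTNET disabled
```

For this message the totals are 10.8, 7.3 and 5.3, so it would still cross the 5.0 threshold with BOTNET lowered or even disabled, though only just in the last case. In SpamAssassin itself the change would be a `score BOTNET <value>` line in the local configuration.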

-Bill




Re: night of pleasure spam

2008-11-26 Thread mouss
Lists wrote:
 Hi all,
 
 The system here is getting heaps of variations of this night of pleasure
 spam. Some is getting stopped by spamassassin but still quite a bit
 getting through.
 Here is an example of one that only scored low.
 http://www.pastebin.ca/1267866
 
 If anybody has time to run it through their system and tell me what it
 hit on for them - or if someone knows a ruleset that I could implement
 to better stop these it would be much appreciated.
 

Content analysis details:   (12.0 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.5 COUNTRY_BR             Relayed via Brazil
 4.0 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [200.219.72.83 listed in zen.spamhaus.org]
 3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
 3.0 RCVD_IN_BRBL           RBL: Received via a relay in Barracuda BRBL
                            [200.219.72.83 listed in bb.barracudacentral.org]
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 0.4 URI_HEX                URI: URI hostname has long hexadecimal sequence
 1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                            [score: 0.7049]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 RDNS_NONE              Delivered to trusted network by a host with
                            no rDNS



sa-learn from internal mail server ?

2008-11-26 Thread Sam Ami
hi all

our current setup:
the primary MX for all our email domains runs qmail, SpamAssassin and ClamAV.
all email is scanned inline and then relayed to the internal server
for delivery to the users' mailboxes.

question:
is it possible to use sa-learn in this situation?
we still get a lot of spam and i'd like to teach SA, if possible by
using sa-learn.

any suggestions ?


Re: sa-learn from internal mail server ?

2008-11-26 Thread Steven Stern

On 11/26/2008 04:14 PM, Sam Ami wrote:
 hi all
 
 our current setup:
 the primary MX for all our email domains runs qmail, SpamAssassin and ClamAV.
 all email is scanned inline and then relayed to the internal server
 for delivery to the users' mailboxes.
 
 question:
 is it possible to use sa-learn in this situation?
 we still get a lot of spam and i'd like to teach SA, if possible by
 using sa-learn.
 
 any suggestions ?
 

Here's how we handle it with Exchange

http://sstern.ccim.com/2006/07/14/training-sitewide-spam-filters/

--

  Steve


Re: sa-learn from internal mail server ?

2008-11-26 Thread John Hardin

On Thu, 27 Nov 2008, Sam Ami wrote:

the primary MX for all our email domains runs qmail, SpamAssassin and
ClamAV; all email is scanned inline and then relayed to the internal
server for delivery to the users' mailboxes.


question:
is it possible to use sa-learn in this situation?
we still get a lot of spam and i'd like to teach SA, if possible by
using sa-learn.

any suggestions ?


I assume you're not talking about autolearn...

Check the archives for user training via IMAP spam folders. Basically, you 
set up IMAP folders on the final mail server for users to copy FNs and FPs 
to, and SA retrieves the messages from those folders for training.


The big problem is user education and reliability. You will probably want 
to manually review what the users are trying to train SA with.
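
A minimal sketch of the retrieval side, assuming the training folders are reachable over IMAP with SSL and that sa-learn runs on the scanning host as the user that owns the Bayes database. The host, credentials, folder names and function names are all placeholders. Messages are opened read-only and left in place, so they can still be reviewed by hand before or after training.

```python
import imaplib
import subprocess
import tempfile


def sa_learn_args(kind, path):
    """Build the sa-learn command line for one message file."""
    if kind not in ("spam", "ham"):
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    return ["sa-learn", f"--{kind}", path]


def train_from_imap(host, user, password, folder, kind):
    """Feed every message in one IMAP folder to sa-learn; return the count."""
    learned = 0
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(folder, readonly=True)  # do not alter flags or delete
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            raw = parts[0][1]  # raw message bytes
            # Hand the message to sa-learn through a temporary file.
            with tempfile.NamedTemporaryFile(suffix=".eml") as tmp:
                tmp.write(raw)
                tmp.flush()
                subprocess.run(sa_learn_args(kind, tmp.name), check=True)
            learned += 1
    return learned
```

A cron job could then call, for instance, `train_from_imap("mail.example.com", "trainer", "secret", "TrainSpam", "spam")` for the false negatives and the same with a ham folder and `"ham"` for the false positives.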




Re: sa-learn from internal mail server ?

2008-11-26 Thread Martin Gregorie
On Thu, 2008-11-27 at 09:14 +1100, Sam Ami wrote:
 hi all
 
 our current setup:
 the primary MX for all our email domains runs qmail, SpamAssassin and ClamAV.
 all email is scanned inline and then relayed to the internal server
 for delivery to the users' mailboxes.
 
 question:
 is it possible to use sa-learn in this situation?
 we still get a lot of spam and i'd like to teach SA, if possible by
 using sa-learn.
 
Off the top of my head, three suggestions:

1) Simply enable auto-learn if it isn't already on.
2) Write a kick-sort program to sit downstream of clamav.
   It would route all mail to the internal server but, in addition,
   send everything that's obviously spam to a spam bin and everything
   that's obviously ham to a ham bin.

   Use a cron job to run sa-learn on the contents of the two bins.

3) Provide a submission method so users can submit personally selected
   mail to the spam bin.

You don't say how the clamav output gets delivered to the internal
server, but it should be possible to use procmail to do the kicksorting.
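
The sorting decision in suggestion 2 could be sketched as below. This assumes SpamAssassin has already added an X-Spam-Status header containing `score=<number>`; the two thresholds that define "obviously" ham or spam are arbitrary example values, and the function names are invented for the illustration.

```python
import re
from email import message_from_string

# Matches the numeric score in a header such as
# "X-Spam-Status: Yes, score=23.6 required=5.0 tests=..."
SCORE_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the score from X-Spam-Status, or None if it is absent."""
    msg = message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def pick_bin(score, ham_max=0.0, spam_min=10.0):
    """Choose a training bin: 'spam', 'ham', or None for the grey zone."""
    if score is None:
        return None
    if score >= spam_min:
        return "spam"
    if score <= ham_max:
        return "ham"
    return None
```

Only mail outside the grey zone is copied to a bin; everything is still delivered to the internal server as before. The cron job then runs `sa-learn --spam` and `sa-learn --ham` over the two bins.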

HTH
Martin




New PYZOR

2008-11-26 Thread Chris
Has anyone else seen this? I'm trying to find out from the new author whether 
it will work with SA. I would think that a new plug-in would have to be 
written?

With great pleasure I present to you the long discussed, wanted and now
finished new version of PYZOR, now dubbed Bohuno. And for those of
you thinking, duh, _yes_ we could not come up with a better name.

Here is a link to the site, http://www.bohuno.com

-- 
Chris
KeyID 0xE372A7DA98E6705C




Re: night of pleasure spam

2008-11-26 Thread Chris
On Wednesday 26 November 2008 2:18 pm, Lists wrote:
 Hi all,

 The system here is getting heaps of variations of this night of pleasure
 spam. Some is getting stopped by spamassassin but still quite a bit
 getting through.
 Here is an example of one that only scored low.
 http://www.pastebin.ca/1267866

 If anybody has time to run it through their system and tell me what it
 hit on for them - or if someone knows a ruleset that I could implement
 to better stop these it would be much appreciated.

 Thanks
 Kate

Here is how one I received scores on my stand-alone box:

Content analysis details:   (23.6 points, 5.0 required)

 pts rule name              description
 -- --
 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [79.52.75.164 listed in zen.spamhaus.org]
 3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
 5.0 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 0.9998]
 1.0 RELAYED_BY_DIALUP      Sent directly from dynamic IP address
 0.4 URI_HEX                URI: URI hostname has long hexadecimal sequence
 0.0 HTML_MESSAGE           BODY: HTML included in message
 2.2 DCC_CHECK              listed in DCC (http://rhyolite.com/anti-spam/dcc/)
                            [cpollock 1117; Body=many Fuz1=many]
                            [Fuz2=many]
  10 CLAMAV                 Clam AntiVirus detected a virus
 0.1 RDNS_DYNAMIC           Delivered to trusted network by host with
                            dynamic-looking rDNS
 1.0 SAGREY                 Adds 1.0 to spam from first-time senders

Chris

-- 
Chris
KeyID 0xE372A7DA98E6705C




Re: New PYZOR

2008-11-26 Thread Robert Fleming
--On November 26, 2008 9:38:27 PM -0600 Chris is rumoured to have written:

 Has anyone else seen this? I'm trying to find out from the new author
 whether  it will work with SA. I would think that a new plug-in would
 have to be  written?

In the Forum section of the Bohuno site, I found the following:
' The Bohuno service is backwards compatible to the original PyzorD and
' Pyzor client. To use Bohuno, update the ~/.pyzor/servers file and add
' 82.94.255.100:24441
' 
' If you want to run a local copy of the database, register/login and
' download the version for your OS. your local copy will be updated
' regularly.. 
' 
' http://www.bohuno.com/forum/read.php?3,27

So it looks like if you have pyzor already installed, you can take
advantage of the new service w/o much change.  I haven't looked at the
Bohuno client/server apps yet.
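
As a small illustration of the forum instructions quoted above, the sketch below adds the server to a Pyzor servers file if it is not already listed. It assumes the file holds one `host:port` entry per line; the helper name `add_server` is made up for the example, and the address is the one given in the forum post.

```python
from pathlib import Path

BOHUNO_SERVER = "82.94.255.100:24441"  # address quoted from the forum post


def add_server(servers_file, server=BOHUNO_SERVER):
    """Append server to the file unless already listed; True if added."""
    path = Path(servers_file).expanduser()
    text = path.read_text() if path.exists() else ""
    if server in (line.strip() for line in text.splitlines()):
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line if the file lacks a trailing newline.
    prefix = "" if not text or text.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{server}\n")
    return True
```

Calling `add_server("~/.pyzor/servers")` a second time is a no-op, so it is safe to run repeatedly.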

Rob


Re: New PYZOR

2008-11-26 Thread Chris
On Wednesday 26 November 2008 10:11 pm, you wrote:

 In the Forum section of the Bohuno site, I found the following:
 ' The Bohuno service is backwards compatible to the original PyzorD and
 ' Pyzor client. To use Bohuno, update the ~/.pyzor/servers file and add
 ' 82.94.255.100:24441
 '
 ' If you want to run a local copy of the database, register/login and
 ' download the version for your OS. your local copy will be updated
 ' regularly..
 '
 ' http://www.bohuno.com/forum/read.php?3,27

 So it looks like if you have pyzor already installed, you can take
 advantage of the new service w/o much change.  I haven't looked at the
 Bohuno client/server apps yet.

 Rob

Thanks Rob, I've been using that server for a couple of years now so no 
changes should be necessary then I'd think. I'll keep a watch and see if any 
errors start popping up.

Chris

-- 
Chris
KeyID 0xE372A7DA98E6705C




Re: night of pleasure spam

2008-11-26 Thread Henrik K
On Wed, Nov 26, 2008 at 01:15:48PM -0800, Bill Randle wrote:
 On Thu, 2008-11-27 at 09:51 +1300, Lists wrote:
  Bill Randle wrote:
   *  5.5 BOTNET Relay might be a spambot or virusbot
   *  [botnet0.8,ip=200.219.72.83,nordns]
 
  I will look into the BOTNET as I don't believe we are using this at the 
  moment. Do you get many fp's with this?
 
 Not that I'm aware of. If you're concerned, you can lower the score. I
 keep it fairly high as sometimes it's the only thing of any significance
 that hits.

Giving 5.5 points to a host without reverse DNS makes no sense. Of course
it's your rules, and you slightly advocate being cautious, but still someone
could think that Botnet is FP safe by default.

You are much better off doing all the things that Botnet does in your MTA.
Block dynamic HELOs, greylist hosts without DNS/with dynamic DNS.