RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> If you can post the full email (headers and body), I'll run it over my
> system which has lots and lots of third party add on rules from
> www.rulesemporium.com and others and see if I can make SA score it
> high enough for Amavisd-new to block the email.

Thanks. 

http://www.rocsca.it/INBOX

I get the following score:

>From [EMAIL PROTECTED] Wed Mar 14 07:13:02 2007
Return-Path: <[EMAIL PROTECTED]>
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on av6.stt.vir
X-Spam-Level: **
X-Spam-Status: No, score=2.5 required=5.0 tests=AWL,BAYES_50,HTML_30_40,
 HTML_MESSAGE,HTML_TEXT_AFTER_BODY,MIME_HTML_ONLY,SARE_PROLOSTOCK_SYM3
 autolearn=no version=3.1.8
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: by posta.sttspa.it (Postfix, from userid 7011)
id 8F9A51098056; Wed, 14 Mar 2007 07:14:06 +0100 (CET)
Received: from av6.stt.vir (smtp02.sttspa.it [80.74.176.141])
by posta.sttspa.it (Postfix) with ESMTP id 6858B1098004;
Wed, 14 Mar 2007 07:14:06 +0100 (CET)
Received: from localhost (localhost [127.0.0.1])
by av6.stt.vir (Postfix) with ESMTP id F7500A7;
Wed, 14 Mar 2007 07:14:06 +0100 (CET)
X-Virus-Scanned: amavisd-new at stt.vir
Received: from av6.stt.vir ([127.0.0.1])
by localhost (av6.stt.vir [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id I3LCVzlxLfiv; Wed, 14 Mar 2007 07:14:03 +0100
(CET)
Received: from kbra3qsxm9mslhj (203-118-114-113.static.asianet.co.th
[203.118.114.113])
by av6.stt.vir (Postfix) with SMTP id 362367500A2;
Wed, 14 Mar 2007 07:13:14 +0100 (CET)
Message-ID: <[EMAIL PROTECTED]>
Reply-To: "IParker NDickey" <[EMAIL PROTECTED]>
From: "IParker NDickey" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Subject: transmitting wolf
Date: Wed, 14 Mar 2007 13:13:02 +0700
MIME-Version: 1.0
Content-Type: text/html







Our Next Winner for March
14th
CEO AMERICA INC 
Tick : CEOA
Priced : $0.07
Won't last long at this stage, This one is going to
$1.00
Grab yourself some tomorrow avoid the
rush
And experience a 10 bagger.

FAA said the rule change -- a temporary one -- was made
for safety reasons. The NTSB's
of starting that fire with murder. A light wind was cited by federal
investigators = San Benardino National Forest to its very core and
shocked the entire world."
October 26 in Southern California's San Jacinto Mountains.=ttempted a
U-turn with only 1,300 feet of room for the turn. To make a successful
turn,







Spam detection software, running on the system "av6.stt.vir", has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Our Next Winner for March 14th CEO AMERICA INC Tick : CEOA
  Priced : $0.07 Won't last long at this stage, This one is going to $1.00
  Grab yourself some tomorrow avoid the rush And experience a 10 bagger.
  [...]


Content analysis details:   (2.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
 0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.5547]
 0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.3 AWL                    AWL: From: address is in the auto white-list


RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> http://www.rocsca.it/INBOX

Could someone give me a hint on how to block email like the one above?

Thanks,

rocsca



Re: Another false negative

2007-03-14 Thread Anthony Peacock

Hi,

Rocco Scappatura wrote:

> > http://www.rocsca.it/INBOX
>
> Could someone give me a hint on how to block email like the one above?
>
> Thanks,
>
> rocsca



I get the following:

Content analysis details:   (5.7 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
 1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 1.]
 0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts



--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:http://www.chime.ucl.ac.uk/~rmhiajp/
"If you have an apple and I have  an apple and we  exchange apples
then you and I will still each have  one apple. But  if you have an
idea and I have an idea and we exchange these ideas, then each of us
will have two ideas." -- George Bernard Shaw


RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> I get the following:
> 
> Content analysis details:   (5.7 points, 5.0 required)
> 
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
>  1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
>  0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
>  0.0 HTML_MESSAGE           BODY: HTML included in message
>  3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
>                             [score: 1.]
>  0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts

Could you please tell me what I am missing?

TIA,

rocsca



RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> Could you please tell me what I am missing?
> 

Maybe I have to update my rulesets? What do I have to install other
than the default rulesets delivered with SA 3.1.8?

TIA,

rocsca


Re: Another false negative

2007-03-14 Thread Anthony Peacock

Hi,


Assuming this is your score line:

> X-Spam-Status: No, score=2.5 required=5.0
> tests=AWL,BAYES_50,HTML_30_40,
> HTML_MESSAGE,HTML_TEXT_AFTER_BODY,MIME_HTML_ONLY,SARE_PROLOSTOCK_SYM3
> autolearn=no version=3.1.8

Then the biggest difference is that my Bayesian scoring gives it a
BAYES_99 score and yours gives it a BAYES_50 score.
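
To see exactly which rules differ between two scans, the rule lists can be
compared as sets. The following is a minimal Python sketch, not part of
SpamAssassin: parse_tests is a hypothetical helper, the header text is the
X-Spam-Status value quoted above, and the second rule set is typed in from
the 5.7-point report earlier in the thread.

```python
import re


def parse_tests(status):
    """Return the rule names in the tests= field of an X-Spam-Status value."""
    match = re.search(r"tests=(.*?)\s+autolearn=", status, re.S)
    if match is None:
        return set()
    # Folded header lines break the list after a comma; drop all whitespace.
    names = re.sub(r"\s+", "", match.group(1)).split(",")
    # Some setups write NAME=score; keep only the rule name.
    return {name.split("=")[0] for name in names if name}


rocco = parse_tests(
    "No, score=2.5 required=5.0 tests=AWL,BAYES_50,HTML_30_40,\n"
    " HTML_MESSAGE,HTML_TEXT_AFTER_BODY,MIME_HTML_ONLY,SARE_PROLOSTOCK_SYM3\n"
    " autolearn=no version=3.1.8"
)
anthony = {
    "FORGED_RCVD_HELO", "SARE_PROLOSTOCK_SYM3", "HTML_30_40",
    "HTML_MESSAGE", "BAYES_99", "MIME_HTML_ONLY",
}

only_rocco = sorted(rocco - anthony)
only_anthony = sorted(anthony - rocco)
print("only in the 2.5 scan:", only_rocco)
print("only in the 5.7 scan:", only_anthony)
```

The Bayes bucket is the only difference that carries a large score, which
is why the Bayes database is the first thing to look at.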




RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> Assuming this is your score line:
> 
> > X-Spam-Status: No, score=2.5 required=5.0
> > tests=AWL,BAYES_50,HTML_30_40,
> > HTML_MESSAGE,HTML_TEXT_AFTER_BODY,MIME_HTML_ONLY,SARE_PROLOSTOCK_SYM3
> > autolearn=no version=3.1.8
>
> Then the biggest difference is that my Bayesian scoring gives it a
> BAYES_99 score and yours gives it a BAYES_50 score.

So you are saying that I have to train SA?

rocsca


Re: Another false negative

2007-03-14 Thread Anthony Peacock

Rocco Scappatura wrote:

> So you are saying that I have to train SA?


That would be how you would improve your Bayes accuracy, yes.



RE: Another false negative

2007-03-14 Thread Rocco Scappatura
> > So you are saying that I have to train SA?
> 
> That would be how you would improve your Bayes accuracy, yes.

I have trained SA on my server but I still get a score lower than 5.0:

Content analysis details:   (4.3 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
 0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
 2.0 BAYES_80               BODY: Bayesian spam probability is 80 to 95%
                            [score: 0.8738]
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.2 AWL                    AWL: From: address is in the auto white-list

while on another server (that I have trained with the same messages)
I get:

Content analysis details:   (5.7 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
 0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 0.9996]
 0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts

What could be the reason for the different scores?
Why doesn't the second system assign an AWL score?

rocsca


Re: Another false negative

2007-03-14 Thread Chris
On Wednesday 14 March 2007 5:49 am, Rocco Scappatura wrote:
> > If you can post the full email (headers and body), I'll run it over my
> > system which has lots and lots of third party add on rules from
> > www.rulesemporium.com and others and see if I can make SA
> > score it high
> > enough for Amavisd-new to block the email..
>
> Thanks.
>
> http://www.rocsca.it/INBOX
>
> I get the following score:
>

>
> Content analysis details:   (2.5 points, 5.0 required)
>
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
>  0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
>  0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
>  0.0 HTML_MESSAGE           BODY: HTML included in message
>  0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
>                             [score: 0.5547]
>  0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
>  0.3 AWL                    AWL: From: address is in the auto white-list

Your message scored like this here:

X-Spam-Status: Yes, score=7.4 required=5.0 tests=BAYES_80=4.1,
FORGED_RCVD_HELO=0.135,HTML_30_40=0.374,HTML_MESSAGE=0.001,
HTML_TEXT_AFTER_BODY=0.115,MIME_HTML_ONLY=0.001,SAGREY=1,
SARE_PROLOSTOCK_SYM3=1.66 autolearn=disabled version=3.1.8

Content analysis details:   (7.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
 1.7 SARE_PROLOSTOCK_SYM3   BODY: Last week's hot stock scam
 0.1 HTML_TEXT_AFTER_BODY   BODY: HTML contains text after BODY close tag
 4.1 BAYES_80               BODY: Bayesian spam probability is 80 to 95%
                            [score: 0.9413]
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 1.0 SAGREY                 Adds 1.0 to spam from first-time senders
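
The total is just the sum of the per-rule scores, so the header above can
be checked by hand. A small Python sketch of that arithmetic follows;
rule_scores is a hypothetical helper, and it assumes the tests= list is
written as NAME=score pairs as in this particular header.

```python
import re


def rule_scores(status):
    """Map rule name -> score for a tests= list written as NAME=score pairs."""
    match = re.search(r"tests=(.*?)\s+autolearn=", status, re.S)
    if match is None:
        return {}
    pairs = re.sub(r"\s+", "", match.group(1)).split(",")
    return {name: float(score) for name, score in (p.split("=") for p in pairs)}


status = (
    "Yes, score=7.4 required=5.0 tests=BAYES_80=4.1,\n"
    "FORGED_RCVD_HELO=0.135,HTML_30_40=0.374,HTML_MESSAGE=0.001,\n"
    "HTML_TEXT_AFTER_BODY=0.115,MIME_HTML_ONLY=0.001,SAGREY=1,\n"
    "SARE_PROLOSTOCK_SYM3=1.66 autolearn=disabled version=3.1.8"
)
required = 5.0

scores = rule_scores(status)
total = sum(scores.values())
without_bayes = total - scores["BAYES_80"]

print(f"total: {total:.1f}")                  # matches the 7.4 in the header
print(f"without Bayes: {without_bayes:.1f}")  # what remains if Bayes adds nothing
```

Without the Bayes contribution the same message would stay below the 5.0
threshold, which matches what the 2.5-point scan shows.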

-- 
Chris
KeyID 0xE372A7DA98E6705C




Re: Another false negative

2007-03-15 Thread Anthony Peacock

Rocco Scappatura wrote:

> What could be the reason for the different scores?
> Why doesn't the second system assign an AWL score?


They give different Bayes scores so the Bayes databases have been 
trained with different messages.  Do you have autolearn switched on?


You must also understand that Bayes is not a "one shot and you have it
fixed" kind of system.  Training on a single message will alter the
scoring, but you may also need to train it with a few similar messages
for it to change its scoring significantly.
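
Why one training pass only moves the score part of the way can be shown
with a toy calculation. The Python sketch below uses Gary Robinson's
smoothed per-token estimate, f(w) = (s*x + n*p(w)) / (s + n). It is an
illustration only, not SpamAssassin's actual Bayes code, and the corpus
sizes and the idea of a single new token are invented for the example.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham, s=1.0, x=0.5):
    """Robinson-style smoothed estimate that a token indicates spam.

    spam_hits / ham_hits: trained spam / ham messages containing the token.
    n_spam / n_ham: total trained spam / ham messages (both must be > 0).
    s: strength of the prior, x: assumed probability for an unseen token.
    """
    n = spam_hits + ham_hits
    if n == 0:
        return x  # never seen: fall back to the neutral prior
    spam_ratio = spam_hits / n_spam
    ham_ratio = ham_hits / n_ham
    p = spam_ratio / (spam_ratio + ham_ratio)
    return (s * x + n * p) / (s + n)


# Hypothetical corpus: 1000 spam and 1000 ham already trained, and a token
# from the stock spam that has never been seen before.
history = []
for trained in (0, 1, 5, 20):
    prob = token_spam_probability(trained, 0, 1000 + trained, 1000)
    history.append(prob)
    print(f"after {trained:2d} similar spams: {prob:.3f}")
```

In this toy model the estimate climbs from neutral towards 1 as more
similar spam is trained, instead of jumping there after one message.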




RE: Another false negative

2007-03-19 Thread Rocco Scappatura
> > What could be the reason for the different scores?
> > Why doesn't the second system assign an AWL score?
> 
> They give different Bayes scores so the Bayes databases have 
> been trained with different messages.  Do you have autolearn 
> switched on?

#   Bayesian classifier auto-learning (default: 1)
#
# bayes_auto_learn 1

Do I have to set it to 0?

But then how should I train SpamAssassin? What is the best way? Do I
need a spam folder to train SA?

> You must also understand that Bayes is not a "one shot and you have
> it fixed" kind of system.  Training on a single message will alter
> the scoring, but you may also need to train it with a few similar
> messages for it to change its scoring significantly.

You're right. Now I understand.

Thank you,

rocsca


Re: Another false negative

2007-03-19 Thread Anthony Peacock

Hi,

Rocco Scappatura wrote:

> > > What could be the reason for the different scores?
> > > Why doesn't the second system assign an AWL score?
> >
> > They give different Bayes scores so the Bayes databases have
> > been trained with different messages.  Do you have autolearn
> > switched on?
>
> #   Bayesian classifier auto-learning (default: 1)
> #
> # bayes_auto_learn 1
>
> Do I have to set it to 0?


No, but that may explain why the two servers have different Bayes scores 
for similar messages.  If they receive different message streams they 
will be learning a different view of the email world.



> But then how should I train SpamAssassin? What is the best way? Do I
> need a spam folder to train SA?


I don't think you need to turn off autolearn, but you may want to adjust
your thresholds. Mine are set to this:


bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0
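
A rough Python sketch of how these two thresholds partition scores,
assuming a simplified rule: below the nonspam threshold a message is
learned as ham, above the spam threshold it is learned as spam, and
anything in between is not learned. autolearn_decision is a hypothetical
name, and SpamAssassin applies further conditions before auto-learning
that this sketch ignores.

```python
def autolearn_decision(score, nonspam_threshold=-0.1, spam_threshold=12.0):
    """Simplified auto-learn decision for one message score."""
    if score < nonspam_threshold:
        return "ham"
    if score > spam_threshold:
        return "spam"
    return "none"


# Scores taken from this thread, plus two made-up extremes.
decisions = {score: autolearn_decision(score) for score in (-1.0, 2.5, 7.4, 15.0)}
for score, decision in decisions.items():
    print(f"{score:5.1f} -> {decision}")
```

With a spam threshold of 12.0, neither the 2.5 nor the 7.4 scan of the
stock spam would be auto-learned under this rule.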

I have autolearn switched on, but I also manually train with false 
negatives, and I occasionally train a bunch of recent ham as ham.
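
Manual training of this kind is normally done with the sa-learn tool that
ships with SpamAssassin. The Python sketch below only builds the command
lines and prints them; sa_learn_command is a hypothetical helper and both
paths are placeholders. The commands should be run as the user the
scanner runs as, otherwise a different Bayes database gets updated.

```python
import shlex


def sa_learn_command(kind, path, mbox=False):
    """Build an sa-learn argument list for training spam or ham."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    args = ["sa-learn", "--" + kind]
    if mbox:
        args.append("--mbox")  # the path is an mbox file, not a directory
    args.append(path)
    return args


# Placeholder locations: an mbox of missed spam and a directory of recent ham.
commands = [
    sa_learn_command("spam", "/path/to/missed-spam.mbox", mbox=True),
    sa_learn_command("ham", "/path/to/recent-ham/"),
]
for command in commands:
    print(shlex.join(command))
```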











RE: Another false negative

2007-03-19 Thread Rocco Scappatura
> > Do I have to set it to 0?
> 
> No, but that may explain why the two servers have different 
> Bayes scores for similar messages.  If they receive different 
> message streams they will be learning a different view of the 
> email world.

OK, thanks. All clear to me!

> > But then how should I train SpamAssassin? What is the best way?
> > Do I need a spam folder to train SA?
>
> I don't think you need to turn off autolearn, but you may want to
> adjust your thresholds. Mine are set to this:
> 
> bayes_auto_learn_threshold_nonspam -0.1
> bayes_auto_learn_threshold_spam 12.0
> 
> I have autolearn switched on, but I also manually train with 
> false negatives, and I occasionally train a bunch of recent 
> ham as ham.

OK. I will do that too!

rocsca