Re: Geocities closed

2009-10-27 Thread Mike Cardwell

Alex wrote:


Thought I would pass along that geocities closed up and went home today:

http://geocities.yahoo.com/

Wondering what this means in terms of the geocities SA rules? Would
sure be nice to just block them outright at the gateway, but in
From/To header and body, no?


Why have any geocities specific rules any more if geocities doesn't 
exist? It's not as if spammers can host their websites on geocities 
anymore so there's no reason why a spammer would include a geocities url 
in their spam. May as well just delete the rules...


--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Technical Blog: https://secure.grepular.com/blog/


Re: Geocities closed

2009-10-27 Thread LuKreme

On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
Why have any geocities specific rules any more if geocities doesn't  
exist? It's not as if spammers can host their websites on geocities  
anymore so there's no reason why a spammer would include a geocities  
url in their spam. May as well just delete the rules...



If the links are still appearing in SPAM then no, don't delete the  
rules, just bump up the scores.


--
I want a party where all the women wear new dresses and all the men
drink beer. -- Jason Gaes



sa-learn spam and Bayes_50

2009-10-27 Thread Sam

Hi,

I run spamassassin quite fine on a debian-lenny system.
But I'm having a problem with sa-learn --spam and 1 message :

http://www.pastebin.org/48668

lenny:/home/samuel# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     112507          0  non-token data: nspam
0.000          0        844          0  non-token data: nham
0.000          0    1934989          0  non-token data: ntokens
0.000          0 1047578051          0  non-token data: oldest atime
0.000          0 1256645591          0  non-token data: newest atime
0.000          0 1256642650          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count


sa-learn --forget 
/home/samuel/Maildir/.Spam_Ingescom/cur/1256632851.M372194P2272V1602I0006E328_0.lenny\,S\=9970\:2\,S

Forgot tokens from 1 message(s) (1 message(s) examined)

lenny:/home/samuel# sa-learn --spam 
/home/samuel/Maildir/.Spam_Ingescom/cur/1256632851.M372194P2272V1602I0006E328_0.lenny\,S\=9970\:2\,S

Learned tokens from 1 message(s) (1 message(s) examined)

But Bayes still shows BAYES_50:

spamassassin -D  
/home/samuel/Maildir/.Spam_Ingescom/cur/1256632851.M372194P2272V1602I0006E328_0.lenny\,S\=9970\:2\,S

[9414] dbg: logger: adding facilities: all
[9414] dbg: logger: logging level is DBG
[9414] dbg: generic: SpamAssassin version 3.2.5
[9414] dbg: config: score set 0 chosen.
[9414] dbg: util: running in taint mode? yes
[9414] dbg: util: taint mode: deleting unsafe environment variables, resetting PATH
[9414] dbg: util: PATH included '/usr/local/sbin', keeping
[9414] dbg: util: PATH included '/usr/local/bin', keeping
[9414] dbg: util: PATH included '/usr/sbin', keeping
[9414] dbg: util: PATH included '/usr/bin', keeping
[9414] dbg: util: PATH included '/sbin', keeping
[9414] dbg: util: PATH included '/bin', keeping
[9414] dbg: util: final PATH set to: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
[9414] dbg: dns: is Net::DNS::Resolver available? yes
[9414] dbg: dns: Net::DNS version: 0.63
[9414] dbg: config: using /etc/spamassassin for site rules pre files
[9414] dbg: config: read file /etc/spamassassin/init.pre
[9414] dbg: config: read file /etc/spamassassin/v310.pre
[9414] dbg: config: read file /etc/spamassassin/v312.pre
[9414] dbg: config: read file /etc/spamassassin/v320.pre
[9414] dbg: config: using /var/lib/spamassassin/3.002005 for sys rules pre files
[9414] dbg: config: using /var/lib/spamassassin/3.002005 for default rules dir
[9414] dbg: config: read file /var/lib/spamassassin/3.002005/updates_spamassassin_org.cf
[9414] dbg: config: using /etc/spamassassin for site rules dir
[9414] dbg: config: read file /etc/spamassassin/65_debian.cf
[9414] dbg: config: read file /etc/spamassassin/iXhash.cf
[9414] dbg: config: read file /etc/spamassassin/local.cf
[9414] dbg: config: read file /etc/spamassassin/sql.cf
[9414] dbg: config: using /root/.spamassassin for user state dir
[9414] dbg: config: using /root/.spamassassin/user_prefs for user prefs file
[9414] dbg: config: read file /root/.spamassassin/user_prefs
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::DCC from @INC
[9414] dbg: dcc: network tests on, registering DCC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::Pyzor from @INC
[9414] dbg: pyzor: network tests on, attempting Pyzor
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::Razor2 from @INC
[9414] dbg: razor2: razor2 is available, version 2.84
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::SpamCop from @INC
[9414] dbg: reporter: network tests on, attempting SpamCop
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::AWL from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::AutoLearnThreshold from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::WhiteListSubject from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::MIMEHeader from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::ReplaceTags from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::Check from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::HTTPSMismatch from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDetail from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::BodyEval from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::DNSEval from @INC
[9414] dbg: plugin: loading Mail::SpamAssassin::Plugin::HTMLEval from 

Re: Geocities closed

2009-10-27 Thread rich...@buzzhost.co.uk
On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:
 On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
  Why have any geocities specific rules any more if geocities doesn't  
  exist? It's not as if spammers can host their websites on geocities  
  anymore so there's no reason why a spammer would include a geocities  
  url in their spam. May as well just delete the rules...
 
 
 If the links are still appearing in SPAM then no, don't delete the  
 rules, just bump up the scores.
 

Would this not be almost entirely pointless? With spam the motto is
'follow the money'. If the link does not work, there is no path to the
money to follow. Other than prospecting for valid recipients {which
could be done just as easily without the link} there is no benefit for a
spammer to include a link of this nature.







Re: Geocities closed

2009-10-27 Thread DAve

rich...@buzzhost.co.uk wrote:

On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:

On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
Why have any geocities specific rules any more if geocities doesn't  
exist? It's not as if spammers can host their websites on geocities  
anymore so there's no reason why a spammer would include a geocities  
url in their spam. May as well just delete the rules...


If the links are still appearing in SPAM then no, don't delete the  
rules, just bump up the scores.




Would this not be almost entirely pointless? With spam the motto is
'follow the money'. if the link does not work, there is no path to the
money to follow. Other than prospecting for valid recipients {which
could be done just as easily without the link} there is no benefit for a
spammer to include a link of this nature.


I have been scoring any mail with a geocities URL at +5 for over a year 
now without a complaint. I believe I will leave the geocities rules in 
place until they no longer hit mail.


DAve


--
Posterity, you will know how much it cost the present generation to
preserve your freedom.  I hope you will make good use of it.  If you
do not, I shall repent in heaven that ever I took half the pains to
preserve it. John Quincy Adams

http://appleseedinfo.org



Re: Geocities closed

2009-10-27 Thread LuKreme

On 27-Oct-2009, at 06:42, rich...@buzzhost.co.uk wrote:

On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:

On 27-Oct-2009, at 04:53, Mike Cardwell wrote:

Why have any geocities specific rules any more if geocities doesn't
exist? It's not as if spammers can host their websites on geocities
anymore so there's no reason why a spammer would include a geocities
url in their spam. May as well just delete the rules...


If the links are still appearing in SPAM then no, don't delete the
rules, just bump up the scores.


Would this not be almost entirely pointless? With spam the motto is
'follow the money'. if the link does not work, there is no path to the
money to follow. Other than prospecting for valid recipients {which
could be done just as easily without the link} there is no benefit for a
spammer to include a link of this nature.


Sure. That has nothing to do with what I said.

If it still shows up in spam, score it higher.

It would be stupid to remove the rules if they are still hitting.

--
Personal isn't the same as important



Re: Geocities closed

2009-10-27 Thread John Rudd
On Tue, Oct 27, 2009 at 05:42, rich...@buzzhost.co.uk
rich...@buzzhost.co.uk wrote:
 On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:
 On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
  Why have any geocities specific rules any more if geocities doesn't
  exist? It's not as if spammers can host their websites on geocities
  anymore so there's no reason why a spammer would include a geocities
  url in their spam. May as well just delete the rules...


 If the links are still appearing in SPAM then no, don't delete the
 rules, just bump up the scores.


 Would this not be almost entirely pointless? With spam the motto is
 'follow the money'. if the link does not work, there is no path to the
 money to follow. Other than prospecting for valid recipients {which
 could be done just as easily without the link} there is no benefit for a
 spammer to include a link of this nature.

You're assuming that spammers will perfectly update all existing spam.
 There might be crud floating around out there for a while to come.
And, any messages that contain that crud are just as likely to be spam
as they were before.

My suggestion: proceed as normal.  Adjust the scores for geocities
spam as the analysis tools on current/live* spam suggest, until such
time as there are no more spam messages showing up that are hitting
the geocities rules ... for at least 1-3 months.  Once they stop
showing up in the wild for a substantial period of time (ie. my 1-3
months suggestion), THEN remove them from the rules.  Not before.

(*  not the corpus of past/historical/stale spam)
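
The "wait until the rules stop hitting for 1-3 months" policy above can be sketched as a small check. This is only an illustration: the function name, the `watch_start` argument and the 90-day default are hypothetical choices, not anything SpamAssassin provides, and the hit dates would have to come from your own live-mail logs.

```python
from datetime import date

def can_retire_rule(hit_dates, watch_start, today, quiet_days=90):
    """Return True once a rule has gone `quiet_days` without hitting live mail.

    hit_dates   -- dates on which the rule hit current (not corpus) spam
    watch_start -- date observation began; used when there are no hits at all
    quiet_days  -- hypothetical threshold, here the upper end of "1-3 months"
    """
    # The quiet period starts at the most recent hit, or at the start of
    # observation if the rule has never hit while we were watching.
    quiet_since = max(list(hit_dates) + [watch_start])
    return (today - quiet_since).days >= quiet_days

hits = [date(2009, 10, 20), date(2009, 10, 27)]
print(can_retire_rule(hits, date(2009, 10, 1), date(2009, 12, 1)))
print(can_retire_rule(hits, date(2009, 10, 1), date(2010, 2, 1)))
```

Any new hit resets the clock, which matches the "THEN remove them, not before" ordering in the suggestion.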


Re: Geocities closed

2009-10-27 Thread rich...@buzzhost.co.uk
On Tue, 2009-10-27 at 05:50 -0700, John Rudd wrote:
 On Tue, Oct 27, 2009 at 05:42, rich...@buzzhost.co.uk
 rich...@buzzhost.co.uk wrote:
  On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:
  On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
   Why have any geocities specific rules any more if geocities doesn't
   exist? It's not as if spammers can host their websites on geocities
   anymore so there's no reason why a spammer would include a geocities
   url in their spam. May as well just delete the rules...
 
 
  If the links are still appearing in SPAM then no, don't delete the
  rules, just bump up the scores.
 
 
  Would this not be almost entirely pointless? With spam the motto is
  'follow the money'. if the link does not work, there is no path to the
  money to follow. Other than prospecting for valid recipients {which
  could be done just as easily without the link} there is no benefit for a
  spammer to include a link of this nature.
 
 You're assuming that spammers will perfectly update all existing spam.
  There might be crud floating around out there for a while to come.

I'm not assuming anything John. Spam with no endgame is pointless spam.
All spam has a point and purpose - or it would not exist. Most spammers
staging or springboarding from such places turn their links around
mighty fast - they know they won't be up for long, so whilst I'm sure there
may be the odd 'floater' around, the enemy is formidable and ahead of
the game.

 My suggestion: proceed as normal.  Adjust the scores for geocities
 spam as the analysis tools on current/live* spam suggest, until such
 time as there are no more spam messages showing up that are hitting
 the geocities rules ... for at least 1-3 months.  Once they stop
 showing up in the wild for a substantial period of time (ie. my 1-3
 months suggestion), THEN remove them from the rules.  Not before.
 
 (*  not the corpus of past/historical/stale spam)
John I agree. I don't think there is any need to rush to do anything. It
would make sense to phase out the rule in a period of time. A few extra
lines of regex is not going to kill most machines - but long term there
will probably be little benefit keeping it in.



Re: sa-learn spam and Bayes_50

2009-10-27 Thread RW
On Tue, 27 Oct 2009 13:33:14 +0100
Sam liste-spamassas...@ingescom.com wrote:

 Hi,
 
 I run spamassassin quite fine on a debian-lenny system.
 But I'm having a problem with sa-learn --spam and 1 message :
 
...
 But Bayes still show BAYES_50 :

If you find it surprising that that can happen, you don't understand
how Bayes works. It's a learning system that's intended to classify mail
it hasn't seen based on mail it has seen. 


Re: Geocities closed

2009-10-27 Thread John Rudd
On Tue, Oct 27, 2009 at 06:06, rich...@buzzhost.co.uk
rich...@buzzhost.co.uk wrote:
 On Tue, 2009-10-27 at 05:50 -0700, John Rudd wrote:
 On Tue, Oct 27, 2009 at 05:42, rich...@buzzhost.co.uk
 rich...@buzzhost.co.uk wrote:
  On Tue, 2009-10-27 at 05:08 -0600, LuKreme wrote:
  On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
   Why have any geocities specific rules any more if geocities doesn't
   exist? It's not as if spammers can host their websites on geocities
   anymore so there's no reason why a spammer would include a geocities
   url in their spam. May as well just delete the rules...
 
 
  If the links are still appearing in SPAM then no, don't delete the
  rules, just bump up the scores.
 
 
  Would this not be almost entirely pointless? With spam the motto is
  'follow the money'. if the link does not work, there is no path to the
  money to follow. Other than prospecting for valid recipients {which
  could be done just as easily without the link} there is no benefit for a
  spammer to include a link of this nature.

 You're assuming that spammers will perfectly update all existing spam.
  There might be crud floating around out there for a while to come.

 I'm not assuming anything John. Spam with no endgame is pointless spam.
 All spam has a point and purpose - or it would not exist. Most spammers
 staging or springboarding from such places turn their links around
 mighty fast - they know they won't be up for long, so whilst I'm sure there
 may be the odd 'floater' around, the enemy is formidable and ahead of
 the game.

 My suggestion: proceed as normal.  Adjust the scores for geocities
 spam as the analysis tools on current/live* spam suggest, until such
 time as there are no more spam messages showing up that are hitting
 the geocities rules ... for at least 1-3 months.  Once they stop
 showing up in the wild for a substantial period of time (ie. my 1-3
 months suggestion), THEN remove them from the rules.  Not before.

 (*  not the corpus of past/historical/stale spam)
 John I agree. I don't think there is any need to rush to do anything. It
 would make sense to phase out the rule in a period of time. A few extra
 lines of regex is not going to kill most machines - but long term there
 will probably be little benefit keeping it in.

I agree -- long term, there should be little to no benefit to keeping it.

Just have to figure out what the dividing line between near term and
long term is :-)


Re: Geocities closed

2009-10-27 Thread Matus UHLAR - fantomas
 On Tue, 2009-10-27 at 05:50 -0700, John Rudd wrote:
  You're assuming that spammers will perfectly update all existing spam.
   There might be crud floating around out there for a while to come.

On 27.10.09 13:06, rich...@buzzhost.co.uk wrote:
 I'm not assuming anything John. Spam with no endgame is pointless spam.
 All spam has a point and purpose - or it would not exist. Most spammers
 staging or springboarding from such places turn their links around
 mighty fast - they know they won't be up for long, so whilst I'm sure there
 may be the odd 'floater' around, the enemy is formidable and ahead of
 the game.

Are we talking about whether the spam should exist, or about the fact
that it still exists?

The fact is, that if we get old spam, we should detect it, regardless if
spammers make money on it or not. 

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Windows 2000: 640 MB ought to be enough for anybody


Re: Geocities closed

2009-10-27 Thread Dan Schaefer

Matus UHLAR - fantomas wrote:

On Tue, 2009-10-27 at 05:50 -0700, John Rudd wrote:


You're assuming that spammers will perfectly update all existing spam.
 There might be crud floating around out there for a while to come.
  


On 27.10.09 13:06, rich...@buzzhost.co.uk wrote:
  

I'm not assuming anything John. Spam with no endgame is pointless spam.
All spam has a point and purpose - or it would not exist. Most spammers
staging or springboarding from such places turn their links around
mighty fast - they know they won't be up for long, so whilst I'm sure there
may be the odd 'floater' around, the enemy is formidable and ahead of
the game.



Are we talking that the spam should not exist or about the spam still
exists?

The fact is, that if we get old spam, we should detect it, regardless if
spammers make money on it or not. 

  
I was about to write something to that effect. Not all spam is created 
to make money. There is the annoyance factor as well. After the 
geocities rules are not enforced anymore (and I'm sure Spammers are 
monitoring this list and the SA rules), the spammers could start up
the geocities spam again just to annoy the users and admins, even though 
they will be broken links. SA is going to have to re-instate the rules 
at some point.


--
Dan Schaefer
Web Developer/Systems Analyst
Performance Administration Corp.



Re: Geocities closed

2009-10-27 Thread RW
On Tue, 27 Oct 2009 05:08:20 -0600
LuKreme krem...@kreme.com wrote:

 On 27-Oct-2009, at 04:53, Mike Cardwell wrote:
  Why have any geocities specific rules any more if geocities
  doesn't exist? It's not as if spammers can host their websites on
  geocities anymore so there's no reason why a spammer would include
  a geocities url in their spam. May as well just delete the rules...
 
 
 If the links are still appearing in SPAM then no, don't delete the  
 rules, just bump up the scores.
 

I wouldn't increase them, my guess is that spammers will drop the links
very quickly, but there may be geocities links in signatures that
persist for some time.


Re: Geocities closed

2009-10-27 Thread rich...@buzzhost.co.uk
I just found this one working:

http://uk.geocities.com/midsomerland/midsomerland_indexone.htm

so providence would suggest leaving things alone.



Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

RW wrote:

On Tue, 27 Oct 2009 13:33:14 +0100
Sam liste-spamassas...@ingescom.com wrote:

  

Hi,

I run spamassassin quite fine on a debian-lenny system.
But I'm having a problem with sa-learn --spam and 1 message :

...
But Bayes still show BAYES_50 :



If you find it surprising that that can happen, you don't understand
how Bayes works. It's a learning system that's intended to classify mail
it hasn't seen based on mail it has seen. 

  
I agree with you for non-seen mail. But after learning with sa-learn I 
thought bayes should increase over Bayes_50 for the same learned message.


Sam.


Re: sa-learn spam and Bayes_50

2009-10-27 Thread Matus UHLAR - fantomas
 On Tue, 27 Oct 2009 13:33:14 +0100
 Sam liste-spamassas...@ingescom.com wrote:
 I run spamassassin quite fine on a debian-lenny system.
 But I'm having a problem with sa-learn --spam and 1 message :

 ...
 But Bayes still show BAYES_50 :

 RW wrote:
 If you find it surprising that that can happen, you don't understand
 how Bayes works. It's a learning system that's intended to classify mail
 it hasn't seen based on mail it has seen. 

On 27.10.09 15:01, Sam wrote:
 I agree with you for non-seen mail. But after learning with sa-learn I  
 thought bayes should increase over Bayes_50 for the same learned message.

What was the score at the beginning?
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
They say when you play that M$ CD backward you can hear satanic messages.
That's nothing. If you play it forward it will install Windows.


Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

Matus UHLAR - fantomas wrote:

On Tue, 27 Oct 2009 13:33:14 +0100
Sam liste-spamassas...@ingescom.com wrote:
  

I run spamassassin quite fine on a debian-lenny system.
But I'm having a problem with sa-learn --spam and 1 message :

...
But Bayes still show BAYES_50 :



  

RW wrote:


If you find it surprising that that can happen, you don't understand
how Bayes works. It's a learning system that's intended to classify mail
it hasn't seen based on mail it has seen. 
  


On 27.10.09 15:01, Sam wrote:
  
I agree with you for non-seen mail. But after learning with sa-learn I  
thought bayes should increase over Bayes_50 for the same learned message.



What was the score at the beginning?
  

I got this last night:

Oct 27 00:28:24 lenny spamd[20399]: spamd: clean message (0.0/5.0) for 
samueldu...@ingescom.com:102 in 4.8 seconds, 9803 bytes.
Oct 27 00:28:24 lenny spamd[20399]: spamd: result: . 0 - 
BAYES_50,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MISSING_MID 
scantime=4.8,size=9803,user=samueldu...@ingescom.com,uid=102,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=39593,mid=(unknown),bayes=0.50,autolearn=no


And after learning with sa-learn, it is still saying bayes_50 whereas 
sa-learn told it has learned it.


Thanks.
Sam


Re: Geocities closed

2009-10-27 Thread Martin Gregorie
On Tue, 2009-10-27 at 13:59 +, rich...@buzzhost.co.uk wrote:
 I just found this one working:
 
 http://uk.geocities.com/midsomerland/midsomerland_indexone.htm
 
 so providence would suggest leaving things alone.

The domains still exist. www.geocities.com and uk.geocities.com both
redirect at DNS level to 98.137.46.72, which is the host named
intl1.geo.vip.sp2.yahoo.com

However, there's more going on: putting www.geocities.com into a web
browser brings up the Geocities 'demolished on the 26th' web page at
www.geocities.yahoo.com while doing the same with uk.geocities.com
brings up a Yahoo login page.

From this I'd guess that the Geocities domains may be around for quite a
while: the domain hasn't simply been abandoned with a holding page.


Martin




Re: sa-learn spam and Bayes_50

2009-10-27 Thread Adam Katz
Sam wrote:
 I run spamassassin quite fine on a debian-lenny system.
 But I'm having a problem with sa-learn --spam and 1 message :
 But Bayes still show BAYES_50 :

The Bayesian algorithm adds tokens from messages it is taught.  These
tokens are then added to the database's existing tokens and
probabilities are recalculated for each token.  Sometimes those new
tokens aren't terribly useful, having been trained in both ham and
spam.  It is always possible that a message you just trained still
lacks certainty, thus getting /rounded/ to BAYES_50.

RW wrote:
 If you find it surprising that that can happen, you don't
 understand how Bayes works. It's a learning system that's intended
 to classify mail it hasn't seen based on mail it has seen.

BAYES_50 may be the default for a new mail with no known tokens (a
pure 50.000%), but it can also be the result of conflicting tokens
already in the system (anything ranging from 45.000% to 54.999%).

If you were to tell SpamAssassin to report the actual bayes score
(e.g. add_header all Bayes _BAYES_ in your local.cf), you'd probably
find that that message wasn't a pure 50% (though I can't recall how
many significant digits it uses).
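
As a rough illustration of how a raw Bayes probability lands in one of the BAYES_xx rules, here is a minimal sketch. The bucket edges in the code are an assumption: they are not taken from this thread and should be checked against the rules actually installed on your system. The function name is hypothetical.

```python
# Assumed lower edges of each bucket (NOT taken from this thread; verify
# against the Bayes rule definitions shipped with your SpamAssassin version).
ASSUMED_BUCKETS = [
    (0.00, "BAYES_00"),
    (0.01, "BAYES_05"),
    (0.05, "BAYES_20"),
    (0.20, "BAYES_40"),
    (0.40, "BAYES_50"),
    (0.60, "BAYES_60"),
    (0.80, "BAYES_80"),
    (0.95, "BAYES_95"),
    (0.99, "BAYES_99"),
]

def bayes_rule_for(prob):
    """Map a probability in [0, 1] to the name of the bucket it falls in."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    name = ASSUMED_BUCKETS[0][1]
    for lower_edge, rule in ASSUMED_BUCKETS:
        if prob >= lower_edge:
            name = rule  # keep the last bucket whose lower edge we passed
    return name

for p in (0.50, 0.45, 0.9998):
    print(p, bayes_rule_for(p))
```

The point is only that BAYES_50 covers a band of probabilities rather than the single value 0.5, so a message can move within that band after training without changing which rule fires.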


Re: sa-learn spam and Bayes_50

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Sam wrote:

Oct 27 00:28:24 lenny spamd[20399]: spamd: clean message (0.0/5.0) for 
samueldu...@ingescom.com:102 in 4.8 seconds, 9803 bytes.
Oct 27 00:28:24 lenny spamd[20399]: spamd: result: . 0 - 
BAYES_50,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MISSING_MID 
scantime=4.8,size=9803,user=samueldu...@ingescom.com,uid=102,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=39593,mid=(unknown),bayes=0.50,autolearn=no


And after learning with sa-learn, it is still saying bayes_50 whereas 
sa-learn told it has learned it.


Okay, basic Bayes troubleshooting questions:

(1) Are you running sa-learn as the same user that SA itself is running 
as, so that you're training the Bayes database that SA is actually using 
to score messages?


(2) Please run sa-learn --dump magic and send us the results.
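
For anyone who wants to read the `sa-learn --dump magic` output programmatically, a minimal parsing sketch follows. It assumes the column layout seen in the dumps posted in this thread (the third numeric column holds the value); the sample text and helper name are illustrative only.

```python
import re

# Sample lines in the layout posted in this thread.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     112532          0  non-token data: nspam
0.000          0        844          0  non-token data: nham
0.000          0    1935545          0  non-token data: ntokens
"""

# probability, spam count, value, atime, then the "non-token data" label
MAGIC_LINE = re.compile(r"^\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$")

def parse_magic(text):
    """Return a dict such as {'nspam': 112532, 'nham': 844, ...}."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values

stats = parse_magic(SAMPLE)
ham_share = stats["nham"] / (stats["nspam"] + stats["nham"])
print(stats["nspam"], stats["nham"], round(ham_share, 4))
```

This only makes the spam/ham counts easy to compare; what the ratio means for a particular problem still has to be judged by hand.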

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...the Fates notice those who buy chainsaws...
  -- www.darwinawards.com
---
 4 days until Halloween


Re: sa-learn spam and Bayes_50

2009-10-27 Thread RW
On Tue, 27 Oct 2009 15:01:39 +0100
Sam liste-spamassas...@ingescom.com wrote:

 RW wrote:
  On Tue, 27 Oct 2009 13:33:14 +0100

  If you find it surprising that that can happen, you don't understand
  how Bayes works. It's a learning system that's intended to classify
  mail it hasn't seen based on mail it has seen. 
 

 I agree with you for non-seen mail. But after learning with sa-learn
 I thought bayes should increase over Bayes_50 for the same learned
 message.

Most mails contain a number of hapaxes, one-off tokens that are never
seen again. If you train on a mail and then retest, hapaxes and other
rare tokens often skew the result to produce a positive match; this is
why sometimes a retest will score BAYES_99, but an almost identical spam
will hit BAYES_50.

On some retests the hapaxes don't dominate and the
probability stays close to 0.5. Like many such filters BAYES clusters
strongly around 0, 0.5 and 1. If it allowed you to retrain to
exhaustion (which it doesn't) you would probably see  several BAYES_50
results followed by a step change to BAYES_99.


Check that you haven't set bayes_use_hapaxes 0. Otherwise if you are
seeing a lot of trained mails hit BAYES_50 on retesting (and I mean 10%
or so) you may have a mistrained database. If you only see a few, forget
about it.
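
To make the clustering around 0, 0.5 and 1 concrete, here is a toy sketch of chi-squared combining of per-token probabilities, in the style popularised by Robinson and SpamBayes. It is an assumption that this resembles what the Bayes plugin does internally; the token probabilities are invented for illustration and no token selection or hapax handling is modelled.

```python
import math

def chi2q(x2, v):
    """Upper-tail chi-squared probability for an even number v of degrees of freedom."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)."""
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spamminess - hamminess + 1.0) / 2.0

neutral = [0.5] * 20                           # tokens seen equally in ham and spam
print(combine(neutral))                        # uninformative tokens only
print(combine([0.99] * 5 + [0.01] * 5))        # strong but conflicting evidence
print(combine(neutral + [0.99] * 3))           # a few freshly learned spammy tokens
```

Uninformative or conflicting tokens leave the result at or near 0.5, while a handful of strongly spammy tokens pulls it upward, which is the behaviour described above for retests after training.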


Low Score - {Brazillian Host} Lottery Spam

2009-10-27 Thread rich...@buzzhost.co.uk
Anyone else seeing these today? Or seen them recently?

http://pastebin.com/m4e25954f

score=0.1

Subject was real neat: 
Subject: =?ISO-8859-1?B?WW91IFdvbiCjMQ==?=,750,000.00 GBP

You Won £1,750,000.00 GBP {surprised this did not bite}


End of the message is missing on the five of them that I've had (not a
paste error).
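
The Subject header quoted above mixes a base64 encoded-word with plain text, so the currency symbol and the leading digit never appear literally in the raw header. A minimal sketch of decoding it, handling only B-encoded words (the standard library's `email.header` module covers the general case); the helper name is hypothetical:

```python
import base64
import re

# Matches an encoded-word of the form =?charset?B?payload?= (B encoding only).
B_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")

def decode_b_words(raw):
    """Replace each base64 encoded-word in `raw` with its decoded text."""
    def replace(match):
        charset, payload = match.group(1), match.group(2)
        return base64.b64decode(payload).decode(charset)
    return B_WORD.sub(replace, raw)

raw_subject = "=?ISO-8859-1?B?WW91IFdvbiCjMQ==?=,750,000.00 GBP"
print(decode_b_words(raw_subject))
```

A body or header rule that looks for the pound sign next to digits has to run on the decoded text to see this amount at all.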





Re: Low Score - {Brazillian Host} Lottery Spam

2009-10-27 Thread Adam Katz
rich...@buzzhost.co.uk wrote:
 Anyone else seeing these today? Or seen them recently?
 
 http://pastebin.com/m4e25954f
 
 score=0.1
 
 Subject was real neat: 
 Subject: =?ISO-8859-1?B?WW91IFdvbiCjMQ==?=,750,000.00 GBP
 
 You Won £1,750,000.00 GBP {surprised this did not bite}
 
 
 End of the message is missing on the five of them that I've had
 (not a paste error).

Interesting.  I'm also surprised that doesn't hit one of the many
large-sum money checks.  Scored 5.2 for me (bayes_99 plus a few custom
rules of questionable utility).

Content analysis details:   (5.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.9 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 0.9998]
 0.6 KHOP_SC_TOP_CIDR8      Relay listed in SpamCop top 4 IP/8 CIDRs
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.8 FROM_NOT_REPLY         From: and Reply-To: have different domains
 0.0 KHOP_NO_FULL_NAME      Sender does not have both First and Last names
 0.0 KHOP_NEW_TO_ME         New sender in new thread

Note that FROM_NOT_REPLY and KHOP_NEW_TO_ME are non-published rules.
The former requires a plugin.  KHOP_NO_FULL_NAME (now in khop-lists)
is zeroed and KHOP_SC_TOP_CIDR8 (from khop-sc-neighbors) is arguably
unfair given its broad range (though it certainly did its work here).


Re: [sa] Re: Geocities closed

2009-10-27 Thread Charles Gregory

On Tue, 27 Oct 2009, rich...@buzzhost.co.uk wrote:

I just found this one working:
http://uk.geocities.com/midsomerland/midsomerland_indexone.htm
so providence would suggest leaving things alone.


Yes, if you go to the Yahoo FAQ on the close-down, you will find that
one option available prior to Oct. 25 was if someone upgraded to their 
'hosting' service, they could KEEP their old geocities link as a redirect 
to their new hosting site. So we may find after a few months that 
geocities links appear more frequently in ham than spam :)


- C


Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

John Hardin wrote:

On Tue, 27 Oct 2009, Sam wrote:

Oct 27 00:28:24 lenny spamd[20399]: spamd: clean message (0.0/5.0) 
for samueldu...@ingescom.com:102 in 4.8 seconds, 9803 bytes.
Oct 27 00:28:24 lenny spamd[20399]: spamd: result: . 0 - 
BAYES_50,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MISSING_MID 
scantime=4.8,size=9803,user=samueldu...@ingescom.com,uid=102,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=39593,mid=(unknown),bayes=0.50,autolearn=no 



And after learning with sa-learn, it is still saying bayes_50 whereas 
sa-learn told it has learned it.


Okay, basic Bayes troubleshooting questions:

(1) Are you running sa-learn as the same user that SA itself is 
running as, so that you're training the Bayes database that SA is 
actually using to score messages?


(2) Please run sa-learn --dump magic and send us the results.

1) There is only one database, in /var/bayes, shared by all users. I ran
some tests with su Debian-exim and got the same result.


2) lenny:/home/samuel# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     112532          0  non-token data: nspam
0.000          0        844          0  non-token data: nham
0.000          0    1935545          0  non-token data: ntokens
0.000          0 1047578051          0  non-token data: oldest atime
0.000          0 1256661628          0  non-token data: newest atime
0.000          0 1256648676          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count


Thanks.
Sam.



Re: Low Score - {Brazillian Host} Lottery Spam

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, rich...@buzzhost.co.uk wrote:


Anyone else seeing these today? Or seen them recently?

http://pastebin.com/m4e25954f


I get lots like them. I'm working on updating the Advance Fee rules, but 
they won't be released until 3.3.1


In my testbed with sandbox rules, that got:

 pts rule name                 description
---- ------------------------- ------------------------------------------------
 0.5 LOTTO_AGENT               BODY: Claims Agent
 1.0 FILL_THIS_FORM_LONG       BODY: Fill in a form with personal information
 1.0 LOTTO_YOU_WON             You won!
 0.0 LOTS_OF_MONEY             Huge... sums of money
 1.0 FILL_THIS_FORM            Fill in a form with personal information
 0.5 FILL_THIS_FORM_LOAN       Answer loan question(s)
 1.0 ADVANCE_FEE_2_NEW         Appears to be advance fee fraud (Nigerian 419)
 3.0 MONEY_FORM                Lots of money if you fill out a form
 1.0 ADVANCE_FEE_3_NEW         Appears to be advance fee fraud (Nigerian 419)
 1.5 MONEY_LOTTERY             Lots of money from a lottery
 0.2 MONEY_FRAUD               Lots of money and any of the fraud rules
 1.0 ADVANCE_FEE_2_NEW_MONEY   Advance Fee fraud and lots of money
 1.0 ADVANCE_FEE_3_NEW_FORM    Advance Fee fraud and a form
 1.0 ADVANCE_FEE_3_NEW_MONEY   Advance Fee fraud and lots of money
 1.0 ADVANCE_FEE_2_NEW_FORM    Advance Fee fraud and a form
 1.0 ADVANCE_FEE_2_NEW_FRM_MNY Advance Fee fraud and lots of money
 1.0 ADVANCE_FEE_3_NEW_FRM_MNY Advance Fee fraud and lots of money
 0.2 FORM_FRAUD                Fill a form and any of the fraud rules

Yes, there's some overlap; these _are_ testing rules, after all...

Contact me offlist if you want to install the sandbox rules for them, I'll 
give you instructions.


--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...the Fates notice those who buy chainsaws...
  -- www.darwinawards.com
---
 4 days until Halloween


Re: Low Score - {Brazillian Host} Lottery Spam

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Adam Katz wrote:


rich...@buzzhost.co.uk wrote:


You Won £750,000.00 GBP {surprised this did not bite}


Interesting.  I'm also surprised that doesn't hit one of the many
large-sum money checks.


The existing ones are weak w/r/t non-USD currencies. That's one reason I 
started on the lotsa_money stuff.



Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

RW wrote:

On Tue, 27 Oct 2009 15:01:39 +0100
Sam liste-spamassas...@ingescom.com wrote:

RW wrote:

On Tue, 27 Oct 2009 13:33:14 +0100

If you find it surprising that that can happen, you don't understand
how Bayes works. It's a learning system that's intended to classify
mail it hasn't seen based on mail it has seen. 

  
  

I agree with you for unseen mail. But after learning with sa-learn,
I thought the Bayes result should rise above BAYES_50 for the same
learned message.



Most mails contain a number of hapaxes, one-off tokens that are never
seen again. If you train on a mail and then retest, hapaxes and other
rare tokens often skew the result to produce a positive match; this is
why sometimes a retest will score BAYES_99, but an almost identical spam
will hit BAYES_50.

On some retests the hapaxes don't dominate and the probability stays
close to 0.5. Like many such filters, BAYES clusters strongly around
0, 0.5 and 1. If it allowed you to retrain to exhaustion (which it
doesn't) you would probably see several BAYES_50 results followed by
a step change to BAYES_99.


Check that you haven't set bayes_use_hapaxes 0. Otherwise if you are
seeing a lot of trained mails hit BAYES_50 on retesting (and I mean 10%
or so) you may have a mistrained database. If you only see a few, forget
about it.

  

No hapax option is set.
When a spam isn't marked by SpamAssassin and Bayes isn't BAYES_99, I 
always train it manually with sa-learn.
I think I have always seen sa-learn move a message from BAYES_X to 
BAYES_99 when learning and retesting.


I don't remember ever running into this situation before.

Thanks.


Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

Adam Katz wrote:

Sam wrote:

I run spamassassin quite fine on a debian-lenny system.
But I'm having a problem with sa-learn --spam and 1 message:
But Bayes still shows BAYES_50:
  


The Bayesian algorithm adds tokens from messages it is taught.  These
tokens are then added to the database's existing tokens and
probabilities are recalculated for each token.  Sometimes those new
tokens aren't terribly useful, having been trained in both ham and
spam.  It is always possible that a message you just trained still
lacks certainty, thus getting /rounded/ to BAYES_50.
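To make that concrete, here is a rough Python sketch of why one extra training pass barely moves a token that the database already knows well. It uses a deliberately simplified frequency ratio, not SpamAssassin's real combining code, and the per-token hit counts (1000 spam, 8 ham) are made-up numbers; only nspam and nham are taken from the dump posted in this thread.

```python
# Simplified per-token spam probability (a plain ratio of frequencies).
# This is NOT SpamAssassin's actual implementation; it only illustrates why
# one more training pass hardly changes a token that is already well known.

def token_spam_probability(spam_hits, ham_hits, nspam, nham):
    """Estimate how spammy a token is from how often it was learned."""
    spam_freq = spam_hits / nspam   # fraction of learned spam containing the token
    ham_freq = ham_hits / nham      # fraction of learned ham containing the token
    return spam_freq / (spam_freq + ham_freq)

# nspam/nham come from the dump in this thread; the hit counts are hypothetical.
NSPAM, NHAM = 112532, 844

before = token_spam_probability(1000, 8, NSPAM, NHAM)
# Learning the message as spam adds one spam hit and one learned spam message.
after = token_spam_probability(1001, 8, NSPAM + 1, NHAM)

print(f"before={before:.4f} after={after:.4f}")
```

With counts like these the token sits near the middle both before and after training, so a message made mostly of such tokens can stay in BAYES_50.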

RW wrote:
  

If you find it surprising that that can happen, you don't
understand how Bayes works. It's a learning system that's intended
to classify mail it hasn't seen based on mail it has seen.



BAYES_50 may be the default for a new mail with no known tokens (a
pure 50.000%), but it can also be the result of conflicting tokens
already in the system (anything ranging from 45.000% to 54.999%).

If you were to tell SpamAssassin to report the actual bayes score
(e.g. add_header all Bayes _BAYES_ in your local.cf), you'd probably
find that that message wasn't a pure 50% (though I can't recall how
many significant digits it uses).
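As a sketch of how a raw probability ends up as a BAYES_nn rule name, the Python below maps a value to a bucket. The boundaries are the ones I recall from the stock 3.2.x 23_bayes.cf and should be treated as an assumption; check your own rules file. Note that this table gives BAYES_50 a wider band (0.40 to 0.60) than the 45% to 55% mentioned above, and the handling of exact boundary values here is also my assumption.

```python
# Bucket boundaries as recalled from the stock 3.2.x rules (23_bayes.cf).
# Treat them as an assumption and verify against your installed rules.
BAYES_BUCKETS = [
    (0.00, 0.01, "BAYES_00"),
    (0.01, 0.05, "BAYES_05"),
    (0.05, 0.20, "BAYES_20"),
    (0.20, 0.40, "BAYES_40"),
    (0.40, 0.60, "BAYES_50"),
    (0.60, 0.80, "BAYES_60"),
    (0.80, 0.95, "BAYES_80"),
    (0.95, 0.99, "BAYES_95"),
    (0.99, 1.00, "BAYES_99"),
]

def bayes_rule_for(probability, buckets=BAYES_BUCKETS):
    """Return the rule name whose [low, high) range contains the probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for low, high, name in buckets:
        if low <= probability < high:
            return name
    # Only reached for exactly 1.0, which belongs to the top bucket.
    return buckets[-1][2]

for p in (0.005, 0.45, 0.50, 0.58, 0.999):
    print(p, bayes_rule_for(p))
```

The point is only that anything in the middle band reports as BAYES_50, so the rule name alone does not tell you whether training moved the underlying number.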

  

OK, I will look at the Bayes log in detail.

Thanks.
Sam.


Re: sa-learn spam and Bayes_50

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Sam wrote:


John Hardin wrote:

 On Tue, 27 Oct 2009, Sam wrote:

  And after learning with sa-learn, it still says BAYES_50, even though
  sa-learn reported that it had learned the message.


 Okay, basic Bayes troubleshooting questions:

 (1) Are you running sa-learn as the same user that SA itself is
 running as, so that you're training the Bayes database that SA is
 actually using to score messages?

 (2) Please run sa-learn --dump magic and send us the results.


1) For all users there is only one database, in /var/bayes. I've done
   some tests with su Debian-exim and the result is the same.

2) lenny:/home/samuel# sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  112532  0  non-token data: nspam
0.000  0  844  0  non-token data: nham
0.000  0  1935545  0  non-token data: ntokens


Okay, good. About the only comment I can make based on this is, you might 
want to learn a bunch of ham. You want the database to kinda reflect your 
actual raw spam/ham ratio, but yours is a little strongly skewed towards 
spammy tokens...
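For anyone who wants to watch that ratio over time, here is a small Python sketch that pulls nspam and nham out of `sa-learn --dump magic` output and prints the spam:ham ratio. It assumes the columns are whitespace-separated with the count in the third column, as in the dump quoted in this thread; the sample text is pasted in as a string rather than read from a live sa-learn run.

```python
import re

# Sample taken from the dump quoted in this thread (columns re-spaced).
DUMP = """\
0.000  0  3  0  non-token data: bayes db version
0.000  0  112532  0  non-token data: nspam
0.000  0  844  0  non-token data: nham
0.000  0  1935545  0  non-token data: ntokens
"""

def parse_magic(dump_text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    line_re = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s*(.+)$")
    for line in dump_text.splitlines():
        match = line_re.match(line)
        if match:
            values[match.group(2).strip()] = int(match.group(1))
    return values

magic = parse_magic(DUMP)
ratio = magic["nspam"] / magic["nham"]
print(f"spam:ham is roughly {ratio:.0f}:1")
```

Whether a given ratio is a problem depends on your real mail mix, so compare the number against what actually arrives rather than against a fixed target.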



SpamAssassin Rejecting messages

2009-10-27 Thread Rick Knight
I've noticed that I never get any messages with a spam score higher than 
14.9. I have suspected that SpamAssassin is dumping messages with a 
score of 15 or higher. While perusing my logs today I found this entry...


bash-3.1# cat /var/log/maillog | grep n9RIlpTE013636
Oct 27 11:48:07 mail milter-greylist: n9RIlpTE013636: addr 
64.187.117.248 from salforum.i...@salforum.info rcpt 
r...@rlknight.com: autowhitelisted for 72:00:00
Oct 27 11:48:07 mail sm-mta[13636]: n9RIlpTE013636: 
from=salforum.i...@salforum.info, size=1696, class=0, nrcpts=1, 
msgid=yovgscjym71tr_wt1ea...@salforum.info, bodytype=8BITMIME, 
proto=ESMTP, daemon=MTA, relay=mail.salforum.info [64.187.117.248]
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter add: header: 
X-Spam-Flag: YES
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter add: header: 
X-Spam-Status: Yes, score=25.8 required=3.6 
tests=BAYES_99,DCC_CHECK,\n\tDIGEST_MULTIPLE,HTML_IMAGE_ONLY_12,HTML_IMAGE_RATIO_04,HTML_MESSAGE,\n\tHTML_SHORT_LINK_IMG_2,MIME_HTML_ONLY,PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,\n\tRAZOR2_CF_RANGE_E4_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,SARE_UNSUB38,\n\tSPF_PASS,URIBL_BLACK,URIBL_RHS_DOB 
autolearn=spam version=3.2.5
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter: data, 
reject=550 5.7.1 Blocked by SpamAssassin
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: 
to=r...@rlknight.com, delay=00:00:24, pri=31696, stat=Blocked by 
SpamAssassin


Clearly the message scored higher than 14.9 (25.8) and SpamAssassin blocked the 
delivery. This is probably due to something I put in my settings when I 
first started using SpamAssassin years ago, but I can't find it. I've 
checked the global local.cf and user local.cf files and they don't seem 
to be causing it. I've tried to find another config file but there 
aren't any. Where else can this be coming from?


I'm running Sendmail 8.14.2, SpamAssassin 3.2.5 and SpamAss-Milter 0.3.1 
on Slackware 12.0.


Thanks,
Rick



Auth questions

2009-10-27 Thread Alex
Hi,

I'm trying to figure out if this is spam:

http://pastebin.com/m64a38b1

I've had to obscure it to get around pastebin's spam filter by
changing the '@' to '%#' in this message. The exxample.com is also my
change.

Is the habeas stuff right? How about mkt058.com? Is that a valid
server for shutterfly or is it indeed blacklisted, as JMF_BL suggests?

Bayes is also marking it as ham, so I'm especially concerned.

Thanks,
Alex


Re: SpamAssassin Rejecting messages

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Rick Knight wrote:

I've noticed that I never get any messages with a spam score higher than 
14.9. I have suspected that SpamAssassin is dumping messages with a 
score of 15 or higher. While perusing my logs today I found this 
entry...


Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter: data, 
reject=550 5.7.1 Blocked by SpamAssassin


Clearly the message scored higher than 14.9 (25.8) and SpamAssassin blocked the 
delivery.


SpamAssassin only assigns scores. Something else, in this case...

I'm running Sendmail 8.14.2, SpamAssassin 3.2.5 and SpamAss-Milter 0.3.1 
on Slackware 12.0.


...SpamAss-Milter, makes acceptance/delivery decisions based on that 
score.


Check your spamass-milter config, probably in /etc/mail/



Re: SpamAssassin Rejecting messages

2009-10-27 Thread Rick Knight

Rick Knight wrote:
I've noticed that I never get any messages with a spam score higher 
than 14.9. I have suspected that SpamAssassin is dumping messages with 
a score of 15 or higher. While perusing my logs today I found this 
entry...


bash-3.1# cat /var/log/maillog | grep n9RIlpTE013636
Oct 27 11:48:07 mail milter-greylist: n9RIlpTE013636: addr 
64.187.117.248 from salforum.i...@salforum.info rcpt 
r...@rlknight.com: autowhitelisted for 72:00:00
Oct 27 11:48:07 mail sm-mta[13636]: n9RIlpTE013636: 
from=salforum.i...@salforum.info, size=1696, class=0, nrcpts=1, 
msgid=yovgscjym71tr_wt1ea...@salforum.info, bodytype=8BITMIME, 
proto=ESMTP, daemon=MTA, relay=mail.salforum.info [64.187.117.248]
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter add: 
header: X-Spam-Flag: YES
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter add: 
header: X-Spam-Status: Yes, score=25.8 required=3.6 
tests=BAYES_99,DCC_CHECK,\n\tDIGEST_MULTIPLE,HTML_IMAGE_ONLY_12,HTML_IMAGE_RATIO_04,HTML_MESSAGE,\n\tHTML_SHORT_LINK_IMG_2,MIME_HTML_ONLY,PYZOR_CHECK,RAZOR2_CF_RANGE_51_100,\n\tRAZOR2_CF_RANGE_E4_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,SARE_UNSUB38,\n\tSPF_PASS,URIBL_BLACK,URIBL_RHS_DOB 
autolearn=spam version=3.2.5
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter: data, 
reject=550 5.7.1 Blocked by SpamAssassin
Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: 
to=r...@rlknight.com, delay=00:00:24, pri=31696, stat=Blocked by 
SpamAssassin


Clearly the message scored higher than 14.9 (25.8) and SpamAssassin blocked 
the delivery. This is probably due to something I put in my settings 
when I first started using SpamAssassin years ago, but I can't find 
it. I've checked the global local.cf and user local.cf files and they 
don't seem to be causing it. I've tried to find another config file 
but there aren't any. Where else can this be coming from?


I'm running Sendmail 8.14.2, SpamAssassin 3.2.5 and SpamAss-Milter 
0.3.1 on Slackware 12.0.


Thanks,
Rick

I think I found my problem. SpamAss-Milter has a command-line option to 
reject messages that score higher than the setting. I had that option set to 
15. I've upped it to 30 and restarted. I'll see what happens.


Thanks,
Rick


Re: SpamAssassin Rejecting messages

2009-10-27 Thread Rick Knight

John Hardin wrote:

On Tue, 27 Oct 2009, Rick Knight wrote:

I've noticed that I never get any messages with a spam score higher 
than 14.9. I have suspected that SpamAssassin is dumping messages 
with a score of 15 or higher. While perusing my logs today I found 
this entry...


Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter: data, 
reject=550 5.7.1 Blocked by SpamAssassin


Clearly the message scored higher than 14.9 (25.8) and SpamAssassin blocked 
the delivery.


SpamAssassin only assigns scores. Something else, in this case...

I'm running Sendmail 8.14.2, SpamAssassin 3.2.5 and SpamAss-Milter 
0.3.1 on Slackware 12.0.


...SpamAss-Milter, makes acceptance/delivery decisions based on that 
score.


Check your spamass-milter config, probably in /etc/mail/


John,

Thanks for your reply. The problem appears to be a command-line option 
in my spamass-milter startup script (-r 15). I've upped it to 30 and 
will see how that works.


Thanks,
Rick
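For reference, the interaction between the score in the log and the milter's reject threshold can be modelled in a few lines of Python. This is only a sketch: it assumes the threshold rejects when the score is greater than or equal to the configured value, and the helper names are made up for illustration. The log fragment is copied from the maillog excerpt in this thread.

```python
import re

def spam_score(log_line):
    """Extract the score from an X-Spam-Status fragment, or None if absent."""
    match = re.search(r"X-Spam-Status:.*?\bscore=(-?\d+(?:\.\d+)?)", log_line)
    return float(match.group(1)) if match else None

def milter_would_reject(score, reject_threshold):
    """Model of a reject threshold (assumed: reject when score >= threshold)."""
    return score is not None and score >= reject_threshold

line = ("Oct 27 11:48:31 mail sm-mta[13636]: n9RIlpTE013636: Milter add: header: "
        "X-Spam-Status: Yes, score=25.8 required=3.6")

score = spam_score(line)
print(score, milter_would_reject(score, 15), milter_would_reject(score, 30))
```

With a threshold of 15 the 25.8 message is refused at SMTP time, which matches the "Blocked by SpamAssassin" log entry; at 30 the same message would be accepted and tagged instead.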


Re: Auth questions

2009-10-27 Thread Adam Katz
Alex (aka MySQL Student) wrote:
 I'm trying to figure out if this is spam:
 http://pastebin.com/m64a38b1
 
 I've had to obscure it to get around pastebin's spam filter by 
 changing the '@' to '%#' in this message. The exxample.com is also
 my change.
 
 Is the habeas stuff right? How about mkt058.com? Is that a valid 
 server for shutterfly or is it indeed blacklisted, as JMF_BL
 suggests?
 
 Bayes is also marking it as ham, so I'm especially concerned.

The DKIM passes from shutterfly.messages2.com rather than
shutterfly.com .. Messages2.com appears to be an email marketing
service that is legitimately used by Shutterfly.com as noted at
http://www.siteadvisor.com/sites/shutterfly.com/summary/ (see "what
our inbox looked like after we signed up here", which contains two
messages from @service.shutterfly.com and one message from
@shutterfly.messages2.com).

mail2196c.mkt058.com is listed as permitted by the authoritative SPF
record for shutterfly.messages2.com, so they are affiliated.

Messages2 and/or mkt058.com have been thorough in working to ensure
their mail gets delivered cleanly, using SPF, DKIM, and Habeas (which
are all sender verification tools, the last of which is a sort of "we
promise this isn't spam" tool).  The message also has a
List-Unsubscribe header while lacking a Precedence header (hmm...).


However, all that does is assure you that the message came from
Shutterfly and/or its affiliates (and/or THEIR affiliates).  As to
whether it was solicited ... I can't answer that.  Obviously somebody
reported it to JMF HostKarma as a spamming relay (which differs from
the Hostkarma ruling on similar company Constant Contact ... though I
don't know specific differences between the two).

I'd lean toward saying it's safe to unsubscribe, and if you want to
complain, I'd aim that primarily at Shutterfly's abuse team.


Re: Auth questions

2009-10-27 Thread Ted Mittelstaedt

Alex wrote:

Hi,

I'm trying to figure out if this is spam:

http://pastebin.com/m64a38b1



I don't have an opinion on the sender in question but we
have seen an increasing number of mails of this type - call
it pseudo-spam if you will.

What it is, is legitimate companies who are using the
-flimsiest- of excuses, for example the person put their
e-mail address down on a reader response card, or an
application for a credit card, or some such - and then
spamming them.

For example we have 2 large grocery store chains that do
this here - they both have preferred customer programs
where you can sign up to be a preferred customer then you
get a discount on certain things.  The signup asks for
your e-mail address, (It doesn't require it, just asks,
and people are so used to filling out things they do it
anyway) and it doesn't specifically say they
are going to use the address to spam you, but it doesn't
NOT say they are going to spam you.

Then they wait 6 months or so until the person has forgotten
what they put their e-mail address down on, and they
start spamming.

Of course, all unsubscribe links work, unsubscribing
actually does unsubscribe the user, all the domains are
legitimate, the great deals and coupons that they e-mail
out are all legitimate.  For all intents and purposes
the messages LOOK like legitimate mailing list messages.

However, in my opinion what really gives it away AS spam
is PRECISELY because they are using SPF, DKIM, and Habeas.
In other words, if they really weren't spammers and just
a mailing list, they wouldn't bother bending over backwards
to make it appear like they are not spammers.

The lady doth protest too much, methinks

is the logic going on here.

I realize this may seem like an impossible barrier to
companies that want to run legitimate mailing lists, my
answer to that is opt-in mailing lists.

Ted


Re: Auth questions

2009-10-27 Thread Alex
Hi,

Thanks so much for everyone's help on this. I appreciate your spending
the time to school me.

Best,
Alex

On Tue, Oct 27, 2009 at 4:26 PM, Ted Mittelstaedt t...@ipinc.net wrote:
 Alex wrote:

 Hi,

 I'm trying to figure out if this is spam:

 http://pastebin.com/m64a38b1


 I don't have an opinion on the sender in question but we
 have seen an increasing number of mails of this type - call
 it pseudo-spam if you will.

 What it is, is legitimate companies who are using the
 -flimsiest- of excuses, for example the person put their
 e-mail address down on a reader response card, or an
 application for a credit card, or some such - and then
 spamming them.



Error from sa-update script

2009-10-27 Thread fugtruck

I have an SMTP gateway set up with Postfix+SpamAssassin on a CentOS 5.3 server
that is functioning.  Lately, I've noticed an increase in the amount of SPAM
getting through the gateway.  That's when I discovered an error that is
showing up daily in the cron logs (see below) from the sa-update script.  So
I suspect that my Spamassassin is not updating properly.  FYI, I am running
Spamassassin version 3.2.5.  Can anyone please advise?




/etc/cron.daily/sa-update:

[8809] info: generic: base extraction starting. this can take a while...
[8809] info: generic: extracting from rules of type body_0
.
100% Completed 1175.17 rules/sec in 00m01s
...
100% Completed 317.13 bases/sec in 00m06s
[8809] info: body_0: 1587 base strings extracted in 7 seconds
Possible unintended interpolation of @un in string at /etc/mail/spamassassin/local.cf, rule LOCAL_NT3SPAM, line 1.
rules: failed to compile head tests, skipping:
(Global symbol @un requires explicit package name at /etc/mail/spamassassin/local.cf, rule LOCAL_NT3SPAM, line 1.
)
[8809] info: rules: meta test FM_PHN_NODNS has dependency 'RDNS_NONE' with a zero score
sa-compile: not compiling; 'spamassassin --lint' check failed!
Stopping spamd: [  OK  ]
Starting spamd: [  OK  ]

-- 
View this message in context: 
http://www.nabble.com/Error-from-sa-update-script-tp26085317p26085317.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: sa-learn spam and Bayes_50

2009-10-27 Thread Benny Pedersen

On Tue 27 Oct 2009 18:44:28 CET, John Hardin wrote:


0.000  0  112532  0  non-token data: nspam
0.000  0  844  0  non-token data: nham


Try to get these two numbers closer to each other in your training.

reflect your actual raw spam/ham ratio, but yours is a little  
strongly skewed towards spammy tokens..


A skew like this is a sign that outgoing mail is not being scanned.

--
xpoint



Re: Error from sa-update script

2009-10-27 Thread Jari Fredriksson


27.10.2009 23:14, fugtruck wrote:
 
 I have an SMTP gateway setup with Postfix+SpamAssassin on a CentOS 5.3 server
 that is functioning.  Lately, I've noticed an increase in the amount of SPAM
 getting through the gateway.  That's when I discovered an error that is
 showing up daily in the cron logs (see below) from the sa-update script.  So
 I suspect that my Spamassassin is not updating properly.  FYI, I am running
 Spamassassin version 3.2.5.  Can anyone please advise?
 
 
 
 
 /etc/cron.daily/sa-update:
 
 [8809] info: generic: base extraction starting. this can take a while...
 [8809] info: generic: extracting from rules of type body_0
 .
 100% Completed 1175.17 rules/sec in 00m01s
 ...
 100% Completed 317.13 bases/sec in 00m06s
 [8809] info: body_0: 1587 base strings extracted in 7 seconds
 Possible unintended interpolation of @un in string at
 /etc/mail/spamassassin/local.cf,
 rule LOCAL_NT3SPAM, line 1.
 rules: failed to compile head tests, skipping:
   (Global symbol @un requires explicit package name at
 /etc/mail/spamassassin/local.cf,
 rule LOCAL_NT3SPAM, line 1.
 )
 [8809] info: rules: meta test FM_PHN_NODNS has dependency 'RDNS_NONE' with a
 zero
 score
 sa-compile: not compiling; 'spamassassin --lint' check failed!
 Stopping spamd: [  OK  ]
 Starting spamd: [  OK  ]
 

Did you read the error message yourself, or did you just paste it here?

It clearly says that a local rule (yours), LOCAL_NT3SPAM in
/etc/mail/spamassassin/local.cf, is broken.

Show the rule, so the regex masters can spot the error.


-- 
http://www.iki.fi/jarif/






Re: Error from sa-update script

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Jari Fredriksson wrote:




27.10.2009 23:14, fugtruck wrote:


/etc/mail/spamassassin/local.cf,
rule LOCAL_NT3SPAM, line 1.
rules: failed to compile head tests, skipping:
(Global symbol @un requires explicit package name at
/etc/mail/spamassassin/local.cf,


Did you read the error message yourself, or did you just paste it here?

It clearly says that a local rule (yours), LOCAL_NT3SPAM in
/etc/mail/spamassassin/local.cf, is broken.

Show the rule, so the regex masters can spot the error.


The error message indicates the problem. If you use @ in a RE, you need 
to escape it (e.g. \@).


Try changing that and see if SA passes a --lint check.

_ALWAYS_ do a --lint check after editing your rules!
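As an illustration of the fix, the Python sketch below backslash-escapes any bare @ in a rule's regex. The rule text is hypothetical, since the real LOCAL_NT3SPAM rule was never posted, and the helper is only a convenience for spotting the problem; the lint check is still the authority on whether the rule compiles.

```python
import re

def escape_at_signs(pattern):
    """Backslash-escape every '@' that is not already preceded by a backslash."""
    # Simplification: an '@' right after a backslash is treated as already
    # escaped, even if that backslash is itself an escaped literal backslash.
    return re.sub(r"(?<!\\)@", r"\\@", pattern)

# Hypothetical rule body; the actual rule from local.cf was not shown.
broken = r"/spammer@unwanted\.example/i"
fixed = escape_at_signs(broken)

print(broken)
print(fixed)
```

Running the helper twice leaves an already-escaped pattern unchanged, so it is safe to apply to a rule that has been partly fixed by hand.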



Re: sa-learn spam and Bayes_50

2009-10-27 Thread Adam Katz
Benny Pedersen wrote:
 On Tue 27 Oct 2009 18:44:28 CET, John Hardin wrote:
 
 0.000  0  112532  0  non-token data: nspam
 0.000  0  844  0  non-token data: nham
 
 Try to get these two numbers closer to each other in your training.
 
 reflect your actual raw spam/ham ratio, but yours is a little strongly
 skewed towards spammy tokens..
 
 A skew like this is a sign that outgoing mail is not being scanned.

I disagree.  I see no reason to scan outbound mail, and this
particular aspect of it is more harmful than helpful.

Bayes examines both bodies and headers of messages; if you scan your
outbound mail to even out your ham:spam ratio, you are watering down
your bayes db.  Basically, it will learn from the headers that all
outbound mail is ham and all inbound mail is spam.

Instead, be more selective about the spam you train.  Only train
messages that completely missed with respect to Bayes (e.g. a spam
that got BAYES_00 or a ham that got BAYES_80) rather than corner cases
(e.g. a spam with BAYES_50 that got marked as spam, a spam that got
marked as BAYES_80).  Try to train as much inbound ham as possible
(but again, not internal messages that never hit the live internet).
If you use autolearn, bump the bayes_auto_learn_threshold_spam up some.
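The selection rule described above can be sketched as a small Python predicate. The margin of 0.2 around the midpoint is a made-up illustrative value, not anything SpamAssassin defines; the idea is only that a message is worth training when Bayes landed clearly on the wrong side.

```python
def should_train(is_spam, bayes_probability, miss_margin=0.2):
    """Train only when Bayes landed clearly on the wrong side of 0.5.

    miss_margin is a hypothetical tuning knob: how far past the midpoint
    the Bayes result must be before the message counts as a real miss.
    """
    if is_spam:
        return bayes_probability <= 0.5 - miss_margin
    return bayes_probability >= 0.5 + miss_margin

# The four cases mentioned in the message above, with illustrative values:
print(should_train(True, 0.005))   # spam that scored like BAYES_00
print(should_train(False, 0.85))   # ham that scored like BAYES_80
print(should_train(True, 0.50))    # spam sitting at BAYES_50
print(should_train(True, 0.85))    # spam already at BAYES_80
```

Only the first two cases are selected for training; the other two are the corner cases the advice says to leave alone.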


sa-update randomly stops

2009-10-27 Thread McDonald, Dan
I run sa-update and sa-compile from a cron job at a regular interval.
At seemingly random times, it simply fails to run.  All I get in the
cron log is:

gpg: WARNING: unsafe permissions on homedir 
`/etc/mail/spamassassin/sa-update-keys'
[8641] info: generic: base extraction starting. this can take a while...
[8641] info: generic: extracting from rules of type body_0

I'm trapping errors in the bash script that calls sa-compile, but it
never tells me that there is an error.

It looks like there is a temp directory with stuff in it left over from
the attempt.

The logs have nothing useful, just the call in /var/log/cron/info
Nothing in warnings or errors
Oct 27 14:35:00 ca CROND[8627]: (root) CMD (/usr/local/sbin/sa-update-cron)


When I run it by hand, I never have encountered this problem.

What can I do to troubleshoot this more, or should I just wait to see if
it still happens when 3.3.0 is released?


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




Re: sa-update randomly stops

2009-10-27 Thread Adam Katz
McDonald, Dan wrote:
 I run sa-update and sa-compile from a cron job at a regular interval.
 
gpg: WARNING: unsafe permissions on homedir 
 `/etc/mail/spamassassin/sa-update-keys'
[8641] info: generic: base extraction starting. this can take a while...
[8641] info: generic: extracting from rules of type body_0
 
 It looks like there is a temp directory with stuff in it left over from
 the attempt.
 
 The logs have nothing useful, just the call in /var/log/cron/info
 Nothing in warnings or errors
 Oct 27 14:35:00 ca CROND[8627]: (root) CMD (/usr/local/sbin/sa-update-cron)
 
 When I run it by hand, I never have encountered this problem.

It appears you're running it as the wrong user via cron and the
correct user by hand.  Who owns that leftover stuff in the temp
directory?  Who owns /etc/mail/spamassassin/sa-update-keys?  They
should be the same, and the sa-update-keys directory should be
rwx------ for that user (and maybe also owned by that user's primary
group).
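A quick way to check both points (owner and mode) is sketched below in Python. The function name and the optional expected_uid parameter are my own; the directory path in the usage part is the one from the gpg warning quoted above, and the check simply reports any group or other permission bits.

```python
import os
import stat

def check_private_dir(path, expected_uid=None):
    """Return a list of problems found with a directory meant to be mode 0700."""
    problems = []
    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)
    if mode & 0o077:
        # Any group/other bit is what gpg calls "unsafe permissions".
        problems.append(f"{path}: mode {mode:04o} grants group/other access")
    if expected_uid is not None and info.st_uid != expected_uid:
        problems.append(f"{path}: owned by uid {info.st_uid}, expected {expected_uid}")
    return problems

if __name__ == "__main__":
    keys_dir = "/etc/mail/spamassassin/sa-update-keys"
    if os.path.isdir(keys_dir):
        # Compare against whoever is running this check (run it from cron too).
        for problem in check_private_dir(keys_dir, expected_uid=os.geteuid()):
            print(problem)
```

Running the same check once by hand and once from the cron job should show quickly whether the two runs really happen as the same user.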


How to reject spam where sender = receiver

2009-10-27 Thread rpc1

My SpamAssassin plugin doesn't check mail where the sender address and receiver
address are equal, like this:

Return-Path: o...@domen.com
X-Spam-Status: No, hits=0.0 required=3.2
tests=DNSBL_RELAYS.ORDB.ORG: 5.00,DNSBL_BL.SPAMCOP.NET:
5.00,DNSBL_SBL-XBL.SPAMHAUS.ORG: 5.00,
BAYES_99: 4.07,HELO_DYNAMIC_IPADDR2: 3.818,HTML_IMAGE_ONLY_32:
1.052,
HTML_MESSAGE: 0.001,MIME_HTML_ONLY: 0.001,NO_REAL_NAME: 0.961,
URIBL_AB_SURBL: 3.812,URIBL_JP_SURBL: 4.087,URIBL_OB_SURBL: 3.008,
URIBL_SBL: 1.639,URIBL_SC_SURBL: 4.498,URIBL_WS_SURBL: 2.14,
CUSTOM_RULE_FROM: ALLOW,TOTAL_SCORE: 44.087
X-Spam-Level: 
Received: from 75-148-3-221-WashingtonDC.hfc.comcastbusiness.net
([75.148.3.221])
by mail.tvtb.ru
for o...@domen.com;
Sun, 25 Oct 2009 07:53:00 +1000
To: oper...@tvtb.ru
Subject: A path leading to your well-being
From: o...@domen.com
MIME-Version: 1.0
Importance: High
Content-Type: text/html

How can I create a new rule that checks whether the To and From fields are equal?



Re: sa-learn spam and Bayes_50

2009-10-27 Thread Sam

John Hardin wrote:

On Tue, 27 Oct 2009, Sam wrote:


John Hardin wrote:

 On Tue, 27 Oct 2009, Sam wrote:

  And after learning with sa-learn, it still says BAYES_50, even though
  sa-learn reported that it had learned the message.


 Okay, basic Bayes troubleshooting questions:

 (1) Are you running sa-learn as the same user that SA itself is
 running as, so that you're training the Bayes database that SA is
 actually using to score messages?

 (2) Please run sa-learn --dump magic and send us the results.


1) For all users there is only one database, in /var/bayes. I've done
   some tests with su Debian-exim and the result is the same.

2) lenny:/home/samuel# sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  112532  0  non-token data: nspam
0.000  0  844  0  non-token data: nham
0.000  0  1935545  0  non-token data: ntokens


Okay, good. About the only comment I can make based on this is, you 
might want to learn a bunch of ham. You want the database to kinda 
reflect your actual raw spam/ham ratio, but yours is a little strongly 
skewed towards spammy tokens...



Thanks to everybody for your comments.
If I understand correctly, the few French spams I give to sa-learn are too 
few compared with the tons of English spam fed to sa-learn.


It could be interesting (though I don't think it exists) to have one Bayes 
database for each language, if that is indeed the problem in my case.


Thanks a lot.
Sam.




Re: How to reject spam where sender = receiver

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, rpc1 wrote:



My SpamAssassin plugin doesn't check mail where the sender address and receiver
address are equal, like this:

Return-Path: o...@domen.com
X-Spam-Status: No, hits=0.0 required=3.2
   tests=DNSBL_RELAYS.ORDB.ORG: 5.00,DNSBL_BL.SPAMCOP.NET:
5.00,DNSBL_SBL-XBL.SPAMHAUS.ORG: 5.00,
   BAYES_99: 4.07,HELO_DYNAMIC_IPADDR2: 3.818,HTML_IMAGE_ONLY_32:
1.052,
   HTML_MESSAGE: 0.001,MIME_HTML_ONLY: 0.001,NO_REAL_NAME: 0.961,
   URIBL_AB_SURBL: 3.812,URIBL_JP_SURBL: 4.087,URIBL_OB_SURBL: 3.008,
   URIBL_SBL: 1.639,URIBL_SC_SURBL: 4.498,URIBL_WS_SURBL: 2.14,
   CUSTOM_RULE_FROM: ALLOW,TOTAL_SCORE: 44.087
X-Spam-Level:
Received: from 75-148-3-221-WashingtonDC.hfc.comcastbusiness.net
([75.148.3.221])
   by mail.tvtb.ru
   for o...@domen.com;
   Sun, 25 Oct 2009 07:53:00 +1000
To: oper...@tvtb.ru
Subject: A path leading to your well-being
From: o...@domen.com
MIME-Version: 1.0
Importance: High
Content-Type: text/html

How can I create a new rule that checks whether the To and From fields are equal?


I would suggest that is not really what you want to do, as you'll rarely 
see that on spam that isn't addressed to your domain. What you probably 
want to do is reject mail that is claiming to be from your domain, but 
does not actually originate from your domain - in other words, mail where 
someone is forging your domain name on the sender address.


Is that a better description of what you want to do?

That has been covered several times, I am pretty sure within the last 
month. Please check the list archives for the past two months for a thread 
having a subject like "to = from". You'll find a discussion of setting up 
an SPF record for your domain and using whitelist_from_auth to enforce it, 
and another discussion (involving me) of using milter-regex to reject such 
forged sender addresses at SMTP time. Both methods work well; I would 
modestly say milter-regex works better because it bypasses SA and is thus 
a lighter solution overall.


<mutter>Maybe I should throw a rule like that into the sandbox and see how 
well it does...</mutter>




Re: How to reject spam where sender = receiver

2009-10-27 Thread Ralph Bornefeld-Ettmann

John Hardin wrote:

On Tue, 27 Oct 2009, rpc1 wrote:



My SpamAssassin plugin doesn't check mail where the sender address and receiver
address are equal, like this:

Return-Path: o...@domen.com
X-Spam-Status: No, hits=0.0 required=3.2
   tests=DNSBL_RELAYS.ORDB.ORG: 5.00,DNSBL_BL.SPAMCOP.NET:
5.00,DNSBL_SBL-XBL.SPAMHAUS.ORG: 5.00,
   BAYES_99: 4.07,HELO_DYNAMIC_IPADDR2: 3.818,HTML_IMAGE_ONLY_32:
1.052,
   HTML_MESSAGE: 0.001,MIME_HTML_ONLY: 0.001,NO_REAL_NAME: 0.961,
   URIBL_AB_SURBL: 3.812,URIBL_JP_SURBL: 4.087,URIBL_OB_SURBL: 3.008,
   URIBL_SBL: 1.639,URIBL_SC_SURBL: 4.498,URIBL_WS_SURBL: 2.14,
   CUSTOM_RULE_FROM: ALLOW,TOTAL_SCORE: 44.087
X-Spam-Level:
Received: from 75-148-3-221-WashingtonDC.hfc.comcastbusiness.net
([75.148.3.221])
   by mail.tvtb.ru
   for o...@domen.com;
   Sun, 25 Oct 2009 07:53:00 +1000
To: oper...@tvtb.ru
Subject: A path leading to your well-being
From: o...@domen.com
MIME-Version: 1.0
Importance: High
Content-Type: text/html

How can I create a new rule that checks whether the To and From 
fields are equal?


I would suggest that is not really what you want to do, as you'll rarely 
see that on spam that isn't addressed to your domain. What you probably 
want to do is reject mail that is claiming to be from your domain, but 
does not actually originate from your domain - in other words, mail 
where someone is forging your domain name on the sender address.


Is that a better description of what you want to do?

That has been covered several times, I am pretty sure within the last 
month. Please check the list archives for the past two months for a 
thread having a subject like to = from. You'll find a discussion of 
setting up an SPF record for your domain and using whitelist_from_auth 
to enforce it, and another discussion (involving me) of using 
milter-regex to reject such forged sender addresses at SMTP time. Both 
methods work well; I would modestly say milter-regex works better 
because it bypasses SA and is thus a lighter solution overall.


<mutter>Maybe I should throw a rule like that into the sandbox and see 
how well it does...</mutter>




If you do not like SPF, and you have no remote users who are allowed 
to send mail with your local domain, you can add a rule to header_checks.


E.g. for Postfix:

/etc/postfix/header_checks :

/^From:.*example\.com/ REJECT
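
Before deploying a pattern like this, it is worth checking what it actually matches. The following Python sketch applies the same regular expression to a few sample header lines. It assumes case-insensitive matching, which is the default for Postfix regexp tables; the sample headers and the helper name `would_reject` are made up for illustration.

```python
import re

# Same pattern as the header_checks line above, compiled case-insensitively
# (assumption: default Postfix regexp table behaviour).
FROM_LOCAL = re.compile(r"^From:.*example\.com", re.IGNORECASE)


def would_reject(header_line: str) -> bool:
    """Return True if the header line matches the REJECT pattern."""
    return FROM_LOCAL.search(header_line) is not None


samples = [
    "From: oper@example.com",                      # intended match
    "From: Bob <bob@notexample.com>",              # lookalike domain, also matches
    'From: "Fan of example.com" <bob@other.org>',  # display name, also matches
    "From: alice@example.org",                     # no match
    "To: oper@example.com",                        # different header, no match
]
for line in samples:
    print(would_reject(line), line)
```

Because the pattern is unanchored on the right, it also hits lookalike domains and display names that merely mention the domain, so you may want to tighten it for your own setup.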


Cheers
Ralph



Re: [SA] How to reject spam where sender = receiver

2009-10-27 Thread Adam Katz
John Hardin wrote:
 <mutter>Maybe I should throw a rule like that into the sandbox and see
 how well it does...</mutter>

I had a dialog with Karsten about this a few years ago ... the regex
is nontrivial and dangerous, so the recommended method is a plugin.
I've actually written such a thing already, though slightly different
in that it ignores the domain.  Easy to tailor one way or another.
It's attached.

Result:  Mixed bag.  Might be nice to see in the masscheck.

FROM_EQUALS_TO:  1.313% of spam, 0.657% of ham
FROM_NOT_REPLY:  5.840% of spam, 2.868% of ham

Spam and ham are non-authoritative and include FPs and FNs.  I also
greylist, reducing all spam numbers.
# SenderChecks v1.0
# (C) 2009 By Adam Katz antispamATkhopiscom http://khopesh.com/Anti-spam
# Apache License 2.0

=pod


# Example usage:

loadplugin Mail::SpamAssassin::Plugin::SenderChecks  sender-checks.pm
header __FROM_EQ_TO	eval:check_for_from_equals_to()
meta FROM_EQUALS_TO	!(ALL_TRUSTED || DKIM_VERIFIED) && __FROM_EQ_TO
describe FROM_EQUALS_TO	From: and To: have the same username
score FROM_EQUALS_TO	0.1

header __FROM_V_REPLY	eval:check_for_from_v_replyto_dom()
header __PREC_BULK	Precedence =~ /bulk|list/
meta FROM_NOT_REPLY !(__PREC_BULK||ALL_TRUSTED||DKIM_VERIFIED) && __FROM_V_REPLY
describe FROM_NOT_REPLY	From: and Reply-To: have different domains
score FROM_NOT_REPLY	0.1


=cut

package Mail::SpamAssassin::Plugin::SenderChecks;

use strict;
use warnings;

use Mail::SpamAssassin;
use Mail::SpamAssassin::Plugin;
our @ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
  my ($class, $mailsa) = @_;
  $class = ref($class) || $class;
  my $self = $class->SUPER::new( $mailsa );
  bless ($self, $class);
  $self->register_eval_rule ( 'check_for_from_equals_to' );
  $self->register_eval_rule ( 'check_for_from_v_replyto_dom' );

  return $self;
}

# Adapted from http://wiki.apache.org/spamassassin/FromNotReplyTo
# Fires when From: and Reply-To: are both present and their domains differ.
sub check_for_from_v_replyto_dom {
  my ($self, $msg) = @_;

  # Strip everything up to the last "@" to keep only the domain.
  my $from = $msg->get( 'From:addr' );
  $from =~ s/.*@//;
  my $replyTo = $msg->get( 'Reply-To:addr' );
  $replyTo =~ s/.*@//;

  Mail::SpamAssassin::Plugin::dbg(
    "SenderChecks: matching from/replyto: $from/$replyTo" );

  if ( $from ne '' && $replyTo ne '' && $from ne $replyTo ) {
    return 1;
  }

  return 0;
}

# Spammers often forge the sender address to use the same username as
# the victim, while most legitimate e-mail does not.
sub check_for_from_equals_to {
  my ($self, $msg) = @_;

  # Strip everything from the first "@" onward to keep only the username.
  my $from = $msg->get( 'From:addr' );
  $from =~ s/@.*//;
  my $to = $msg->get( 'To:addr' );
  $to =~ s/@.*//;

  Mail::SpamAssassin::Plugin::dbg("SenderChecks: matching from/to: $from/$to");

  if ( $from ne '' && $from eq $to ) {
    return 1;
  }

  return 0;
}

1;
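
For readers who want to try the two comparisons outside SpamAssassin, here is a rough standalone Python sketch of the same logic using only the standard library. It is not part of the attached plugin; the function names and the sample message (with `example.*` addresses) are made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def _addr(msg, header):
    """Return the bare address from a header, or '' if it is absent."""
    return parseaddr(msg.get(header, ""))[1]


def from_equals_to(msg) -> bool:
    """True when From: and To: share the same username (local part)."""
    from_user = _addr(msg, "From").partition("@")[0]
    to_user = _addr(msg, "To").partition("@")[0]
    return from_user != "" and from_user == to_user


def from_v_replyto_dom(msg) -> bool:
    """True when From: and Reply-To: are both set and their domains differ."""
    from_dom = _addr(msg, "From").rpartition("@")[2]
    reply_dom = _addr(msg, "Reply-To").rpartition("@")[2]
    return from_dom != "" and reply_dom != "" and from_dom != reply_dom


sample = message_from_string(
    "From: oper@example.com\n"
    "To: oper@example.net\n"
    "Reply-To: sales@example.org\n"
    "\n"
    "body\n"
)
print(from_equals_to(sample), from_v_replyto_dom(sample))  # prints: True True
```

As in the plugin, the username comparison ignores the domain, so `oper@example.com` and `oper@example.net` count as equal.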



Re: [SA] How to reject spam where sender = receiver

2009-10-27 Thread John Hardin

On Tue, 27 Oct 2009, Adam Katz wrote:


John Hardin wrote:
<mutter>Maybe I should throw a rule like that into the sandbox and see 
how well it does...</mutter>


I had a dialog with Karsten about this a few years ago ... the regex
is nontrivial and dangerous, so the recommended method is a plugin.
I've actually written such a thing already, though slightly different
in that it ignores the domain.  Easy to tailor one way or another.
It's attached.

Result:  Mixed bag.  Might be nice to see in the masscheck.


I just threw a basic (by no means thorough) rule into my sandbox. We'll 
see how it does, I can kill it easily enough.



FROM_EQUALS_TO:  1.313% of spam, 0.657% of ham


That's a fairly low S/O.
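
For reference, a small Python sketch of the arithmetic behind that remark, assuming S/O is computed as the spam hit rate divided by the sum of the spam and ham hit rates. The function name `s_over_o` is made up for illustration.

```python
def s_over_o(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that are spam, given the hit rates in
    percent of the spam corpus and of the ham corpus."""
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule never hit; S/O is undefined")
    return spam_pct / total


# Figures quoted earlier in the thread.
print(round(s_over_o(1.313, 0.657), 2))  # FROM_EQUALS_TO
print(round(s_over_o(5.840, 2.868), 2))  # FROM_NOT_REPLY
```

Both rules come out at roughly 0.67, i.e. about one ham hit for every two spam hits, which is why a score as low as 0.1 was suggested.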

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...the Fates notice those who buy chainsaws...
  -- www.darwinawards.com
---
 4 days until Halloween


Re: sa-learn spam and Bayes_50

2009-10-27 Thread Alex
Hi,

 Instead, be more selective about the spam you train.  Only train
 messages that completely missed with respect to Bayes (e.g. a spam
 that got BAYES_00 or a ham that got BAYES_80) rather than corner cases
 (e.g. a spam with BAYES_50 that got marked as spam, a spam that got
 marked as BAYES_80).  Try to train as much inbound ham as possible
 (but again, not internal messages that never hit the live internet).
 If you use autolearn, bump the bayes_auto_learn_threshold_spam up some.

This sounds like really good advice. I've recently enabled autolearn
and I'm a bit concerned that my database is skewed. I found quite a
few spams in the quarantine with a bayes score less than 50. However,
I'm not sure that wasn't the case before I started the autolearn.

In either case, is there a way to exclude mails with USER_IN_WHITELIST
altogether? I have my ham level set at -0.3, but the USER_IN_WHITELIST
(and there are quite a few) adds -100.0, automatically making it ham.
I'm concerned that a spoofed mail passing through the whitelist could
skew the db without my knowledge and without my ability to control it.

Thanks,
Alex


Re: Low Score - {Brazillian Host} Lottery Spam

2009-10-27 Thread Benny Pedersen

On Tue 27 Oct 2009 18:27:24 CET, John Hardin wrote:

Contact me offlist if you want to install the sandbox rules for  
them, I'll give you instructions.


undisclosed recipient with a freemail body hit

if i won why would i not be in the to:

:)

--
xpoint



Re: How to reject spam where sender = receiver

2009-10-27 Thread Benny Pedersen

On Wed 28 Oct 2009 00:36:10 CET, rpc1 wrote:


My SpamAssassin plugin doesn't check mail where the sender address and
receiver address are equal, like this:


http://www.nabble.com/postfwd-stop-equal-sender-recipient-spams-td21164908.html

or set up spf for your domain and test with spf in your mta

i do the latter now, but if you don't want to use spf, use the postfwd rule

--
xpoint