Re: Whitelists in 3.3.0

2010-01-29 Thread Bowie Bailey
McDonald, Dan wrote:

 Please excuse the top-post. This truly brain-damaged mua does not
 allow me to edit the body.

 Easiest way to disable whitelists is:

 grep -E score\ RCVD.+- \
   /var/lib/spamassassin/updates_spamassassin_org/50_scores.cf \
   | cut -d\  -f1-3 > /etc/mail/spamassassin/no-whitelists.cf


Does 3.3.0 get rid of the version number in that path, or did you just
forget to include it?  I haven't gotten around to upgrading yet.

Nice command line magic there!  It took me a bit to figure out how it
worked.  I never would have thought of doing it that way.

-- 
Bowie


Re: Whitelists in 3.3.0

2010-01-29 Thread Daniel J McDonald
On Fri, 2010-01-29 at 09:18 -0500, Bowie Bailey wrote:
 McDonald, Dan wrote:
 
  Please excuse the top-post. This truly brain-damaged mua does not
  allow me to edit the body.
 
  Easiest way to disable whitelists is:
 
  grep -E score\ RCVD.+- \
    /var/lib/spamassassin/updates_spamassassin_org/50_scores.cf \
    | cut -d\  -f1-3 > /etc/mail/spamassassin/no-whitelists.cf
 
 
 Does 3.3.0 get rid of the version number in that path, or did you just
 forget to include it? 

I forgot...  was transcribing from screen to iPhone.  So the path does
need to be updated.

  I haven't gotten around to upgrading yet.
 
 Nice command line magic there!  It took me a bit to figure out how it
 worked.  

It helps that whitelists are disabled in ruleset #1, so we can count on
a zero in that position.

As a one-liner, it is something that can be tacked on the end of a
script that calls sa-update (or in the middle, if you follow up your
sa-update with an sa-compile). Just watch out for the two spaces in the
cut command `cut -d\  -f1-3`
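
For anyone puzzling over how it works, here is a minimal sketch of what the pipeline does, run against made-up sample lines rather than the real 50_scores.cf. The function name make_whitelist_overrides and the RCVD_IN_EXAMPLE_* rule names are invented for illustration; only the grep pattern and the cut invocation come from this thread.

```shell
#!/bin/sh
# Sketch only: the function name and the sample data are hypothetical.

# Read "score ..." lines on stdin and print "score RULE 0" for every
# RCVD rule whose line contains a minus sign (i.e. a negative score).
make_whitelist_overrides() {
    grep -E 'score RCVD.+-' | cut -d' ' -f1-3
}

# Sample input in the four-score-set layout (values are made up,
# except the HABEAS_CHECKED line, which is quoted elsewhere in the thread).
sample='score RCVD_IN_EXAMPLE_WL 0 -1.0 0 -1.0
score RCVD_IN_EXAMPLE_BL 0 2.5 0 2.5
score HABEAS_CHECKED 0 -0.2 0 -0.2'

# Only the negative-scoring RCVD rule survives, cut down to three fields.
printf '%s\n' "$sample" | make_whitelist_overrides
```

The third field is the score for the first score set, which is why the trick depends on that position being zero. Two caveats: the pattern only matches rule names beginning with RCVD, so a line like the HABEAS_CHECKED one would not be picked up, and cut assumes exactly one space between fields.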

I never would have thought of doing it that way.

cut is one of my favorite tools.

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com


Re: Whitelists in 3.3.0

2010-01-29 Thread Bowie Bailey
Daniel J McDonald wrote:
 On Fri, 2010-01-29 at 09:18 -0500, Bowie Bailey wrote:
   
 McDonald, Dan wrote:
 
 Please excuse the top-post. This truly brain-damaged mua does not
 allow me to edit the body.

 Easiest way to disable whitelists is:

 grep -E score\ RCVD.+- \
   /var/lib/spamassassin/updates_spamassassin_org/50_scores.cf \
   | cut -d\  -f1-3 > /etc/mail/spamassassin/no-whitelists.cf
   
 Nice command line magic there!  It took me a bit to figure out how it
 worked.  
 

 It helps that whitelists are disabled in ruleset #1, so we can count on
 a zero in that position.

 As a one-liner, it is something that can be tacked on the end of a
 script that calls sa-update (or in the middle, if you follow up your
 sa-update with an sa-compile). Just watch out for the two spaces in the
 cut command `cut -d\  -f1-3`

   
 I never would have thought of doing it that way.
 

 cut is one of my favorite tools.
   

It is more the searching for a negative score to identify the whitelists
that I thought was interesting.  I probably would have been trying to
figure out a text pattern in the rule names.  It did take me a bit to
figure out where the zero at the end was coming from!  :)

-- 
Bowie


Re: Whitelists in 3.3.0

2010-01-29 Thread LuKreme
 McDonald, Dan wrote:
 grep -E score\ RCVD.+- \
   /var/lib/spamassassin/updates_spamassassin_org/50_scores.cf \
   | cut -d\  -f1-3 > /etc/mail/spamassassin/no-whitelists.cf

Nice. Now I just need to decide if I wait for ports to update or just manually 
install 3.3


-- 
You try to shape the world to what you want the world to be.
Carving your name a thousand times won't bring you back to me.
Oh no, no I might as well go and tell it to the trees. Go and
tell it to the trees, yeah.




RE: Whitelists in 3.3.0

2010-01-28 Thread McDonald, Dan
Please excuse the top-post. This truly brain-damaged mua does not allow me to 
edit the body.

Easiest way to disable whitelists is:

grep -E score\ RCVD.+- \
  /var/lib/spamassassin/updates_spamassassin_org/50_scores.cf \
  | cut -d\  -f1-3 > /etc/mail/spamassassin/no-whitelists.cf

 




-----Original Message-----
From:    LuKreme [mailto:krem...@kreme.com]
Sent:    Thursday, January 28, 2010 08:33 PM Central Standard Time
To:      users@spamassassin.apache.org
Subject: Whitelists in 3.3.0

What whitelists are enabled in SA 3.3.0 and what's the easiest way to disable 
them all?

-- 
YOU [humans] NEED TO BELIEVE IN THINGS THAT AREN'T TRUE. HOW ELSE CAN THEY 
BECOME? --Hogfather



Re: Whitelists, not directly useful to spamassassin...

2009-12-21 Thread Matus UHLAR - fantomas
 Warren Togami wrote:
 While whitelists are not directly effective (statistically, when  
 averaged across a large corpus), whitelists are powerful tools in  
 indirect ways including:

 * Pushing the score beyond the auto-learn threshold for things like  
 Bayes to function without manual intervention.

On 17.12.09 11:27, Jason Bertoch wrote:
 This does not sound like a positive thing to me.  E-mail from any sender  
 that is malformed enough to skip auto-learning should not be forced into  
 Bayes as ham simply because some 3rd party promises, for their own  
 monetary benefit, that the sender is a nice guy.  Why should any sender  
 that I have not intentionally added to my local whitelist get a break?

If you _want_ the mail and whitelist the sender, I think its characteristics
should be pushed into Bayes.
If you don't want the mail, then autolearning it as spam is the least of your
problems.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Linux IS user friendly, it's just selective who its friends are...


Re: Whitelists in SA

2009-12-20 Thread Charles Gregory

On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:

More unfortunately, privacy concerns prevent me from building a useful
corpus of ham. Sigh
But otherwise such a good idea

Can you not trust yourself to use your own ham?  You don't need to
provide us with your mail.  You can scan your own mail locally on your
own machine(s).


I run an ISP. The corpus I would so love to build is the hundreds of 
messages per day that all our clients receive. It's *their* privacy that
is the concern.

Do you think that my own private collection of saved mail (perhaps 1100 
ham) would really be of benefit? I'd have to start saving my spam as 
well


And it would always be skewed by the fact that I SMTP reject anything 
caught by Zen.


- C


Re: Whitelists in SA

2009-12-20 Thread Warren Togami

On 12/20/2009 09:20 AM, Charles Gregory wrote:

On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:

More unfortunately, privacy concerns prevent me from building a useful
corpus of ham. Sigh
But otherwise such a good idea

Can you not trust yourself to use your own ham? You don't need to
provide us with your mail. You can scan your own mail locally on your
own machine(s).


I run an ISP. The corpus I would so love to build is the hundreds of
messages per day that all our clients receive. It's *their* privacy that
is the concern.


Right, they would need to opt-in and the manual sorting requirements are 
a bit too difficult and time consuming for all but the most dedicated to 
this cause.




Do you think that my own private collection of saved mail (perhaps 1100
ham) would really be of benefit? I'd have to start saving my spam as
well


A Ham-only corpus is still useful, as long as it contains mail from a 
variety of sources.  (Mailing lists are not very useful.)




And it would always be skewed by the fact that I SMTP reject anything
caught by Zen.



Not a problem.

Warren


Re: Whitelists in SA

2009-12-20 Thread jdow

From: Charles Gregory cgreg...@hwcn.org
Sent: Sunday, 2009/December/20 06:20



On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:

More unfortunately, privacy concerns prevent me from building a useful
corpus of ham. Sigh
But otherwise such a good idea

Can you not trust yourself to use your own ham?  You don't need to
provide us with your mail.  You can scan your own mail locally on your
own machine(s).


I run an ISP. The corpus I would so love to build is the hundreds of 
messages per day that all our clients receive. It's *their* privacy that
is the concern.

Do you think that my own private collection of saved mail (perhaps 1100 
ham) would really be of benefit? I'd have to start saving my spam as 
well


And it would always be skewed by the fact that I SMTP reject anything 
caught by Zen.


I'm just a touch naive here; but, it seems to me it should be possible,
somehow, to build running spamd daemons, one with the regular rules
and one with the mass check rules. The second one is fed the email in
parallel with the first but deletes the mail once the scores are logged.

The downside is that this is not confirmed ham and confirmed spam.
It is a way to safely test new rule sets, though.

I must admit that the vast majority of email I receive is not hand
checked for ham/spam. I simply read headers on several lists to see
what the current buzz is. I read threads that look interesting and
toss the rest. So it'd be hard to mass check validly with that as a
corpus. (Besides, I suspect animal husbandry companies would hardly
be interested in passing things that look like typical LKML mailings,
would they?)

I wonder how much companies would pay for a part time SpamAssassin
honcho who can be trusted (bonded?) and can write SARE-ish rules
tailored to the company's email. Is there a job opportunity for
somebody here? (And, yes, I do suspect the burnout time would be
rather short.)

{^_^}


Re: Whitelists in SA

2009-12-20 Thread John Hardin

On Sun, 20 Dec 2009, jdow wrote:


I'm just a touch naive here; but, it seems to me it should be possible,
somehow, to build running spamd daemons, one with the regular rules
and one with the mass check rules.


There's nothing special about masscheck rules. Masscheck is just running 
the current ruleset against hand-classified corpora (ideally _large_ 
hand-classified corpora) to see what hits.


The second one is fed the email in parallel with the first but deletes 
the mail once the scores are logged.


This can easily be done by analysis of spamd logs. It logs all the rules 
hit on every message scanned.
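
As a rough sketch of that kind of log analysis, the following tallies how often each rule appears in spamd "result:" lines. It assumes the usual layout "spamd: result: Y SCORE - RULE1,RULE2 scantime=..."; check your own logs before relying on it. The function name count_rule_hits, the sample lines, and EXAMPLE_RULE_A are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes the rule list is the 4th field after "result:".

# Read spamd log lines on stdin; print "COUNT RULE", most frequent first.
count_rule_hits() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "result:") {
                    n = split($(i + 4), rules, ",")
                    for (j = 1; j <= n; j++) {
                        # Skip key=value fields seen when no rules hit.
                        if (rules[j] !~ /=/) hits[rules[j]]++
                    }
                    break
                }
            }
        }
        END { for (r in hits) print hits[r], r }
    ' | sort -k1,1nr -k2,2
}

# Hypothetical sample log lines; the last one has no "result:" token.
sample_log='Dec 20 10:00:01 mail spamd[1234]: spamd: result: Y 12 - BAYES_99,EXAMPLE_RULE_A scantime=1.2,size=2048
Dec 20 10:00:05 mail spamd[1234]: spamd: result: . 0 - BAYES_00,EXAMPLE_RULE_A scantime=0.8,size=1024
Dec 20 10:00:09 mail spamd[1234]: spamd: clean message (0.0/5.0) for alice:1000 in 0.8 seconds, 1024 bytes.'

printf '%s\n' "$sample_log" | count_rule_hits
```

This tells you what SA decided and which rules fired, but nothing about whether the decision was correct.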



The downside is that this is not confirmed ham and confirmed spam.


That unfortunately is the critical part. You can easily glean whether or 
not SA thinks a message is spammy and what rules led to that 
classification, the tough part is confirming whether or not it's _right_.



I wonder how much companies would pay for a part time SpamAssassin
honcho who can be trusted (bonded?) and can write SARE-ish rules
tailored to the company's email. Is there a job opportunity for
somebody here?


I'd do that.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Bother, said Pooh as he struggled with /etc/sendmail.cf, it never
  does quite what I want. I wish Christopher Robin was here.
   -- Peter da Silva in a.s.r
---
 5 days until Christmas


Re: [sa] Re: Whitelists in SA

2009-12-20 Thread Charles Gregory

On Sun, 20 Dec 2009, jdow wrote:

The downside is that this is not confirmed ham and confirmed spam.


(nod) Exactly. And that is what is needed to do a masscheck...

I wonder how much companies would pay for a part time SpamAssassin 
honcho who can be trusted (bonded?) and can write SARE-ish rules 
tailored to the company's email. Is there a job opportunity for somebody 
here? (And, yes, I do suspect the burnout time would be rather short.)


(smile) I've got my own custom rule file format (plus a script to convert 
to standard SA rules format). This reduces the effort to add a new rule 
pretty much down to a cut-n-paste operation. Must admit there are some 
days when I do feel a bit burned out, but generally I am gratified to see 
my new rules trigger on the remainder of a spam flood :)


As for trust, I never need to see the ham, just the spam, which has no 
privacy issues (smile).


- C


Re: [sa] Re: Whitelists in SA

2009-12-19 Thread Charles Gregory

On Fri, 18 Dec 2009, Warren Togami wrote:

Why wait, when you can do relatively simple things to help make it happen?
http://wiki.apache.org/spamassassin/NightlyMassCheck
We can more frequently update rules if more people participate in the 
nightly masschecks.  The current documentation is a bit of a confusing mess 
unfortunately.


More unfortunately, privacy concerns prevent me from building a useful 
corpus of ham. Sigh


But otherwise such a good idea

- C




Re: [sa] Re: Whitelists in SA

2009-12-19 Thread Daryl C. W. O'Shea
On 19/12/2009 5:51 PM, Charles Gregory wrote:
 On Fri, 18 Dec 2009, Warren Togami wrote:
 Why wait, when you can do relatively simple things to help make it happen?
 http://wiki.apache.org/spamassassin/NightlyMassCheck
 We can more frequently update rules if more people participate in the
 nightly masschecks.  The current documentation is a bit of a confusing
 mess unfortunately.
 
 More unfortunately, privacy concerns prevent me from building a useful
 corpus of ham. Sigh
 
 But otherwise such a good idea

Can you not trust yourself to use your own ham?  You don't need to
provide us with your mail.  You can scan your own mail locally on your
own machine(s).

Daryl




Re: Whitelists in SA

2009-12-18 Thread Charles Gregory

On Thu, 17 Dec 2009, jdow wrote:

It is a good thing this issue was raised. It led to appropriate mass
check runs. I expect that will lead to saner scoring within the SA
framework. If not and it bites me, THEN I'll raise the issue again.
Does that seem fair?


50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2

Still no changes through the sa-update channel.
Is there a time delay in the masscheck results being applied?

- Charles


Re: Whitelists in SA

2009-12-18 Thread John Hardin

On Fri, 18 Dec 2009, Charles Gregory wrote:


On Thu, 17 Dec 2009, jdow wrote:

 It is a good thing this issue was raised. It led to appropriate mass
 check runs. I expect that will lead to saner scoring within the SA
 framework. If not and it bites me, THEN I'll raise the issue again.
 Does that seem fair?


50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2

Still no changes through the sa-update channel.


There won't be until after 3.3.0 ships. Then changes to 3.2.x (including a 
possible 3.2.6 release) will be considered.


As far as I know rule promotion and rescoring are not automatic for 3.2.x, 
it's still a manual process. All of the focus right now is on getting 
3.3.0 out.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Bother, said Pooh as he struggled with /etc/sendmail.cf, it never
  does quite what I want. I wish Christopher Robin was here.
   -- Peter da Silva in a.s.r
---
 7 days until Christmas


Re: Whitelists in SA

2009-12-18 Thread LuKreme

On Dec 18, 2009, at 7:56, Charles Gregory cgreg...@hwcn.org wrote:

Still no changes through the sa-update channel.
Is there a time delay in the masscheck results being applied?


It's already been stated that no changes to 3.2.5 will be made until 3.3 is  
done, hasn't it?




Re: Whitelists in SA

2009-12-18 Thread jdow

From: Charles Gregory cgreg...@hwcn.org
Sent: Friday, 2009/December/18 06:56



On Thu, 17 Dec 2009, jdow wrote:

It is a good thing this issue was raised. It led to appropriate mass
check runs. I expect that will lead to saner scoring within the SA
framework. If not and it bites me, THEN I'll raise the issue again.
Does that seem fair?


50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2

Still no changes through the sa-update channel.
Is there a time delay in the masscheck results being applied?

- Charles


Yes, there is, Mr. Gregory. It exists between your monitor and your
keyboard.

{^_^}


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread Charles Gregory

On Fri, 18 Dec 2009, LuKreme wrote:
It's already been stated that no changes to 3.2.5 will be made until 3.3 is 
done, hasn't it?


Well, at this point, I respectfully bow, and take a step back, so as not 
to sound too demanding of our great volunteers (smile), but I believe 
in another of my posts I put forward the idea that design, testing and 
implementation of rules should be a bit more 'frequent', drawing upon 
the model of ClamAV, with signatures being frequently released, even 
while the next major 'engine' update is in the works.


I recognize, from the existence of such sites as 'rules du jour' that it 
has long been a practice for SA to release 'core' rule updates very 
infrequently. But with respect, I question whether that is still a good 
practice, particularly when an 'issue' raises concern over a particular 
set of scores, and it would *appear* that these updates require relatively 
little effort.


So, to put it bluntly, I don't see how a couple of rules changes are 
worthy of being 'held back' by the entire push to SA 3.3. I would 
think that a few quick adjustments, and presumably a 'masscheck' would 
suffice, and new/revised rules could be released at least on a monthly 
basis without any serious concern for compromising the overall score 
balance that is the critical goal of SA updates?


Or am I grossly mis-estimating the work-load? :)

- C


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread John Hardin

On Fri, 18 Dec 2009, Charles Gregory wrote:

I recognize, from the existence of such sites as 'rules du jour' that it 
has long been a practice for SA to release 'core' rule updates very 
infrequently. But with respect, I question whether that is still a good 
practice, particularly when an 'issue' raises concern over a particular 
set of scores, and it would *appear* that these updates require 
relatively little effort.


We hope to get rule scoring and publication much more automated - i.e., if 
a rule in the sandbox works well based on the automated masschecks, it 
would be automatically scored and published via sa-update.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Bother, said Pooh as he struggled with /etc/sendmail.cf, it never
  does quite what I want. I wish Christopher Robin was here.
   -- Peter da Silva in a.s.r
---
 7 days until Christmas


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread Charles Gregory

On Fri, 18 Dec 2009, jdow wrote:

 On Thu, 17 Dec 2009, jdow wrote:
 Still no changes through the sa-update channel.
 Is there a time delay in the masscheck results being applied?

Yes, there is, Mr. Gregory. It exists between your monitor and your
keyboard.


There is a one inch gap between those two.

Perhaps you meant CHAIR and keyboard? ;)

- C


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread Charles Gregory

On Fri, 18 Dec 2009, John Hardin wrote:
We hope to get rule scoring and publication much more automated - i.e., 
if a rule in the sandbox works well based on the automated masschecks, 
it would be automatically scored and published via sa-update.


Music to my ears. I will wait (semi-)patiently. Thanks.

- C


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread jdow

From: Charles Gregory cgreg...@hwcn.org
Sent: Friday, 2009/December/18 13:49



On Fri, 18 Dec 2009, jdow wrote:

 On Thu, 17 Dec 2009, jdow wrote:
 Still no changes through the sa-update channel.
 Is there a time delay in the masscheck results being applied?

Yes, there is, Mr. Gregory. It exists between your monitor and your
keyboard.


There is a one inch gap between those two.

Perhaps you meant CHAIR and keyboard? ;)


I should have guessed you've managed to short circuit the path
through your brain.

{O,o}   -- Grinning, ducking, and running REAL fast that way

(Thanks for the straight line. {^_-})


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread Warren Togami

On 12/18/2009 04:56 PM, Charles Gregory wrote:

On Fri, 18 Dec 2009, John Hardin wrote:

We hope to get rule scoring and publication much more automated -
i.e., if a rule in the sandbox works well based on the automated
masschecks, it would be automatically scored and published via sa-update.


Music to my ears. I will wait (semi-)patiently. Thanks.

- C


Why wait, when you can do relatively simple things to help make it happen?

http://wiki.apache.org/spamassassin/NightlyMassCheck
We can more frequently update rules if more people participate in the 
nightly masschecks.  The current documentation is a bit of a confusing 
mess unfortunately.


Warren


Re: [sa] Re: Whitelists in SA

2009-12-18 Thread Daryl C. W. O'Shea
On 18/12/2009 5:13 PM, Warren Togami wrote:
 On 12/18/2009 04:56 PM, Charles Gregory wrote:
 On Fri, 18 Dec 2009, John Hardin wrote:
 We hope to get rule scoring and publication much more automated -
 i.e., if a rule in the sandbox works well based on the automated
 masschecks, it would be automatically scored and published via
 sa-update.

 Music to my ears. I will wait (semi-)patiently. Thanks.

 - C
 
 Why wait, when you can do relatively simple things to help make it happen?
 
 http://wiki.apache.org/spamassassin/NightlyMassCheck
 We can more frequently update rules if more people participate in the
 nightly masschecks.  The current documentation is a bit of a confusing
 mess unfortunately.

Exactly!  We have code to do this now.  But I'm positive that we don't
have a large and diverse enough ham corpus (on a daily basis, not the
big turn out for the legacy re-score mass-checks) to trust it.

Contributors are always welcome!

Daryl



Re: Whitelists, not directly useful to spamassassin...

2009-12-17 Thread Per Jessen
Warren Togami wrote:

 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247#c49
 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247#c51
 It turns out that the ReturnPath and DNSWL whitelists have a
 statistically insignificant impact on spamassassin's ability to
 determine ham vs. spam.  Meanwhile, both whitelists have high levels
 of accuracy.
 
 How can both of these statements be true?  I suspect this is because
 the scores are balanced by the rescoring algorithm to be safe in the
 majority case where no whitelist rule has triggered.  Thus whitelists
 are not needed or relied upon to prevent false positive
 classification.

I concur, that is what my analysis of HABEAS hits over the last four
months showed too. 


/Per Jessen, Zürich



Re: Whitelists, not directly useful to spamassassin...

2009-12-17 Thread Jason Bertoch

Warren Togami wrote:



While whitelists are not directly effective (statistically, when 
averaged across a large corpus), whitelists are powerful tools in 
indirect ways including:


* Pushing the score beyond the auto-learn threshold for things like 
Bayes to function without manual intervention.


This does not sound like a positive thing to me.  E-mail from any sender 
that is malformed enough to skip auto-learning should not be forced into 
Bayes as ham simply because some 3rd party promises, for their own 
monetary benefit, that the sender is a nice guy.  Why should any sender 
that I have not intentionally added to my local whitelist get a break?


I've had enough problems with DNSWL, HABEAS, and JMF that they have all 
been disabled here.  Unfortunately, that also means I have no recent 
data to add to the debate.  Although I believe that whitelists should be 
included in the default install for those that want them, I also believe 
they should be disabled by default so that an admin must knowingly 
enable them after reading the manual and considering the consequences.


The argument has also been made that whitelists should be included 
simply because blacklists are.  I think that argument is flawed. 
Blacklists are part of the spam fighting community while whitelists are 
part of the bulk delivery community.  Their goals and motives are 
completely different.  For one, blacklists will normally have evidence 
of abuse to support their listing.  Whitelists only have policies and 
promises.  Second, the scoring of whitelists is currently favored over 
blacklists, and will continue to be at the proposed settings for 3.3.0. 
 Why can a whitelist override the score of a blacklist when it is the 
blacklist that has evidence of abuse?



After reading up on Bug6247, I found that ReturnPath included 
interesting stats on their lists:


Certified
Active: 4407
Suspended: 1300
Total: 5707

Safe
Active: 6561
Suspended: 283
Total: 6844


The Certified list is supposedly difficult to get on so I'm not sure how 
to interpret these results.  Is 1/5 of the list suspended because of due 
diligence on the part of ReturnPath?  If so, how did they get certified 
in the first place?
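
To put numbers on that "1/5", here is a small sketch that works out the suspended share of each list from the figures above. The helper name suspended_pct is made up; the inputs are the ReturnPath counts as quoted.

```shell
#!/bin/sh
# Sketch only: simple percentage arithmetic on the quoted list sizes.

# $1 = suspended entries, $2 = total entries; prints percent to one decimal.
suspended_pct() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * s / t }'
}

suspended_pct 1300 5707   # Certified list
suspended_pct 283 6844    # Safe list
```

That works out to roughly 22.8% of Certified entries suspended against about 4.1% of Safe entries.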


If whitelists are to be enabled by default, I believe their score should 
be moved considerably more toward zero.


/Jason


Re: Whitelists, not directly useful to spamassassin...

2009-12-17 Thread Charles Gregory


Thank you, Warren. That (finally) gives some real perspective to this 
mess, and gets some of the 'real' questions answered.


- C

On Wed, 16 Dec 2009, Warren Togami wrote:
I made a discovery today that surprised even myself.  Using the rescore 
masscheck and weekly masscheck logs while working on Bug #6247 I found some 
interesting details that throw a wrench into this lively debate.


https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247#c49
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247#c51
It turns out that the ReturnPath and DNSWL whitelists have a statistically 
insignificant impact on spamassassin's ability to determine ham vs. spam. 
Meanwhile, both whitelists have high levels of accuracy.


How can both of these statements be true?  I suspect this is because the 
scores are balanced by the rescoring algorithm to be safe in the majority 
case where no whitelist rule has triggered.  Thus whitelists are not needed 
or relied upon to prevent false positive classification.


While whitelists are not directly effective (statistically, when averaged 
across a large corpus), whitelists are powerful tools in indirect ways 
including:


* Pushing the score beyond the auto-learn threshold for things like Bayes to 
function without manual intervention.
* The albeit controversial method where some automated spam trap blacklists 
use whitelists to help determine if they really should list an IP address.


https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251
spamassassin-3.3.0 has reduced the score impact of these whitelists to more 
modest levels, maxing out at -5 points.  -5 is PLENTY for spamassassin, as 5 
points is the level for which the scoreset is tuned. Mail from a whitelisted host 
would need greater than 10 points to be blocked, which is statistically very 
rare for ham.  I believe that we are striking the right balance with these 
modest whitelist scores in this release.
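
The arithmetic behind that claim can be sketched as follows. The helper name raw_score_needed is invented for illustration; the 5-point threshold and the -5 cap come from the paragraph above, and the -8.0 figure is the 3.2.x HABEAS_ACCREDITED_COI score quoted elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: how many points the other rules must contribute before a
# whitelisted message reaches the spam threshold.

# $1 = spam threshold, $2 = whitelist score (a negative number)
raw_score_needed() {
    awk -v t="$1" -v w="$2" 'BEGIN { printf "%.1f\n", t - w }'
}

raw_score_needed 5 -5     # 3.3.0 cap on whitelist scores
raw_score_needed 5 -8.0   # 3.2.x HABEAS_ACCREDITED_COI, for comparison
```

With the 3.3.0 cap the other rules have to add up to about 10 points; under the older -8.0 score they would have needed about 13.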


That being said, whitelists should be constantly policed to maintain their 
reputation and trust levels.  For example, while I currently am impressed by 
DNSWL's performance, I am not pleased that they seem to lack automated 
trap-based enforcement.  Relying only on manual reports and manual 
intervention requires too much effort in the long-term for any organization, 
be it company or volunteer run.


Warren Togami
wtog...@redhat.com




Re: Whitelists, not directly useful to spamassassin...

2009-12-17 Thread Warren Togami

On 12/17/2009 11:27 AM, Jason Bertoch wrote:


If whitelists are to be enabled by default, I believe their score should
be moved considerably more toward zero.

/Jason


I don't necessarily disagree with this desire, as now we know the 
whitelists actually are making almost zero difference to spamassassin's 
results.


We did at least reduce the scores from their default values that were in 
spamassassin-3.2.x as a reasonable compromise.


Warren


Re: Whitelists, not directly useful to spamassassin...

2009-12-17 Thread J.D. Falk
Very interesting data indeed -- and a testament to the accuracy of the 
SpamAssassin rules weighting process.

On Dec 16, 2009, at 4:10 PM, Warren Togami wrote:

 While whitelists are not directly effective (statistically, when averaged 
 across a large corpus), whitelists are powerful tools in indirect ways 
 including:
 
 * Pushing the score beyond the auto-learn threshold for things like Bayes to 
 function without manual intervention.
 * The albeit controversial method where some automated spam trap blacklists 
 use whitelists to help determine if they really should list an IP address.

Another indirect benefit (according to other users of our whitelists) is that 
when they implement a new spam-blocking method, the whitelists serve as kind of 
a safety valve to let legitimate mail through even when the new rule turns out 
to have false positives.

Site-specific whitelists are important for this, too.

 That being said, whitelists should be constantly policed to maintain their 
 reputation and trust levels.

Agreed.

--
J.D. Falk jdf...@returnpath.net
Return Path Inc






Re: Whitelists in SA

2009-12-17 Thread J.D. Falk
On Dec 16, 2009, at 8:35 AM, LuKreme wrote:

 The fact is I *AM* their customer. The people writing them checks are not, 
 they're just their funders. Whitelist companies have to convince admins to use 
 their list. The only way to do that is to have really really really high 
 quality lists that really do prevent spam delivery. If I don't use their 
 whitelist, and others don't use their whitelist, then their model falls apart 
 and they don't make money

Exactly what Return Path has been saying (and acting upon) for years.

(We could debate whether Habeas followed that rule before we bought the 
company, but it's impolite to speak ill of the dead.)

 but no company is enlightened enough to realise this.

Heh.

--
J.D. Falk jdf...@returnpath.net
Return Path Inc






Re: Whitelists in SA

2009-12-17 Thread jdow

From: J.D. Falk jdfalk-li...@cybernothing.org
Sent: Thursday, 2009/December/17 11:21


On Dec 16, 2009, at 8:35 AM, LuKreme wrote:

The fact is I *AM* their customer. The people writing them checks are not, 
they're just their funders. Whitelist companies have to convince admins to 
use their list. The only way to do that is to have really really really 
high quality lists that really do prevent spam delivery. If I don't use 
their whitelist, and others don't use their whitelist, then their model 
falls apart and they don't make money


Exactly what Return Path has been saying (and acting upon) for years.

(We could debate whether Habeas followed that rule before we bought the 
company, but it's impolite to speak ill of the dead.)



but no company is enlightened enough to realise this.


Heh.

<jdow> LuKreme seems to not have much of an engineering education
and zero experience with statistics. It is statistically impossible
to remove all spam perfectly and let all ham through perfectly. Perfect
is a goal you can never reach. If you obsess about it, you will find
yourself round the bend before long. All you can do is adjust the
ratio of missed ham to missed spam one way or the other. Where you
slice is pretty much up to you. What is the cost, the real cost in
lost customers or dollars spent, for a missed ham and for a missed
spam? If you can hit that balance point for minimum overall cost you've
done your job. If you sit and bitch about something not being perfect,
then you're not doing your job.

It is a good thing this issue was raised. It led to appropriate mass
check runs. I expect that will lead to saner scoring within the SA
framework. If not and it bites me, THEN I'll raise the issue again.
Does that seem fair?

{^_^} 



Re: Whitelists in SA

2009-12-16 Thread LuKreme
On 16-Dec-2009, at 08:03, Marc Perkel wrote:
 Res wrote:
 
 no whitelist should ever become default part of SA
 
 the day it is, is the day I look elsewhere.
 
 Why shouldn't white lists become part of SA? Blacklists are part of SA. My 
 hostkarma whitelists are one of the things that keeps me in business because 
 my false positive rates are far far better than SA because of white listing. 
 There are millions of email servers out there that do nothing but send good 
 email 100% of the time that are easy to detect because, unlike spammers, they 
 aren't trying to be evasive. I continue to be of the opinion that SA needs 
 more white rules to detect HAM and not just SPAM.

I would say that no COMMERCIAL whitelist should be part of SA. I use 
whitelisting myself, but I'm not going to trust someone who has a financial 
interest in getting mail delivered to me to be diligent in their whitelisting. 
After all, their bean-counters don't see me as the customer because I'm not 
writing them checks.

The fact is I *AM* their customer. The people writing them checks are not, 
they're just their funders. Whitelist companies have to convince admins to use 
their list. The only way to do that is to have really really really high 
quality lists that really do prevent spam delivery. If I don't use their 
whitelist, and others don't use their whitelist, then their model falls apart 
and they don't make money, but no company is enlightened enough to realise this.


-- 
They say only the good die young. If it works the other way too 
I'm immortal



Re: whitelists (was Re: Barracuda Blacklist)

2009-05-30 Thread ANTICOM-STINGER
On Fri, 2009-05-29 at 12:16 -0600, J.D. Falk wrote:
 Rob McEwen wrote:
 
  Additionally, I'd like to ask, other than being a superb cash-generating
  machine, what good is a whitelist built upon pay-to-enter and NOT based
  on editorial decisions made by non-biased e-mail administrators?
 
 Those two aren't necessarily exclusive.  The standards for inclusion in a 
 whitelist can (and in many cases do) include the same performance metrics 
 that help e-mail administrators stay non-biased, such as user complaint 
 rate, spamtrap hits, and so forth.
 
 (I don't know whether Barracuda's whitelist includes those metrics.)
 
 The additional value to admins is that they don't have to keep watch over 
 the whitelisted IPs -- the whitelist operator handles that.  The fees cover 
 that monitoring, and consulting on improving practices where necessary.
 
 And, of course, if the whitelist operator is lying or slow or otherwise not 
 living up to expectations, the admin simply stops using that whitelist. 
 Lists that nobody uses don't get much business, so there's a direct 
 incentive for the whitelist operator to keep their list squeaky-clean.

The Barracuda white list is an 'exclusive' club and I suspect money has
changed hands. It includes eBay, Amazon, Microsoft etc along with some
very big 'marketing' companies that Michael Perone (former alleged
spammer now part of Barracuda) may have some involvement in.

For the ordinary 'mongs' there is email.reg which is a 'pay to spam'
service :-)

I guess everyone knows that the Barracuda is basically SpamAssassin on a
cheap Linux box. It's full of great open source software glued together
with some very flaky scripts. I cannot believe people pay the money they
do for it. I don't think Barracuda can believe it either.

 



Re: whitelists (was Re: Barracuda Blacklist)

2009-05-30 Thread Res

On Fri, 29 May 2009, ANTICOM-STINGER wrote:


The Barracuda white list is an 'exclusive' club and I suspect money has


This applies to any whitelist, and I never use them. I think my staff and I 
are the *only* ones in a position to decide who to whitelist, and I 
think most ISPs/ASPs are of the same opinion.



For the ordinary 'mongs' there is email.reg which is a 'pay to spam'
service :-)


Tongue in cheek or not, it's essentially true!

--
Res

-Beware of programmers who carry screwdrivers


Re: Whitelists

2005-08-09 Thread salist
Someone can correct me if I am wrong, but I believe you can do it like so...

[EMAIL PROTECTED]




 Indulge me for a moment.

 It has been much too long since I thanked the developers of this program.
 You have no idea what a difference it has made in my life. I have an old
 address, one that's been around for almost ten years, and spamassassin
 catches more than 1000 spams a day aimed directly at my address.

 Now... onto business.

 I am trying to pass CNN breaking news alerts through the filters. My
 user_prefs contains:

 whitelist_from [EMAIL PROTECTED]
 and even
 whitelist_from [EMAIL PROTECTED]

 The problem is that they are sending mail from [EMAIL PROTECTED]

 and it is being flagged as spam. What is the easiest way around this?

 Thanks

 Jack





RE: Whitelists

2005-08-09 Thread Randal, Phil
It's also preferable to use whitelist_from_rcvd.

Unless you really want to let spam from spoofed cnn.com email addresses
through.
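
To illustrate why, here is a rough Python sketch of the difference. It is not SpamAssassin's code: the function names, the example.com addresses and the relay hostnames are all hypothetical, and `fnmatchcase` only approximates SA's glob matching. The idea is that a From-only check passes anything claiming the address, while the rcvd-style check also requires the relay's reverse DNS to fall under the expected domain.

```python
from fnmatch import fnmatchcase


def from_matches(pattern, from_addr):
    """Glob-match the From address, case-insensitively."""
    return fnmatchcase(from_addr.lower(), pattern.lower())


def rdns_matches(relay_rdns, domain):
    """True if the relay's reverse-DNS name is `domain` or a host under it."""
    relay = relay_rdns.lower().rstrip(".")
    domain = domain.lower()
    return relay == domain or relay.endswith("." + domain)


def whitelisted_from_only(pattern, from_addr):
    """whitelist_from-style check: trusts the From address alone."""
    return from_matches(pattern, from_addr)


def whitelisted_from_rcvd(pattern, domain, from_addr, relay_rdns):
    """whitelist_from_rcvd-style check: From address AND relay rDNS must match."""
    return from_matches(pattern, from_addr) and rdns_matches(relay_rdns, domain)


# A spoofed message: the From address looks right, the relay does not.
spoofed_from = "alerts@example.com"
spoofed_relay = "host123.badhost.example"

print(whitelisted_from_only("*@example.com", spoofed_from))
print(whitelisted_from_rcvd("*@example.com", "example.com", spoofed_from, spoofed_relay))
```

The first check lets the spoofed message through and the second does not, because anyone can forge a From line but not the rDNS of the host that handed you the mail.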

Phil


Phil Randal
Network Engineer
Herefordshire Council
Hereford, UK  

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
 Sent: 09 August 2005 14:24
 To: Jack Gostl
 Cc: users@spamassassin.apache.org
 Subject: Re: Whitelists
 
 Someone can correct me if I am wrong, but I believe you can do 
 it like so...
 
 [EMAIL PROTECTED]
 
 
 
 
  Indulge me for a moment.
 
  It has been much too long since I thanked the developers of 
 this program.
  You have no idea what a difference it has made in my life. 
 I have an old
  address, one that's been around for almost ten years, and 
 spamassassin 
  catches more than 1000 spams a day aimed directly at my address.
 
  Now... onto business.
 
  I am trying to pass CNN breaking news alerts through the 
 filters. My 
  user_prefs contains:
 
  whitelist_from [EMAIL PROTECTED]
  and even
  whitelist_from [EMAIL PROTECTED]
 
  The problem is that they are sending mail from [EMAIL PROTECTED]
 
  and it is being flagged as spam. What is the easiest way 
 around this?
 
  Thanks
 
  Jack
 
 
 


Re: Whitelists

2005-08-09 Thread Robert Menschel
Hello Jack,

Tuesday, August 9, 2005, 6:15:22 AM, you wrote:

JG I am trying to pass CNN breaking news alerts through the filters. My
JG user_prefs contains:
JG whitelist_from [EMAIL PROTECTED]
JG and even
JG whitelist_from [EMAIL PROTECTED]
JG The problem is that they are sending mail from [EMAIL PROTECTED]
JG and it is being flagged as spam. What is the easiest way around this?

1) Grab the SARE whitelist config file, which uses the
whitelist_from_rcvd directive rather than whitelist_from, and includes
an entry for
 whitelist_from_rcvd [EMAIL PROTECTED]  cnn.com  # C.N.N.
See http://www.rulesemporium.com/rules.htm#whitelist

2) If you have other important non-spam emails from CNN coming from
other @*.cnn.com email addresses, send me copies with full headers, so
I can add them to the file.

That way others will benefit besides you (though you're more than
welcome to use the sample above for your own use if you want).

Bob Menschel





Re: whitelists

2005-05-20 Thread Matt Kettler
Thomas Deaton wrote:
 Should local whitelists go into /etc/mail/spamassassin/local.cf
 or /etc/MailScanner/rules/spam.whitelist.rules
 ?
 Is one more effective than the other?

They operate differently, and in general the MailScanner level whitelist
(spam.whitelist.rules) is better than using SA's whitelists in local.cf.

However, beware that this file is a MailScanner file, and does not accept
SpamAssassin whitelist_from type syntax.

MailScanner's whitelist can also act on the relay IP.

SA's whitelists, while useful, are in general a little bit of a hack intended to
help those who can't do whitelisting at a higher layer. Tools above SA have
clear access to the message envelope, and tend to suffer less from the
ambiguities that SA suffers from when guessing at the envelope based on hints in
the message headers.
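
As a rough illustration of that guessing, here is a Python sketch. It is not SA's actual logic: the candidate header list, the function names and the sample message are assumptions made for the example. A tool that only sees the finished message has to hunt through headers for an envelope sender, which may be missing and may differ from the From: header.

```python
from email import message_from_string
from email.utils import parseaddr

# Headers a delivery agent might use to record the envelope sender
# (an illustrative list; which one is present depends on the MTA setup).
CANDIDATE_HEADERS = ("X-Envelope-From", "Envelope-Sender", "X-Sender", "Return-Path")


def guess_envelope_sender(raw_message):
    """Return the first address found in a candidate header, or None."""
    msg = message_from_string(raw_message)
    for name in CANDIDATE_HEADERS:
        value = msg.get(name)
        if value:
            addr = parseaddr(value)[1]
            if addr:
                return addr
    return None


def header_from(raw_message):
    """Return the address in the From: header (empty string if absent)."""
    return parseaddr(message_from_string(raw_message).get("From", ""))[1]


# Hypothetical message: the envelope sender and the From: header disagree.
raw = (
    "Return-Path: <bounce@mailer.example.net>\n"
    "From: News Alerts <alerts@example.com>\n"
    "Subject: Breaking news\n"
    "\n"
    "Body text.\n"
)

print(guess_envelope_sender(raw))
print(header_from(raw))
```

A layer that sits at SMTP time never needs this guess, since it is handed the real MAIL FROM and the connecting IP directly.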