Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Wednesday, December 8, 2004, 8:47:18 AM, Larry Rosenbaum wrote:
 How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
 be able to tell the SURBL code not to look up

 *.gov
 *.mil
 *.edu
 and even *.??.us

 since these are unlikely to be hosting spammer web pages.

True, but most people believe that whitelisting entire
TLDs is too broad, and I agree.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Wednesday, December 8, 2004, 8:03:35 AM, Bill Landry wrote:
 - Original Message - 
 From: Daryl C. W. O'Shea [EMAIL PROTECTED]

   Was the whitelist you were referring to really the SURBL server-side
 whitelist?
  
  
   Yes! But local SURBL whitelists are needed to reduce traffic and time.


 I'd much rather see SURBL respond with 127.0.0.0 with a really large TTL
 for white listed domains.  Any sensible setup will run a local DNS cache
 which will take care of the load and time issue.

 I agree, and have suggested a whitelist SURBL several times on the SURBL
 discussion list, but it has always fallen on deaf ears - nary a response.
 It would be nice if someone would at least respond as to why this is not a
 reasonable suggestion.

Bill,
We did discuss this several times before.  Some of the discussion
may have been behind the scenes in the development of
uridnsbl_skip_domain:

  http://bugzilla.spamassassin.org/show_bug.cgi?id=3805

but we also discussed it on the SURBL discussion list.  As I
recall some of the arguments against it included:

1.  Possible misuse: i.e. mistakenly using it as a blacklist.

2.  Performance: A relatively small number of domains appear
most frequently in hams, like yahoo.com, w3.org, etc.
Publishing more than a few hundred whitelisted domains as a
DNS list quickly reaches the point of diminishing returns,
because the frequency of hits falls off rapidly.  Some of
this can be seen in the whitelist sample hit count stats at:

  http://www.surbl.org/dns-queries.whitelist.counts.txt

A cursory statistical analysis of those counts bears this
out (a quick way to check is sketched after this list).

3.  Whitehat domains are pretty stable.  They tend not to
change over the course of many months or even years.

4.  Blackhat domains in contrast tend to change rapidly.
There is statistical research showing that most spam domains
are only used for a few days, then discarded.

5.  Therefore the size and rapid turnover of blackhat domains
make DNS lists a much better fit for them than for whitehat
domains.
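
A minimal sketch of the check mentioned under point 2 (Python; it
assumes the stats file carries one "count domain" pair per line,
which may not match the actual file format):

  import urllib.request

  URL = "http://www.surbl.org/dns-queries.whitelist.counts.txt"

  counts = []
  with urllib.request.urlopen(URL) as f:
      for line in f:
          fields = line.decode("ascii", "replace").split()
          if fields and fields[0].isdigit():
              counts.append(int(fields[0]))

  # sort hit counts in descending order and find how many domains
  # it takes to cover half of all whitelist queries
  counts.sort(reverse=True)
  total = sum(counts)
  running = 0
  for n, c in enumerate(counts, 1):
      running += c
      if running >= total / 2:
          print(f"top {n} domains cover half of all queries")
          break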

There may have been other arguments, but these are probably
the key ones.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Wednesday, December 8, 2004, 9:07:44 AM, Chris Santerre wrote:
 Actually I was only saying to list the top lookups from the whitelist, not
 the 66,500. That is more of a research and exclusion tool. So no more than
 200-300 domains. Check it every month for changes and update.

This is already answered in other messages, but the top 125
most often hit SURBL whitelisted domains are currently listed
in the default 25_uribl.cf file:

  http://spamassassin.apache.org/full/3.0.x/dist/rules/25_uribl.cf

# Top 125 domains whitelisted by SURBL
uridnsbl_skip_domain yahoo.com w3.org msn.com com.com yimg.com
uridnsbl_skip_domain hotmail.com doubleclick.net flowgo.com ebaystatic.com aol.com
[...]

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Wednesday, December 8, 2004, 9:49:55 AM, Daryl O'Shea wrote:
 Additionally, assuming there isn't an extreme query frequency drop off
 after the top 100 or 200 excluded domains, it would be nice to have 
 access to the rest of the exclusion list which wouldn't be realistic to 
 be storing (and currently copying around) in memory.

 There's got to be a reason why SpamAssassin currently only includes the 
 top 100 or whatever excluded domains... either the rest of the data
 wasn't useful or it wasn't worth the performance hit having them in memory.

I believe the 125 cutoff was entirely arbitrary, but it happens to
correspond almost exactly to the 50th percentile of DNS queries
against whitelisted domains, which is a happy coincidence and
a perfectly reasonable cutoff point.

 New additions to the exclusion list would immediately be available too, 
 not that that is really a huge concern.

Remember that the only reason to build this hard-coded exclusion
list into SA was to prevent unnecessary DNS queries from
happening in the first place:

  http://bugzilla.spamassassin.org/show_bug.cgi?id=3805

The much larger global whitelist is applied internally in
SURBLs to prevent those domains from ever getting listed;
there it serves as an exclusion list.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Daryl C. W. O'Shea
Jeff Chan wrote:
 On Wednesday, December 8, 2004, 9:06:26 AM, Daryl O'Shea wrote:
It doesn't cause more lookups for anyone.  A local white list file would
reduce lookups at the expense of process size (and time if the white
list is very large).


 The SA developers chose an appropriately small exclusion list
 to hard code as the top 125 most often hit whitelist entries.
 Those top hits are largely invariant and would represent a
 large portion of the DNS queries if not excluded.  It doesn't
 make much sense to serve up a small, nearly invariant list
 with a DNS list, long TTLs or not.

 Jeff C.
Yes, as I noted later in the thread.
There's got to be a reason why SpamAssassin currently only includes the 
top 100 or whatever excluded domains... either the rest of the data 
wasn't useful or it wasn't worth the performance hit having them in memory.

I only suggested an alternative to what Chris was proposing (having
Rules-du-Jour-style, presumably massive, .cf exclusion lists), which
in my opinion aren't appropriate (the massive lists, that is) due to
the memory overhead.

I'm fully aware, as I think everyone is now, of the exclusion list
included with 3.0.

Daryl


RE: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread hamann . w
 How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
 be able to tell the SURBL code not to look up
 
 *.gov
 *.mil
 *.edu
 and even *.??.us
 
 since these are unlikely to be hosting spammer web pages.
 
 Larry
 
 

Hi,

I have received obscure web traffic from a .mil site recently - it looked
like an infected Windows box trying to inflict pain on a Windows web server
(or would visitors from .mil sites conduct a vulnerability scan on remote
sites before they view them?)

Wolfgang Hamann





Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Wednesday, December 8, 2004, 11:41:41 PM, hamann w wrote:
 How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
 be able to tell the SURBL code not to look up
 
 *.gov
 *.mil
 *.edu
 and even *.??.us
 
 since these are unlikely to be hosting spammer web pages.

 I have received obscure web traffic from a .mil site recently - it looked
 like an infected Windows box trying to inflict pain on a Windows web server
 (or would visitors from .mil sites conduct a vulnerability scan on remote
 sites before they view them?)

That's bad, but remember that SURBLs are usually used to
check message body URIs and not sender domains.

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



RE: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Rosenbaum, Larry M.

 -Original Message-
 From: Jeff Chan [mailto:[EMAIL PROTECTED]
 Posted At: Wednesday, December 08, 2004 8:45 PM
 Posted To: sa-users
 Conversation: Feature Request: Whitelist_DNSRBL
 Subject: Re: Feature Request: Whitelist_DNSRBL
 
 On Wednesday, December 8, 2004, 8:47:18 AM, Larry Rosenbaum wrote:
  How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
  be able to tell the SURBL code not to look up
 
  *.gov
  *.mil
  *.edu
  and even *.??.us
 
  since these are unlikely to be hosting spammer web pages.
 
 True, but most people believe that whitelisting entire
 TLDs is too broad, and I agree.
 
 Jeff C.

I understand, which is why I suggested a configuration option rather
than hardwiring the TLDs to skip into the code.  We exchange a lot of
mail with folks in these domains, so there is likely to be an upside in
not having to look up any .gov or .mil addresses that appear in the
message.  And if we do get spam advertising .gov or .mil web addresses,
there's something very wrong going on and we can report it.  Most other
email admins won't see the same tradeoffs.
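
For illustration, the requested config might look like this
(hypothetical syntax - uridnsbl_skip_domain in SA 3.0 accepts only
literal domains, so wildcards like these are the feature being
requested, not something that works today):

  # NOT valid SA 3.0 syntax; wildcard support is the request here
  uridnsbl_skip_domain *.gov *.mil *.edu *.??.us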

By the way, if you have a message that's been forwarded in such a way
that the original recipient addresses become part of the message text,
the URI extraction code will extract these too.  Therefore, if you get
one of those "forward this to everyone you know" messages, it could
result in a lot of SURBL lookups.
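
For illustration, a crude version of that extraction in Python (the
real SpamAssassin parser is Perl and considerably more thorough; the
sample text and domains below are made up):

  import re

  body = ('Please forward this to everyone you know! '
          'alice@dept.example.gov, bob@lists.example.edu '
          'More info: http://www.example.mil/notice')

  # grab the host part of every address and URL in the body; each
  # unique host would then cost one SURBL DNS query unless skipped
  hosts = re.findall(r'(?:@|https?://)([A-Za-z0-9.-]+)', body)
  print(hosts)  # ['dept.example.gov', 'lists.example.edu', 'www.example.mil']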

From  Thu Dec  9 11:17:29 2004
Return-Path: 
Received: from emroute1.cind.ornl.gov (localhost [127.0.0.1])
 by emroute1.cind.ornl.gov (PMDF V6.2-X27 #30899)
 with ESMTP id [EMAIL PROTECTED] for
 [EMAIL PROTECTED] (ORCPT [EMAIL PROTECTED]); Thu,
 09 Dec 2004 11:17:29 -0500 (EST)
Received: from www2.ornl.gov (www2.ornl.gov [160.91.4.32])
 by emroute1.cind.ornl.gov (PMDF V6.2-X27 #30899)
 with ESMTP id [EMAIL PROTECTED] for
 [EMAIL PROTECTED] (ORCPT [EMAIL PROTECTED]); Thu,
 09 Dec 2004 11:17:12 -0500 (EST)
Received: from PROCESS-DAEMON.www2.ornl.gov by www2.ornl.gov
 (PMDF V6.2-1 #31038) id [EMAIL PROTECTED] for
 [EMAIL PROTECTED]; Thu, 09 Dec 2004 11:05:14 -0500 (EST)
Received: from www2.ornl.gov (PMDF V6.2-1 #31038)
 id [EMAIL PROTECTED]; Thu, 09 Dec 2004 10:12:34 -0500 (EST)
Date: Thu, 09 Dec 2004 10:12:34 -0500 (EST)
From: PMDF Internet Messaging [EMAIL PROTECTED]
Subject: Successful page transmission
In-reply-to: Your message dated Thu, 09 Dec 2004 10:11:56 -0500 (EST)
 [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Message-id: [EMAIL PROTECTED]
MIME-version: 1.0
Content-type: multipart/report;
 boundary=Boundary_(ID_Z1dAJ1KMtcGZAOKBpgNNIg); report-type=delivery-status


--Boundary_(ID_Z1dAJ1KMtcGZAOKBpgNNIg)
Content-type: text/plain; charset=us-ascii
Content-language: EN-US
Content-transfer-encoding: 7BIT

This report relates to a message you sent with the following header fields:

  Message-id: [EMAIL PROTECTED]
  Date: Thu, 09 Dec 2004 10:11:56 -0500 (EST)
  From: [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Subject: test

Your message has been successfully delivered to the following recipients:

  Recipient address: [EMAIL PROTECTED]
  Reason: Transmission of your pages has been completed: delivered in 1 part
Subject: test
2 transmission attempts were made

The content of the pages follow.



F:root S:test M:Testing paging on www2




--Boundary_(ID_Z1dAJ1KMtcGZAOKBpgNNIg)
Content-type: message/delivery-status

Reporting-MTA: dns;www2.ornl.gov (cingularpage.ornl.gov)

Action: delivered
Status: 2.1.5
 (Transmission of your pages has been completed: delivered in 1 part)
Original-recipient: rfc822;[EMAIL PROTECTED]
Final-recipient: rfc822;[EMAIL PROTECTED]

--Boundary_(ID_Z1dAJ1KMtcGZAOKBpgNNIg)
Content-type: TEXT/RFC822-HEADERS

Return-path: [EMAIL PROTECTED]
Received: from cingularpage.ornl.gov by www2.ornl.gov (PMDF V6.2-1 #31038)
 id [EMAIL PROTECTED]; Thu,  9 Dec 2004 10:12:34 -0500 (EST)
Received: from emroute1.cind.ornl.gov (smtp.ornl.gov [160.91.4.119])
 by www2.ornl.gov (PMDF V6.2-1 #31038)
 with ESMTP id [EMAIL PROTECTED] for
 [EMAIL PROTECTED]; Thu, 09 Dec 2004 10:11:57 -0500 (EST)
Received: from emroute1.cind.ornl.gov by emroute1.cind.ornl.gov
 (PMDF V6.2-X27 #30899) id [EMAIL PROTECTED] for
 [EMAIL PROTECTED]; Thu, 09 Dec 2004 10:11:56 -0500 (EST)
Date: Thu, 09 Dec 2004 10:11:56 -0500 (EST)
From: [EMAIL PROTECTED]
Subject: test
To: [EMAIL PROTECTED]
Message-id: [EMAIL PROTECTED]
MIME-version: 1.0
Content-type: TEXT/PLAIN
Content-transfer-encoding: QUOTED-PRINTABLE



--Boundary_(ID_Z1dAJ1KMtcGZAOKBpgNNIg)--



Re: Feature Request: Whitelist_DNSRBL

2004-12-09 Thread Jeff Chan
On Thursday, December 9, 2004, 8:14:16 AM, Larry Rosenbaum wrote:
 By the way, if you have a message that's been forwarded in such a way
 that the original recipient addresses become part of the message text,
 the URI extraction code will extract these too.  Therefore, if you get
 one of those "forward this to everyone you know" messages, it could
 result in a lot of SURBL lookups.

Code using SURBLs is supposed to look for URIs, and message
headers don't usually look like URIs.

  http://www.surbl.org/implementation.html

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Matt Kettler
At 10:17 AM 12/8/2004 -0500, Chris Santerre wrote:
OK, we know that the popular domains like yahoo.com and such are hard coded
into SA to be skipped on DNSRBL lookups. But it would be great to have a
function to add more locally.
Um. They are?? AFAIK there are absolutely no whitelists for the DNSRBLs in
SA itself.

Don't confuse the EXISTING_DOMAINS list in DNS.pm with a whitelist.
That's actually a list of domains that are used to test if your DNS is 
working if you don't have dns_available set to yes. SA does a quick MX 
query for one of the domains in the list, and if it gets an answer, it 
knows it's working...
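
Roughly what that probe does, sketched in Python with dnspython (SA's
real code is Perl, and the actual probe list lives in DNS.pm; the
domains below are just placeholders):

  import dns.exception
  import dns.resolver  # dnspython

  def dns_available(probe_domains):
      for domain in probe_domains:
          try:
              dns.resolver.resolve(domain, "MX")
              return True
          except dns.resolver.NoAnswer:
              return True   # the resolver answered; DNS is working
          except dns.exception.DNSException:
              continue      # timeout or failure; try the next probe
      return False

  print(dns_available(["apache.org", "w3.org"]))  # placeholder probes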

However, I do agree it would be nice to be able to have a DNSBL whitelist 
capability, if for no other reason than fixing any listings that might 
cause short-term FPs.

Thinking one step bigger, it would be even better to feed this a file. This
way maybe SURBL can create a file for the top hit legit domains. Then using
SARE and RDJ, people could update that. This would reduce a lot of traffic
and time.
Wait, now you're bringing SURBL into this.. are you talking normal DNSRBLs,
or URIDNSBLs? Or both?

Was the whitelist you were referring to really the SURBL server-side whitelist?
This might also help with the mysterious bug we have seen where some local
domains are being flagged as SURBL hit, when they aren't in SURBL. Perhaps
whitelisting local domains so they are skipped would do away with this.
Agreed.. It would provide users a short-term fix, although really the 
problem does need to be rooted out at some point..

Thoughts, suggestions, or coffee?
All of the above?




Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Daryl C. W. O'Shea
Chris Santerre wrote:
 Was the whitelist you were referring to really the SURBL server-side 
whitelist?


 Yes! But local SURBL whitelists are needed to reduce traffic and time.

I'd much rather see SURBL respond with 127.0.0.0 with a really large TTL 
for white listed domains.  Any sensible setup will run a local DNS cache 
which will take care of the load and time issue.
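
If SURBL did publish such records, client handling could be as simple
as this Python/dnspython sketch (the 127.0.0.0 convention and the long
TTL are the proposal here, not something SURBL actually serves, and
the zone name is illustrative):

  import dns.resolver

  def surbl_status(domain, zone="multi.surbl.org"):
      try:
          answers = dns.resolver.resolve(f"{domain}.{zone}", "A")
      except dns.resolver.NXDOMAIN:
          return "unlisted"
      addrs = {rr.to_text() for rr in answers}
      # proposed convention: 127.0.0.0 means "whitelisted"
      return "whitelisted" if "127.0.0.0" in addrs else "listed"

With a long TTL, a local caching resolver would answer repeat queries
for the whitelisted domains itself, which is the load-and-time point
above.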

Daryl


Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Michael Barnes
On Wed, Dec 08, 2004 at 10:26:15AM -0500, Matt Kettler wrote:
 At 10:17 AM 12/8/2004 -0500, Chris Santerre wrote:
 OK, we know that the popular domains like yahoo.com and such are hard coded
 into SA to be skipped on DNSRBL lookups. But it would be great to have a
 function to add more locally.
 
 Um. They are?? AFAIK there are absolutely no whitelists to the DNSRBLs in 
 SA itself.

I'm not sure if DNSRBLs are the same as URIDNSBLs, or if this was the
intent of the original poster, but SA 3.0.1 added the configuration
option 'uridnsbl_skip_domain', which skips the URIBL lookups for URIs
in the listed domains.  The following domains have been added to this
list by default in 25_uribl.cf:

4at1.com
5iantlavalamp.com
adobe.com
advertising.com
afa.net
akamai.net
akamaitech.net
amazon.com
aol.com
apache.org
apple.com
arcamax.com
atdmt.com
att.net
bbc.co.uk
bfi0.com
bravenet.com
bridgetrack.com
cc-dt.com
chase.com
cheaptickets.com
chtah.com
citibank.com
citizensbank.com
classmates.com
click-url.com
cnet.com
cnn.com
com.com
comcast.net
constantcontact.com
debian.org
directtrack.com
doubleclick.net
dsbl.org
dsi-enews.net
e-trend.co.jp
earthlink.net
ebay.com
ebaystatic.com
ed10.net
ed4.net
edgesuite.net
ediets.com
exacttarget.com
extm.us
flowgo.com
geocities.com
gmail.com
go.com
google.com
grisoft.com
gte.net
hitbox.com
hotbar.com
hotmail.com
hyperpc.co.jp
ibm.com
ientrymail.com
incredimail.com
investorplace.com
jexiste.fr
joingevalia.com
m0.net
mac.com
macromedia.com
mail.com
marketwatch.com
mcafee.com
mediaplex.com
messagelabs.com
microsoft.com
monster.com
moveon.org
msn.com
mycomicspage.com
myweathercheck.com
netatlantic.com
netflix.com
norman.com
nytimes.com
p0.com
pandasoftware.com
partner2profit.com
paypal.com
pcmag.com
plaxo.com
postdirect.com
prserv.net
quickinspirations.com
redhat.com
rm04.net
roving.com
rr.com
rs6.net
sbcglobal.net
sears.com
sf.net
shockwave.com
si.com
sitesolutions.it
smileycentral.com
sourceforge.net
spamcop.net
speedera.net
sportsline.com
sun.com
suntrust.com
terra.com.br
tiscali.co.uk
topica.com
ual.com
uclick.com
unitedoffers.com
ups.com
verizon.net
w3.org
washingtonpost.com
weatherbug.com
xmr3.com
yahoo.co.uk
yahoo.com
yahoogroups.com
yimg.com
yourfreedvds.com

Mike 

-- 
/-\
| Michael Barnes [EMAIL PROTECTED] |
| UNIX Systems Administrator  |
| College of William and Mary |
| Phone: (757) 879-3930   |
\-/


Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Bill Landry
- Original Message - 
From: Daryl C. W. O'Shea [EMAIL PROTECTED]

   Was the whitelist you were referring to really the SURBL server-side
 whitelist?
  
  
   Yes! But local SURBL whitelists are needed to reduce traffic and time.


 I'd much rather see SURBL respond with 127.0.0.0 with a really large TTL
 for white listed domains.  Any sensible setup will run a local DNS cache
 which will take care of the load and time issue.

I agree, and have suggested a whitelist SURBL several times on the SURBL
discussion list, but it has always fallen on deaf ears - nary a response.
It would be nice if someone would at least respond as to why this is not a
reasonable suggestion.

Bill



RE: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Chris Santerre


-Original Message-
From: Bill Landry [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 08, 2004 11:04 AM
To: users@spamassassin.apache.org; [EMAIL PROTECTED]
Subject: Re: Feature Request: Whitelist_DNSRBL


- Original Message - 
From: Daryl C. W. O'Shea [EMAIL PROTECTED]

   Was the whitelist you were referring to really the SURBL 
server-side
 whitelist?
  
  
   Yes! But local SURBL whitelists are needed to reduce 
traffic and time.


 I'd much rather see SURBL respond with 127.0.0.0 with a 
really large TTL
 for white listed domains.  Any sensible setup will run a 
local DNS cache
 which will take care of the load and time issue.

I agree, and have suggested a whitelist SURBL several times on 
the SURBL
discussion list, but it has always fallen on deaf ears - nary 
a response.
It would be nice if someone would at least respond as to why 
this is not a
reasonable suggestion.

Well we have talked about it and didn't come up with a solid answer.
The idea would cause more lookups and time for those who don't cache DNS. We
do have a whitelist that our private research tools do poll. The idea is
that if it isn't in SURBL then it is white.

This also puts more work on the already overworked contributors. ;)

--Chris


Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread David Hooton
On Wed, 8 Dec 2004 08:03:35 -0800, Bill Landry [EMAIL PROTECTED] wrote:
 I agree, and have suggested a whitelist SURBL several times on the SURBL
 discussion list, but it has always fallen on deaf ears - nary a response.
 It would be nice if someone would at least respond as to why this is not a
 reasonable suggestion.

The flaw in offering a DNS based whitelist is that it encourages
people to place a negative score on it.  The problem with this is that
spammers can poison messages with whitelisted domains, thereby
bypassing the power of the SURBL.

The concept of "whitelist" in the SURBL world is more of an "exclusion
list", as in "we exclude these domains from being listed" rather than
"we consider the presence of these domains in an email to be a good
sign of ham".

An excluded domain is therefore ignored in all data and not allocated
a score positively or negatively, so trying to poison a message with
whitelisted domains is therefore pointless.

I think we either need to look at a DNS version of
uridnsbl_skip_domain with long TTLs, or we should look at releasing a
.cf file.  I personally think the more proper implementation may be
the DNS based version, in order to avoid BigEvil-type situations.

Cheers!
-- 
Regards,

David Hooton


Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Bill Landry
- Original Message - 
From: Chris Santerre [EMAIL PROTECTED]

 -Original Message-
 From: Bill Landry [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, December 08, 2004 11:04 AM
 To: users@spamassassin.apache.org; [EMAIL PROTECTED]
 Subject: Re: Feature Request: Whitelist_DNSRBL
 
 
 - Original Message - 
 From: Daryl C. W. O'Shea [EMAIL PROTECTED]
 
Was the whitelist you were referring to really the SURBL
 server-side
  whitelist?
   
   
Yes! But local SURBL whitelists are needed to reduce
 traffic and time.
 
 
  I'd much rather see SURBL respond with 127.0.0.0 with a
 really large TTL
  for white listed domains.  Any sensible setup will run a
 local DNS cache
  which will take care of the load and time issue.
 
 I agree, and have suggested a whitelist SURBL several times on
 the SURBL
 discussion list, but it has always fallen on deaf ears - nary
 a response.
 It would be nice if someone would at least respond as to why
 this is not a
 reasonable suggestion.

 Well we have talked about it and didn't come up with a solid answer.
 The idea would cause more lookups and time for those who don't cache DNS.
 We do have a whitelist that our private research tools do poll. The idea
 is that if it isn't in SURBL then it is white.

 This also puts more work on the already overworked contributors. ;)

Actually, I was thinking of the whitelist that Jeff has already compiled at
http://spamcheck.freeapp.net/whitelist-domains.sort (currently over 66,500
whitelisted domains).  If you set a long TTL on the query responses, it
would certainly cut down on follow-up queries for anyone that is running a
caching DNS.  It would also be a lot less resource intensive than trying to
run a local whitelist.cf of over 66,500 whitelisted domains.

Anyway, just a thought...

Bill



Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Bill Landry
- Original Message - 
From: David Hooton [EMAIL PROTECTED]

 On Wed, 8 Dec 2004 08:03:35 -0800, Bill Landry [EMAIL PROTECTED]
wrote:
  I agree, and have suggested a whitelist SURBL several times on the SURBL
  discussion list, but it has always fallen on deaf ears - nary a
response.
  It would be nice if someone would at least respond as to why this is not
a
  reasonable suggestion.

 The flaw in offering a DNS based whitelist is that it encourages
 people to place a negative score on it.  The problem with this is that
 spammers can poison messages with whitelisted domains, thereby
 bypassing the power of the SURBL.

I agree, it should not be used as a HAM indicator, way too easy to abuse.  I
was suggesting that the whitelist be used as a way to exclude the domain
from being constantly queried against the SURBL name servers.

 The concept of "whitelist" in the SURBL world is more of an "exclusion
 list", as in "we exclude these domains from being listed" rather than
 "we consider the presence of these domains in an email to be a good
 sign of ham".

Exactly.

 An excluded domain is therefore ignored in all data and not allocated
 a score positively or negatively, so trying to poison a message with
 whitelisted domains is therefore pointless.

Yep, agree wholeheartedly.

 I think we either need to look at a DNS version of
 uridnsbl_skip_domain with long TTL's or we should look at releasing a
 .cf file.  I personally think the more proper implementation may be
 the DNS based version in order to avoid BigEvil type situations.

Indeed, my thoughts exactly.

Bill



RE: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Rosenbaum, Larry M.
How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
be able to tell the SURBL code not to look up

*.gov
*.mil
*.edu
and even *.??.us

since these are unlikely to be hosting spammer web pages.

Larry



RE: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Chris Santerre


-Original Message-
From: Rosenbaum, Larry M. [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 08, 2004 11:47 AM
To: users@spamassassin.apache.org
Subject: RE: Feature Request: Whitelist_DNSRBL


How about a way to use wildcards with uridnsbl_skip_domain?  I'd like to
be able to tell the SURBL code not to look up

*.gov
*.mil
*.edu
and even *.??.us


LOL we've listed a few edu so far :)

LOL @ BigEvil situation, it's now famous!

Actually I was only saying to list the top lookups from the whitelist, not
the 66,500. That is more of a research and exclusion tool. So no more than
200-300 domains. Check it every month for changes and update.

I'll probably make up a .cf file and start testing it. 

--Chris


Re: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Daryl C. W. O'Shea
Bill Landry wrote:
 From: Chris Santerre [EMAIL PROTECTED]

 Well we have talked about it and didn't come up with a solid
 answer. The idea would cause more lookups and time for those who
 don't cache DNS.
It doesn't cause more lookups for anyone.  A local white list file would 
reduce lookups at the expense of process size (and time if the white 
list is very large).

Besides, if someone doesn't want to take the 1-5 minutes it takes to 
set up a local DNS cache they're probably not too interested in saving 
time anyway.

 We do have a whitelist that our private research tools do poll. The 
 idea is that if it isn't in SURBL then it is white.

 This also puts more work on the already overworked contributors. ;)

How so?  The lookup code is already compatible as is, it's just a matter 
of adding the records to each of the SURBL zones... from the already 
existing data files.  That'd take some effort, but I can't imagine it 
would require anything more than trivial changes... although I've been 
wrong before.
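
For what it's worth, generating the records from the existing
whitelist file looks close to trivial (Python sketch; one domain per
line in whitelist-domains.sort is an assumption about that file's
format, and the zone name and TTL are illustrative):

  # emit one A record per whitelisted domain, with a week-long TTL
  with open("whitelist-domains.sort") as f:
      for line in f:
          domain = line.strip()
          if domain and not domain.startswith("#"):
              print(f"{domain}.multi.surbl.org. 604800 IN A 127.0.0.0")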

 Actually, I was thinking of the whitelist that Jeff has already
 compiled at http://spamcheck.freeapp.net/whitelist-domains.sort
 (currently over 66,500 whitelisted domains).  If you set a long TTL 
on  the query responses, it would certainly cut down on follow-up queries
 for anyone that is running a caching dns.  It would also be a lot less
 resource intensive then trying to run a local whitelist.cf of over
 66,500 whitelisted domains.

Ditto.  Even if someone isn't running a caching name server, it's highly 
unlikely that their ISP isn't.

Daryl



RE: Feature Request: Whitelist_DNSRBL

2004-12-08 Thread Chris Santerre


  We do have a whitelist that our private research tools do poll. The
  idea is that if it isn't in SURBL then it is white.

  This also puts more work on the already overworked contributors. ;)


How so?  The lookup code is already compatible as is, it's just a matter
of adding the records to each of the SURBL zones... from the already
existing data files.  That'd take some effort, but I can't imagine it
would require anything more than trivial changes... although I've been
wrong before.

Assuming that this whitelist would be used to LOWER the score of an email,
and not just exclude domains from SURBL, then we would go through even
more research before we whitelist a domain. There is a LOT of work that goes
into adding a domain to our whitelist, and that is JUST for exclusion!

It takes at least twice as long to see if someone is white vs black. 

That's where the more work would come from. You should see some of the long
threads on a single domain up for being whitelisted. It's a good thing Jeff
and I have a sense of humor with each other ;)

My whole idea was skipping the lookup entirely. Why would you want to do a
lookup for Google even if it is cached?

--Chris