Todd,

    1) Possibly. Still yes IMO.
    2) Not presently.
    3) Definitely, just help me make a list.

It is a little sticky, but my results showed it not to be problematic.  I am scoring this at only 30% of my fail weight, and DUL lists can have similar false positive rates (only 2% of filter hits in my test of about 5,500 messages).  I suppose that in some regions where DSL is in wider use, this might produce more FP's.

The thing is though that anything that is 95% or more indicative of spam, especially something where the FP's don't tend to fail other high scoring tests, should be used...at least that's my feeling.

Knowing the source of FP's is critical to improving this test, as you pointed out.  My first attempt at stats was to look at the numbers and not at the trends in FP's outside of the type of folks mailing.  I have noted that some DSL lines are definitely marked as ADSL in the reverse, and if for a particular provider they gave separation with SDSL, we could then counterbalance.  I'm intending on monitoring this.  I have also noted that certain providers have different domains for business and residential, or so it would seem, but I haven't been keeping track as of yet.  I think this is the next step, but if there is just as much or near as much spam coming from SDSL or business-class domains, counterbalancing would be non-productive.  I'm not sure what the trend would be.

So unlike scoring all AOL, this produces fewer FP's and it targets a big problem area for spam, especially the stuff that I see getting through.  Trustworthy mail servers on such networks shouldn't see any negative impact from having 30% of fail weight added.  It's a little wide in scope, but I can't figure out any better way of doing this except for some counterbalances which I hope folks will add to the list.
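
To make the weighting argument concrete, here is a rough sketch in Python (not Declude configuration syntax).  Only the 30% figure for DYNAMIC comes from this message; the DUL weights and the function names are placeholders I made up for illustration.  It checks that DYNAMIC plus any one other test cannot reach fail weight by itself.

```python
# Weights as percentages of fail weight. Only DYNAMIC's 30 comes from this
# message; the two DUL values are hypothetical placeholders.
FAIL_AT = 100
WEIGHTS = {"DYNAMIC": 30, "EASYNET-DYNA": 40, "SORBS-DUL": 50}


def score(failed_tests, weights=WEIGHTS):
    """Sum the weights of every test a message failed."""
    return sum(weights[name] for name in failed_tests)


def unsafe_partners(weights=WEIGHTS, test="DYNAMIC", fail_at=FAIL_AT):
    """List tests that, combined with `test` alone, would reach fail weight."""
    return sorted(
        name for name, w in weights.items()
        if name != test and weights[test] + w >= fail_at
    )


print(score(["DYNAMIC", "EASYNET-DYNA"]))  # combined weight of a double hit
print(unsafe_partners())                   # tests that need a lower weight
```

If `unsafe_partners()` returns anything, one of the two weights in that pair probably needs to come down.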

Not every business-class network will be excludable because of mixing or amorphous naming conventions, but some certainly can be.  Road Runner business customers might be a problem for instance.  One thing that I'm looking into right now is the Declude server which is at cpe-24-107-232-14.ma.charter.com.  I'm thinking that "cpe" is for customer premises equipment, and if that is exclusive to Charter business customers, it could be excluded.  All we need are some Charter residential customers to test this against...but we also have to make sure that it isn't a residential class of service that allows servers (I have one on mine).
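
As a sketch of what a counterbalance could look like (Python for illustration only, not the filter file's syntax): the marker list, the point values and the function name below are all hypothetical, and the "sdsl" entry assumes a provider that actually labels its SDSL lines that way in reverse DNS.

```python
# Hypothetical counterbalance entries: (marker in reverse DNS, points to give back).
# "sdsl" assumes a provider that labels business SDSL lines in its reverse DNS.
COUNTERBALANCES = [
    ("sdsl", 3),
]


def adjusted_weight(reverse_dns, base_weight=3):
    """Return the DYNAMIC weight after subtracting the best matching credit."""
    host = reverse_dns.lower()
    credit = max(
        (points for marker, points in COUNTERBALANCES if marker in host),
        default=0,
    )
    return max(base_weight - credit, 0)


print(adjusted_weight("sdsl-10-20-30-40.example.net"))  # counterbalanced
print(adjusted_weight("adsl-10-20-30-40.example.net"))  # full weight kept
```

An entry only belongs in the list once we know the marker is exclusive to a class of service that is expected to host mail servers.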

In the meantime, you might try it out with a 0 score and see what hits it gets on spam that would otherwise get through, or just got through.  Also note the potential scoring hits on real E-mail and the scores such E-mail receives on other tests (which should be low unless the sender operates a third-rate bulk mailing script, and some will; the Dimac JMail ActiveX E-mail component, for instance, allows configurations that trip BADHEADERS).

Your points are of course good and everyone should consider them before taking my view on the accuracy of this filter.

Matt



Todd Holt wrote:

I hate to open up old wounds, but…going back to the AOL filtering issue…

 

I would agree that the few (very few) legitimate mail servers connecting over cable modem or consumer DSL are acceptable false positives when filtered out as SPAM sources.

 

But one must consider that many legitimate mail servers are operated over business class DSL connections.  I used an SDSL (768k) connection for 2 years very successfully.

 

The key difference here is between consumer DSL (ADSL) and business DSL (SDSL).  If a filter cannot distinguish between ADSL and SDSL then the false positive rate (IMHO) is too high as it will lump all of the SDSL connections in with the SPAM sources.

 

  1. Can this filter distinguish between ADSL and SDSL? If not, is this acceptable?
  2. Is the filter doing this?
  3. Are there any unique instructions for doing this?

 

Any thoughts?

 

Todd Holt

Xidix Technologies, Inc

Las Vegas, NV  USA

www.xidix.com

 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matthew Bramble
Sent: Wednesday, September 17, 2003 3:54 PM
To: [EMAIL PROTECTED]
Subject: [Declude.JunkMail] DYNAMIC - 09/17/2003 - A new filter to detect IP'd reverse DNS entries

 

Ok, I've been testing this one for about a week with very positive results.  It's still a work in progress as far as exclusions go (candidates welcome), but I have been using it with a good deal of success as is for the past week.  The filter is called DYNAMIC and it can be downloaded at the following location:

http://www.mailpure.com/decludefilters/dynamic/Dynamic_09-17-2003.txt

(Links to the most recent versions of the filters that I have been testing are located at the bottom of this message.  I will put up some HTML soon to help enable the process since I have noted a few people downloading older versions from older postings to this group)

What the DYNAMIC filter does is detect E-mail from a sender with a reverse DNS lookup that has the tell-tale marks of being used for dial-up, DSL or cable broadband access.  I have found it to be very useful in scoring spam and it has a good impact on messages that don't fail many tests without being responsible for rejecting messages due to false positives.  As an extra added bonus, the use of the WHITELIST AUTH functionality that Scott announced yesterday is beneficial to this filter's use (explained in the file).
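
To illustrate the idea only (this is not the contents of the filter file, and it is Python rather than Declude filter syntax), a reverse DNS name can be checked for an embedded IP address or an access-type keyword.  The keyword list, the patterns and the function name are assumptions chosen for this sketch.

```python
import re

# Hypothetical keyword list; the real filter file may look for different markers.
ACCESS_KEYWORDS = re.compile(
    r"(?:^|[.\-_\d])(?:adsl|dsl|dialup|dial|dynamic|dyn|cable|ppp|dhcp|cpe)"
    r"(?=$|[.\-_\d])"
)

# Four groups of 1-3 digits joined by dots, dashes or underscores.
EMBEDDED_IP = re.compile(
    r"(?<!\d)(\d{1,3})[.\-_](\d{1,3})[.\-_](\d{1,3})[.\-_](\d{1,3})(?!\d)"
)


def looks_dynamic(reverse_dns):
    """Return True if the name carries an embedded IP or an access keyword."""
    host = reverse_dns.strip().rstrip(".").lower()
    if not host:
        return False
    match = EMBEDDED_IP.search(host)
    has_ip = match is not None and all(int(g) <= 255 for g in match.groups())
    has_keyword = ACCESS_KEYWORDS.search(host) is not None
    return has_ip or has_keyword


print(looks_dynamic("cpe-24-107-232-14.ma.charter.com"))
print(looks_dynamic("mail.example.com"))
```

Keywords are only matched when bounded by a separator or digit, so a name such as smtp.dynamics-corp.example is not flagged merely for containing "dyn".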

The method is a little controversial because it doesn't look for direct signs of spam such as OBFUSCATION, GIBBERISH or GIBBERISHSUB, but instead looks at where the message is coming from, knowing that dial-up, DSL and cable broadband address space is becoming increasingly problematic for spam origination, maybe due to recent virus outbreaks installing SMTP servers or backdoors on always-on connected machines.  There are plenty of examples, though, where such space hosts legitimate mail servers without customized reverse DNS, typically belonging to business users.  Declude's own servers should trip this test if not whitelisted.  Therefore the scoring is low; however, in a recent thorough test of over 1,000 filter hits (excluding Declude of course), the false positive rate was still only 2.0% of filter hits and nothing failed because of this test alone.  Unlike the other filters that I have recently been testing, this one doesn't tend to catch opt-in advertising, just small-business false positives from mostly properly configured machines that score very low, so adding a few points to some of them does no real harm.

This test also often crosses over into DUL territory, especially the less than pure EASYNET-DYNA blocklist.  Because of that, one should be careful to adjust the scores so that a double hit won't fail a message alone.  I also use SORBS-DUL, which seems remarkably true to the idea of listing only dynamic addresses on which mail servers aren't allowed to be hosted, so I don't feel there is any danger in having that test as a part of the mix.  Please see the detailed comments in the filter file for more information on configuration.  For those statistically inclined, I did a painstaking review of 2 days of traffic in order to get an impression of exactly what the impact was:

DYNAMIC FILTER STATISTICS
==================================================================
5,530 - Unique Incoming Messages
4,183 - Messages Rejected as Spam from All Filters (75.6% of Unique Incoming Messages, approximate)
1,053 - Filter Hits (19% of Unique Incoming Messages)
==================================================================
1,032 - Positives (98.0% of Filter Hits)
   21 - False Positives (2.0% of Filter Hits)
=================================================================
   70 - Hits That Made a Difference* (6.6% of Filter Hits)
   23 - Spams Failed or at Least Scored Because of Filter (2.2% of Filter Hits)
    0 - False Positives Failed Because of the Addition of This Filter (0.0% of Filter Hits)


OTHER NOTABLES
==================================================================
  604 - EASYNET-DYNA & DYNAMIC Hits (57.4% of DYNAMIC Filter Hits)
   86 - SORBS-DUL & DYNAMIC Hits (8.2% of DYNAMIC Filter Hits)
    6 - Number of Spammers That Spoofed Local User (0.1% of Unique Messages)

*I define "Hits That Made a Difference" as spams that would have scored at or below 150% of fail weight without this test.  My scoring has improved immensely with many new filters added, so default configurations should benefit much more in this area.


APPROXIMATE EASYNET-DYNA COMPARATIVE STATISTICS*
===================================================================
 873 - Filter Hits (15.8% of Unique Incoming Messages)
===================================================================
 604 - EASYNET-DYNA Filter Hits in Common with DYNAMIC Filter (69.2% of Filter Hits)
 269 - EASYNET-DYNA Filter Hits Not in Common with DYNAMIC Filter (30.8% of Filter Hits)
 449 - DYNAMIC Filter Hits Not in Common with EASYNET-DYNA (42.6% of Filter Hits)

*Approximated because I wasn't capturing and instead assumed a similar percentage of hits out of the total on Unique Incoming Mail as seen with the DYNAMIC filter, and checked against all individually logged messages.
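
For anyone who wants to recheck the percentages, here is a short Python sketch that recomputes them from the raw counts listed in the tables.  The variable and function names are my own; the counts are the ones reported above.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Raw counts taken from the tables above.
unique_messages = 5530
dynamic_hits = 1053
false_positives = 21
easynet_hits = 873
overlap = 604  # messages hit by both EASYNET-DYNA and DYNAMIC

stats = {
    "DYNAMIC hits / unique messages": pct(dynamic_hits, unique_messages),
    "false positives / DYNAMIC hits": pct(false_positives, dynamic_hits),
    "EASYNET-DYNA hits / unique messages": pct(easynet_hits, unique_messages),
    "overlap / DYNAMIC hits": pct(overlap, dynamic_hits),
    "overlap / EASYNET-DYNA hits": pct(overlap, easynet_hits),
    "DYNAMIC only / DYNAMIC hits": pct(dynamic_hits - overlap, dynamic_hits),
}

for label, value in stats.items():
    print(f"{label}: {value}%")
```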


Links to the most recent versions of all of the recent filters that I've shared:

DYNAMIC
http://www.mailpure.com/decludefilters/dynamic/Dynamic_09-17-2003.txt


GIBBERISH and ANTIGIBBERISH (use in combination)
http://www.mailpure.com/decludefilters/gibberish/Gibberish_09-16-2003.txt
http://www.mailpure.com/decludefilters/gibberish/AntiGibberish_09-16-2003.txt


GIBBERISHSUB and ANTIGIBBERISHSUB (use in combination)
http://www.mailpure.com/decludefilters/gibberishsub/GibberishSub_09-15-2003.txt
http://www.mailpure.com/decludefilters/gibberishsub/AntiGibberishSub_09-15-2003.txt


OBFUSCATION
http://www.mailpure.com/decludefilters/obfuscation/Obfuscation_09-14-2003c.txt

 
Feedback is important, so please feel free to post a comment or send me an E-mail even if you aren't sure about your conclusion.

Thanks,

Matt

