On Tue, Jul 08, 2003 at 06:43:53PM -0400, Steven W. Orr wrote:
> On Tuesday, Jul 8th 2003 at 16:51 -0400, quoth Jeff Kinz:
>
> =>So RBL problem #1: RBL doesn't stop the smartest spammers so you will
> =>                   have to filter on content no matter what you do.
>
> We're in 100% agreement. That's why I use my RBLs, *plus* other traps and
> filters from inside sendmail *plus* the latest version of spamassassin.
> The amount of spam that actually makes it through is *very* low. But the
> amount that has to be processed by spamassassin is very low to begin with.

SpamAssassin is extremely slow (it does pattern matching, among other
things). Bayesian filtering has been shown to be many times faster.
Drop SpamAssassin and replace it with Bogofilter. Then drop your RBL
filtering and you won't have to worry about the false positives.

Further, you won't have to scrutinize a huge stream of possible spam to
make sure that there are no false positives in it. That is something
you've just indicated you're going to start doing every day. Whee - what fun!
Oh - wait - you've just given yourself the job of being a spam filter!

> Some RBLs are better than others. This episode underscored a change that I
> will be making to my daily reject analyser to make it more obvious to me
> what domains were rejected.

Congratulations on your new position! :-)

> =>RBL problem #2: RBL incorrectly rejects large amounts of non-spam.
> =>
> =>RBL problem #3: Any possible notice about false positive (see #2), will
> =>                be completely buried in the mass of true-positive
> =>                notices and not be noticed.

RBL problem #4: Some RBL list keepers do not update their lists or have
                extremely stringent requirements for getting taken off
                their list.

>
> Ah, but they all *do* respond to people fixing their problems. Otherwise
> they would all run out of disk space. :-) Seriously, I've never heard of
> an RBL that won't take someone off their list if the problems get
> corrected. I think there was one but they're gone now.

Ah - some do-ish, some don't-ish. No system is perfect, and when an RBL
isn't perfect (and none of them are), innocent people get their mail
rejected. This causes serious hardship in some cases! It's a Bad Thing(TM).
It's also the most frequently heard complaint about RBLs. (This is my
impression only, not a published fact that I'm aware of. After all, who's
counting? :-) )

> Not sure I agree on this one. Apparently this address *was* legitimately
> tagged and never made any effort before to be taken off. This is the first
> we've heard of it.
> Sounds like easynet didn't do anything wrong.

Did they notify the owners of the site that they were being put on a
blacklist? If you're going to run a "McCarthy style blacklisting operation"
you should at least have the guts to tell each victim that you are putting
them on the list. Clearly no one at GNHLUG was aware that they were
somehow being blacklisted.

If you're going to claim that the RBL owners would get too much grief from
people if they actually notified folks they were being blacklisted, then
I would have to say that running an RBL should require a substantial
amount of fortitude. A group or person without the courage to stand up
and defend the principles that they are operating a blacklist under is
not the kind of person(s) that should be in charge of such an effort.
Without that courage, the list degenerates into a revenge list or falls
prey to other vices.

>
> =>For Non-ISP's, using RBL's will ultimately do more harm than good.
> =>(False positives causing missed email)
> =>(ISP's aren't harmed when they block their customer's real email as spam
> =>so they can use an RBL without fear. Most of them will not lose enough
> =>customers for it to ever be a problem, unfortunately.)
>
> No. The amount of mail that I get vs the false positives I've had are so
> incredibly disproportionate as to not even allow me to consider changing
> what I have going currently.

Are you an ISP? Wouldn't NO spam and NO false positives be even better?

>
> The spamassassin builtin bayesian seems to be doing a noticeably better job
> than previous versions.

OK, the new SpamAssassin does better than the old one. Try a well-trained
Bayesian solution. (SA doesn't cut it... yet. And when it does, all the
other stuff SA does will become unnecessary, just slowing things down
terribly like it does now.)

> But my experience is that I do better with my
> sendmail tricks (which you can see at
> http://steveo.syslang.net/sendmail.mc) in conjunction with spamassassin
> than I do with just spamassassin.
> And spamassassin does a lot more than
> just simple bayesian filtering.
>
> =>(http://bogofilter.sourceforge.net/bogofilter-faq.html)
>
> =>I really really want to urge everyone who hasn't tried it yet to take a
> =>look at the Bayesian-based spam-filtering solutions. It is really the
> =>best solution.
>
> Definitely a good blade on the swiss army knife of mail tricks. :-)

Bayesian filtering uses a technique similar to the methods used for
Large Vocabulary Speech Recognition by companies like Dragon Systems
(may they rest in peace). These techniques require a large,
well-distributed population of known samples to train on.
(I was a developer at Dragon.)

A well-trained Bayesian filter will do better than any assembly of tricks
you can create. It will always be more accurate and it will always produce
fewer false positives (or none, as has been my experience).

Remember that a Bayes filter will notice everything about an email, from
the apparent source IP to the embedded HTML and even the punctuation.
It can be set up to train on everything!

-- 
Jeff Kinz, Open-PC, Emergent Research, Hudson, MA. [EMAIL PROTECTED]
copyright 2003. Use is restricted. Any use is an acceptance of the offer at
http://www.kinz.org/policy.html.
Don't forget to change your password often.
_______________________________________________
gnhlug-discuss mailing list
[EMAIL PROTECTED]
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss