On Tue, Jul 08, 2003 at 06:43:53PM -0400, Steven W. Orr wrote:
> On Tuesday, Jul 8th 2003 at 16:51 -0400, quoth Jeff Kinz:
> =>So RBL problem #1: RBL doesn't stop the smartest spammers so you will
> =>                   have to filter on content no matter what you do.
> 
> We're in 100% agreement. That's why I use my RBLs, *plus* other traps and 
> filters from inside sendmail *plus*  the latest version of spamassassin. 
> The amount of spam that actually makes it through is *very* low. But the 
> amount that has to be processed by spamassassin is very low to begin with.

SpamAssassin is extremely slow (it does pattern matching, among other
things).  Bayesian filtering has been shown to be many times faster.

Drop SpamAssassin and replace it with Bogofilter.  Then drop your RBL
filtering and you won't have to worry about the false positives.

Further, you won't have to scrutinize a huge stream of possible spam to
make sure that there are no false positives in it.  Something you've just
indicated you're going to start doing every day.  Whee - What fun!  Oh -
wait - you've just given yourself the job of being a spam filter!

> Some RBLs are better than others. This episode underscored a change that I 
> will be making to my daily reject analyser to make it more obvious to me 
> what domains were rejected.

Congratulations on your new position!   :-)

> =>RBL problem #2: RBL incorrectly rejects large amounts of non-spam.
> =>
> =>RBL problem #3: Any possible notice about false positive (see #2), will
> =>                be completely buried in the mass of true-positive
> =>            notices and not be noticed.

RBL problem #4:  Some RBL list keepers do not update their lists, or
have extremely stringent requirements for getting taken off their
list.
> 
> Ah, but they all *do* respond to people fixing their problems. Otherwise 
> they would all run out of disk space. :-) Seriously, I've never heard of 
> an RBL that won't take someone off their list if the problems get 
> corrected. I think there was one but they're gone now.

Ah, some do-ish, some don't-ish.  No system is perfect, and when an RBL
isn't perfect (and none of them is), innocent people get their mail
rejected.  This causes serious hardship in some cases!

It's a Bad Thing(TM).

It's also the most frequently heard complaint about RBLs.  (This is my
impression only, not a published fact that I'm aware of.  After all,
who's counting? :-)  )

> Not sure I agree on this one. Apparently this address *was* legitimately 
> tagged and never made any effort before to be taken off. This is the first 
> we've heard of it. Sounds like easynet didn't do anything wrong.

Did they notify the owners of the site that they were being put on 
a blacklist?

If you're going to run a "McCarthy style blacklisting operation" you
should at least have the guts to tell each victim that you are putting
them on the list.  Clearly no one at GNHLUG was aware that they were
somehow being blacklisted.

If you're going to claim that the RBL owners would get too much grief
from people if they actually notified folks they were being blacklisted,
then I would have to say that running an RBL should require a substantial
amount of fortitude.  A group or person without the courage to stand up
and defend the principles that they are operating a blacklist under is
not the kind of person(s) that should be in charge of such an effort.
Without that, it degenerates into a revenge list or falls prey to other
vices.

> 
> =>For Non-ISP's, using RBL's will ultimately do more harm than good.
> =>(False positives causing missed email)
> =>(ISP's aren't harmed when they block their customer's real email as spam
> =>so they can use an RBL without fear.  Most of them will not lose enough
> =>customers for it to ever be a problem, unfortunately.)
> 
> No. The amount of mail that I get vs the false positives I've had are so 
> incredibly disproportionate as to not even allow me to consider changing 
> what I have going currently. 
Are you an ISP?

Wouldn't NO spam and NO false positives be even better?
> 
> The spamassassin builtin bayesian seems to be doing a noticeably better job 
> than previous versions. 

OK, the new SpamAssassin does better than the old one.  Try a well
trained Bayesian solution.  (SA doesn't cut it... yet.  And when it
does, all the other stuff SA does will become unnecessary, just slowing
things down terribly like it does now.)

> But my experience is that I do better with my 
> sendmail tricks (which you can see at 
> http://steveo.syslang.net/sendmail.mc) in conjunction with spamassassin 
> than I do with just spamassassin. And spamassassin does a lot more than 
> just simple bayesian filtering.
> 
> =>(http://bogofilter.sourceforge.net/bogofilter-faq.html)
> 
> =>I really really want to urge everyone who hasn't tried it yet to take a
> =>look at the Bayesian-based spam-filtering solutions.  It is really the
> =>best solution.
> 
> Definitely a good blade on the swiss army knife of mail tricks. :-)

Bayesian filtering uses a technique similar to the methods used for
Large Vocabulary Speech Recognition by companies like Dragon Systems
(may they rest in peace).  These techniques require a large, well
distributed population of known samples to train on.  (I was a
developer at Dragon.)

A well trained Bayesian filter will do better than any assembly of
tricks you can create.  It will always be more accurate and it will
always produce fewer false positives (or none, as has been my
experience).  Remember that a Bayes filter will notice everything
about an email, from the apparent source IP to the embedded HTML and
even the punctuation.  It can be set up to train on everything!
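
For anyone who hasn't seen the idea in code, here is a rough sketch of
a token-based Bayes classifier.  To be clear, this is NOT Bogofilter's
actual algorithm (as far as I know it uses a more refined way of
combining per-token scores); it is plain naive Bayes with add-one
smoothing, and the names TinyBayesFilter, train and spam_probability
are made up for this example.  The tokenizer keeps punctuation such as
"!" and "$" on purpose, since those carry signal too.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: word-like runs, keeping some punctuation.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'!.-]+")


def tokenize(text):
    """Lowercase the message and split it into tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class TinyBayesFilter:
    """Illustrative naive Bayes filter; not any real tool's method."""

    def __init__(self):
        # Number of training messages of each class containing a token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Number of training messages seen per class.
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Register one known message; label is 'spam' or 'ham'."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | tokens present in text)."""
        n_spam = self.messages["spam"]
        n_ham = self.messages["ham"]
        # Prior odds from class sizes, smoothed so an empty class is OK.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in set(tokenize(text)):
            # Add-one smoothing: unseen tokens never give a zero.
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Numerically stable conversion from log-odds to probability.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1 + e)


if __name__ == "__main__":
    f = TinyBayesFilter()
    f.train("Buy cheap pills now!!!", "spam")
    f.train("Meeting notes for the LUG tonight", "ham")
    print(f.spam_probability("cheap pills"))
```

With only a handful of training messages the numbers mean very little,
which is the point made above about needing a large, well distributed
set of known samples.  To train on "everything", feed the headers to
train() along with the body.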

-- 
Jeff Kinz, Open-PC, Emergent Research,  Hudson, MA.  [EMAIL PROTECTED]
copyright 2003.  Use is restricted. Any use is an 
acceptance of the offer at http://www.kinz.org/policy.html.
Don't forget to change your password often.
_______________________________________________
gnhlug-discuss mailing list
[EMAIL PROTECTED]
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss
