On Tue, Dec 23, 2003 at 01:36:20PM +0000, Dale Amon wrote:
> > I have yet to see a false positive caused by this even though I get
> > quite a lot of this stuff and routinely mark it as spam.
>
> I can't think of any other reason for someone to do it
> though. There has to be a point. Someone is going to a
> lot of trouble.

Could they be using all these non-spam words to generate false negatives, thus bypassing Bayesian filters? I've seen lots of these messages get through SpamAssassin in the past week or so, all with very low Bayes scores. Training the Bayesian classifier on these messages is obviously not going to do me much good, because the next spam is going to have a completely different set of tokens.

This method is especially effective when the Bayesian classifier only looks at the first MIME part, because the second is then free to contain whatever spam tokens they want to put in it. IIRC, this is how most Bayesian filters behave.

noah
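
To make the dilution effect concrete, here is a minimal sketch in Python. It is a toy naive Bayes scorer with add-one smoothing and equal priors, not SpamAssassin's actual Bayes implementation. The training sentences, the helper names (count_tokens, spam_probability, text_parts), and the two-part test message are all made up for illustration.

```python
import math
from collections import Counter
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Toy training data, invented for this example.
SPAM_DOCS = ["cheap pills online", "buy cheap pills now"]
HAM_DOCS = ["meeting agenda for tuesday", "kernel patch review on tuesday"]


def count_tokens(docs):
    """Count whitespace-separated, lowercased tokens across documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


SPAM_COUNTS = count_tokens(SPAM_DOCS)
HAM_COUNTS = count_tokens(HAM_DOCS)
VOCAB = set(SPAM_COUNTS) | set(HAM_COUNTS)


def spam_probability(text):
    """Naive Bayes spam probability (add-one smoothing, equal priors)."""
    spam_total = sum(SPAM_COUNTS.values()) + len(VOCAB)
    ham_total = sum(HAM_COUNTS.values()) + len(VOCAB)
    log_odds = 0.0
    for token in text.lower().split():
        if token not in VOCAB:
            continue  # tokens never seen in training carry no evidence
        p_spam = (SPAM_COUNTS[token] + 1) / spam_total
        p_ham = (HAM_COUNTS[token] + 1) / ham_total
        log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))


def text_parts(msg):
    """Return the decoded text/plain parts of a message, in order."""
    return [
        part.get_payload(decode=True).decode("ascii")
        for part in msg.walk()
        if part.get_content_type() == "text/plain"
    ]


FILLER = "meeting agenda kernel patch review tuesday"  # innocent-looking words
PITCH = "buy cheap pills online now cheap pills"       # the actual spam

# Case 1: filler appended to the pitch drags the score down.
plain_score = spam_probability("buy cheap pills")
padded_score = spam_probability("buy cheap pills " + FILLER)

# Case 2: filler in the first MIME part, pitch in the second.
msg = MIMEMultipart()
msg.attach(MIMEText(FILLER))
msg.attach(MIMEText(PITCH))
parts = text_parts(msg)
first_part_score = spam_probability(parts[0])
all_parts_score = spam_probability(" ".join(parts))

print(f"plain pitch:      {plain_score:.3f}")
print(f"pitch + filler:   {padded_score:.3f}")
print(f"first part only:  {first_part_score:.3f}")
print(f"all parts:        {all_parts_score:.3f}")
```

In this toy setup the padded pitch scores well below the bare pitch, and a scorer that reads only the first part never sees a single spam token. Scoring every text part at least puts the pitch back in front of the classifier, though enough filler could still outweigh it.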