Hi guys,
This might be a crazy idea, and I haven't thought through all the
consequences yet, but I thought I'd put it out there anyway.
When building the Bayesian DB you ideally want a 1:1 ratio of spam to
ham. In my experience it's fairly difficult to populate the spam side
fast enough to keep up with all the mail that's getting whitelisted, so
my ratio is more like 0.6:1. The accuracy is still very good, so I'm not
too concerned, but I'd like to get it closer to 1:1 if I can.
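To put a number on the gap, here is a rough sketch of the arithmetic.
The function name and the corpus sizes are made up for illustration;
only the 0.6:1 ratio comes from my setup.

```python
import math


def spam_needed(spam_count: int, ham_count: int, target_ratio: float = 1.0) -> int:
    """Return how many extra spam samples are needed to reach
    target_ratio spam messages per ham message."""
    needed = math.ceil(target_ratio * ham_count) - spam_count
    # Already at or above the target: nothing more to collect.
    return max(0, needed)


# Hypothetical corpus sitting at 0.6:1 (6000 spam, 10000 ham).
print(spam_needed(6000, 10000))
```

The point is just that the shortfall scales with the ham count, so the
more mail gets whitelisted, the more spam samples I'd have to find.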
With that in mind, would it make sense to use relay attempts to do this,
and perhaps mail to non-existent users too? Obviously you never want to
actually accept the final message of the attempt, or the various
automated testing systems will flag your server as a relay host. But
would you be able to get enough of the message to be useful in the
Bayesian DB? Maybe accept right up to the final piece of data and then
reject it? You may also be able to use the PB data to mitigate the risk,
so you only go down this road if the sending host is already at a
certain score in the PB. That should indicate that the mail itself is in
fact spam, since normal mail servers shouldn't get into the PB anyway.
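The decision I have in mind could be sketched roughly like this. This is
not ASSP code: the Attempt fields, the function name, and the threshold
of 50 are all assumptions for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    pb_score: int           # sending host's current PB score (hypothetical field)
    is_relay_attempt: bool  # recipient domain is not one we handle
    recipient_exists: bool  # local user lookup result


PB_THRESHOLD = 50  # assumed cut-off, would need tuning


def collect_for_bayes(attempt: Attempt, threshold: int = PB_THRESHOLD) -> bool:
    """Decide whether to keep the message body as a spam sample.

    The message itself is rejected either way; this only controls
    whether the received data is fed to the Bayesian DB first.
    """
    suspicious = attempt.is_relay_attempt or not attempt.recipient_exists
    return suspicious and attempt.pb_score >= threshold
```

So a relay attempt from a host with a PB score of 80 would be collected
and then rejected, while the same attempt from a host with no PB history
would simply be rejected.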
Thoughts?
Dave
_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user